trying to annotate new DLC-ed files with prototypes from model #73

@kipkeller

To label new videos with the discovered prototypes, I am trying to follow this suggestion:
Recommended Approach: Train a LISBET Classifier on Prototypes

Prepare a Labeled Dataset with Prototype Annotations:
Starting from the example Python snippet, I tried to adapt the suggested code to patch my new DLC-labeled dataset (in directory A1Suppression_0-100) with the prototype labels (in directory prototypes):

import numpy as np
import pandas as pd
import xarray as xr

from lisbet.datasets import dump_records, load_records


def extract_labels(csv_path):
    df = pd.read_csv(csv_path, index_col=0)

    # Rows that already have at least one positive label
    covered = df.eq(1).any(axis=1)

    # Create / update the fallback class
    df["Other"] = (~covered).astype(int)

    # Keep only the first 1 in every row
    first_mask = df.eq(1).cumsum(axis=1).eq(1)

    # Apply the mask: everything that isn't the first 1 becomes 0
    df &= first_mask

    return df.values


def patch_dataset():
    records = load_records(
        data_format="DLC",
        data_path="A1suppression_0-100",
        data_scale="None",
        data_filter="train",
    )["main_records"]

    patched_records = []
    for key, data in records:
        posetracks = data["posetracks"].unstack("features")

        # Raw f-string so the Windows backslashes are not read as escapes
        labels = extract_labels(rf"prototypes\{key}\machineAnnotation_hmmbest_6_32.csv")

        # One row of labels per pose-tracking frame
        assert labels.shape[0] == posetracks.sizes["time"]

        # Convert to xarray Dataset
        annotations = xr.Dataset(
            data_vars=dict(
                label=(
                    ["time", "behaviors", "annotators"],
                    labels[..., np.newaxis],
                )
            ),
            coords=dict(
                time=posetracks.time,
                behaviors=[f"motif_{motif_id}" for motif_id in range(labels.shape[1])],
                annotators=["LISBET"],
            ),
            attrs=dict(
                source_software=posetracks.source_software,
                ds_type="annotations",
                fps=posetracks.fps,
                time_unit=posetracks.time_unit,
            ),
        )

        patched_record = (
            key,
            {"posetracks": posetracks, "annotations": annotations},
        )

        patched_records.append(patched_record)

    # Raw string for the same reason as above
    dump_records(r"datasets\proto_A1suppression_0-100", patched_records)


if __name__ == "__main__":
    patch_dataset()
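
The label-extraction step can be checked on its own, without LISBET or the real CSV files. The following is only a minimal sketch on a made-up three-frame table: the column names motif_a and motif_b and the helper name one_hot_with_fallback are hypothetical, and the masking is written with an explicit boolean AND rather than the in-place `&=` used above. It shows the intended result (exactly one positive label per frame, with "Other" as the fallback) and the array shape expected for the ["time", "behaviors", "annotators"] dimensions.

```python
import numpy as np
import pandas as pd


def one_hot_with_fallback(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with exactly one positive label per row."""
    out = df.copy()

    # Rows with no positive label fall back to the "Other" class
    covered = out.eq(1).any(axis=1)
    out["Other"] = (~covered).astype(int)

    # Keep only the first 1 in every row (left to right)
    is_one = out.eq(1)
    first_mask = is_one.cumsum(axis=1).eq(1)
    return (is_one & first_mask).astype(int)


# Hypothetical toy table: frame 0 has one label, frame 1 none, frame 2 two
toy = pd.DataFrame({"motif_a": [1, 0, 1], "motif_b": [0, 0, 1]})
result = one_hot_with_fallback(toy)
print(result)

# Add the trailing "annotators" axis, as in the xarray Dataset above
labels = result.values[..., np.newaxis]
print(labels.shape)
```

If every row of the result sums to 1 and the first dimension of the array matches the number of frames in the pose tracks, this part of the script is probably behaving as intended.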

The first error that comes up is: cannot import name 'dump_records' from 'lisbet.datasets'
I am sure more will follow, because I am not confident that I have adapted the Python code correctly.
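
One way to narrow down an import error like this is to list what the installed module actually exports, since a name may be missing, renamed, or located in a different submodule in the installed version. The sketch below uses only the standard library; the helper names public_names and has_name are hypothetical, and it is demonstrated on json so that it runs anywhere. In the LISBET environment, "json" would be replaced with "lisbet.datasets" and "dumps" with "dump_records".

```python
import importlib


def public_names(module_name: str) -> list[str]:
    """List the public attributes exposed by an importable module."""
    module = importlib.import_module(module_name)
    return sorted(name for name in dir(module) if not name.startswith("_"))


def has_name(module_name: str, name: str) -> bool:
    """Report whether the module exposes the given attribute."""
    return hasattr(importlib.import_module(module_name), name)


# Demonstrated on the standard library; swap in "lisbet.datasets" locally
print(public_names("json"))
print(has_name("json", "dumps"))
```

If the name does not appear in the listing, the installed release and the example snippet are likely out of sync, which is worth mentioning together with the installed version when asking for help.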

Once this is accomplished, I think the next steps are:

to train the classifier:

betman train_model ^
--run_id=proto_classifier ^
--data_format=DLC ^
--data_scale="1x1" ^
--data_filter=train ^
--learning_rate=1e-4 ^
--epochs=10 ^
--load_backbone_weights=models\A1supp01-embedder\weights\weights_last.pt ^
--freeze_backbone_weights ^
--save_history ^
-v ^
dataset\proto_A1supp

and, finally, annotate the new data:

betman annotate_behavior ^
--data_format=DLC ^
--data_scale="None" ^
--data_filter=test ^
-v ^
A1suppression\task1_classic_classification ^
models\proto_classifier\model_config.yml ^
models\proto_classifier\weights\weights_last.pt

There is a lot here I don't understand. Any help would be appreciated.
