I have a question about this example in readme.md:
For example, to run OGM on CREMA-D dataset:
python -m balancemm \
--trainer OGM \
--dataset CREMAD \
--model BaseClassifier \
--alpha 0.5 \
--device 0
CREMA-D is a bimodal (audio and visual) dataset, but BaseClassifier is configured with three modalities:
model:
  BaseClassifier:
    encoders: {audio: Transformer_, visual: Transformer_, text: Transformer_}
    fusion: "concat"
    modality_size: 1024
Is this a mismatch, or am I missing something?
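To make the concern concrete, here is a minimal sketch of the check I have in mind. It is not code from balancemm: the `unused_encoders` helper and the `DATASET_MODALITIES` table are hypothetical names I made up, and the config is copied by hand into a plain dict rather than loaded from the YAML file. I am also assuming that CREMAD supplies only audio and visual inputs.

```python
# Hypothetical check, not part of balancemm: compare the encoder keys in the
# model config against the modalities a dataset actually provides.
config = {
    "model": {
        "BaseClassifier": {
            "encoders": {
                "audio": "Transformer_",
                "visual": "Transformer_",
                "text": "Transformer_",
            },
            "fusion": "concat",
            "modality_size": 1024,
        }
    }
}

# Assumption: CREMA-D provides audio and visual inputs only.
DATASET_MODALITIES = {"CREMAD": {"audio", "visual"}}


def unused_encoders(config, model_name, dataset_name):
    """Return encoder names that have no matching modality in the dataset."""
    encoders = set(config["model"][model_name]["encoders"])
    return sorted(encoders - DATASET_MODALITIES[dataset_name])


print(unused_encoders(config, "BaseClassifier", "CREMAD"))  # ['text']
```

If the trainer already drops encoders that the dataset does not use, then the readme example is fine and only the config is confusing.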
Also, the paper mentions that the FOOD101 dataset uses a model that combines a ResNet and a Transformer, but I cannot find such a configuration in model_config.yaml.
Looking forward to your reply.