training on vcr task

hi, i noticed that when finetuning the ssp model on the vcr task, the performance drops a lot at every 5000-step evaluation in the first epoch.
before finetuning, the results for Q2A and QA2R are both above 74%
step 5000: 67% and 66%
step 10000: 64.7% and 64.9%
step 15000: 61.3% and 60.9%
step 20000: 60.8% and 57.9%
the later results are not available yet. it is still training.

is this expected? is the model being trained the right way? have you noticed this phenomenon when you finetune it?
note that i have no V100, so i train the model on 4 2080ti gpus. because of the memory limit, i set batch size=1, test_batch_size=4, gradient_accumulation_steps=8. the rest of the config is the same as your vcr.yaml.
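
for reference, here is a rough sketch of the effective batch size this setup should give. it assumes that batch size is counted per gpu and that gradients are accumulated across all 4 gpus before each optimizer step; i have not checked how the repo actually interprets these settings, and the function name `effective_batch_size` is just made up for illustration.

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int, accumulation_steps: int) -> int:
    """Number of samples contributing to one optimizer step.

    Assumption: the configured batch size is per GPU, and gradients are
    summed over all GPUs and all accumulation steps before each update.
    """
    return per_gpu_batch * num_gpus * accumulation_steps


if __name__ == "__main__":
    # Settings from this issue: batch size=1, 4 x 2080ti, gradient_accumulation_steps=8
    print(effective_batch_size(per_gpu_batch=1, num_gpus=4, accumulation_steps=8))  # prints 32
```

if the default vcr.yaml on V100 ends up with a different effective batch size than this, that mismatch might be part of the problem, but i am not sure.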

looking forward to your help, thanks!
