Open
Description
Dear authors,
Thanks for sharing your great work!
I have two questions regarding the training details:
- GPU allocation for the reward models: The run script "run_grpo.sh" mentions that a specific GPU has to be designated for the reward. However, I cannot find the part of the code that does so. Could you please point me to the code that splits the Janus model across X GPUs and places the reward models on a separate GPU?
- Training time: How long does it take to train the model? From the configuration, it seems you trained for only 1600 steps; is that correct? I wonder whether 1600 steps are enough for the model to acquire new skills, especially with the batch size set to 1. This number seems too small to me, so I would appreciate it if you could elaborate on the number of training epochs or steps.
Thanks again for sharing your great work!