Thanks for the code! I am trying to run the training on 8 GPUs, but the code seems to use only one of them. Do I need to change anything to train on all of them?