Description
Hi, I can run the code with a single GPU. However, errors occur when I use multi-GPU distributed training. The errors are as follows:
```
Traceback (most recent call last):
  File "train_spatial_query.py", line 538, in <module>
    train(args, loader, generator, discriminator, g_optim, d_optim, g_ema, device, tensorboard_writer, args.exp_name)
  File "train_spatial_query.py", line 235, in train
    fake_img, latents, mean_path_length
  File "train_spatial_query.py", line 97, in g_path_regularize
    grad, = autograd.grad(outputs=tmp, inputs=latents, create_graph=True)
  File "/home/nrr/.conda/envs/stylegan/lib/python3.7/site-packages/torch/autograd/__init__.py", line 236, in grad
    inputs, allow_unused, accumulate_grad=False)
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
```
I use four GTX 3080 GPUs to train the model. The PyTorch version is 1.10.2. Could you kindly help me solve this problem? Thanks.