Thanks for your great work! I'd like to ask: have you run your code yet?
I'm trying to run your code now and I found a bug: in the optimize function, you use torch.nn.functional.smooth_l1_loss to compute the loss. This function returns a scalar Variable, not an array, so the loss cannot be indexed. Your code, however, returns loss.cpu().data.numpy()[0], which raises an error.
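For reference, here is a minimal sketch of one possible fix. It assumes PyTorch 0.4 or later, where the loss comes back as a zero-dimensional tensor and `.item()` is available. The helper name `loss_to_float` and the sample values are hypothetical and not taken from the repository.

```python
import torch
import torch.nn.functional as F


def loss_to_float(loss):
    """Hypothetical helper: turn a one-element loss tensor into a Python float."""
    # .item() works for a 0-dim tensor and for a tensor of shape (1,),
    # so no indexing with [0] is needed.
    return loss.detach().cpu().item()


# Made-up values standing in for predicted and target Q-values.
predicted = torch.tensor([0.5, 1.0, 2.0], requires_grad=True)
target = torch.tensor([0.0, 1.0, 4.0])

# With the default reduction, smooth_l1_loss returns a 0-dim tensor.
loss = F.smooth_l1_loss(predicted, target)
loss_value = loss_to_float(loss)
```

With this change, the optimize function could return `loss_to_float(loss)` in place of `loss.cpu().data.numpy()[0]`.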
Besides, I have trained the network for almost 200 episodes, but it doesn't seem to have learned anything: the episode_reward is always 0. So I'm puzzled. Do you know the reason, or is it just a lack of training episodes?
Thanks again!