out_H = torch.bmm(proj_value_H, att_H.permute(0, 2, 1)).cpu().view(m_batchsize,width,-1,height).permute(0,2,3,1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_bmm)
How can I solve this?
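
A possible direction, sketched under assumptions: the error names mat2, which is the second argument to torch.bmm (here att_H.permute(0, 2, 1)), so one of the two operands is most likely on the CPU while the other is on cuda:0. The trailing .cpu() only runs after bmm returns, so it is probably not the cause. Printing proj_value_H.device and att_H.device just before the call should show which tensor is in the wrong place. The tensor shapes and the helper name bmm_same_device below are hypothetical, chosen only to make the example runnable; the variable names come from the line in the question.

```python
import torch

def bmm_same_device(proj_value_H, att_H):
    # Hypothetical helper: move the second operand onto the first operand's
    # device before calling bmm, so both tensors live on the same device.
    att_H = att_H.to(proj_value_H.device)
    return torch.bmm(proj_value_H, att_H.permute(0, 2, 1))

# Falls back to the CPU when no GPU is available, so the sketch runs anywhere.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Assumed sizes, for illustration only.
m_batchsize, width, height, channels = 2, 3, 4, 5

proj_value_H = torch.randn(m_batchsize * width, channels, height, device=device)
att_H = torch.randn(m_batchsize * width, height, height)  # deliberately created on the CPU

# Inspect where each operand lives before the matrix product.
print(proj_value_H.device, att_H.device)

out_H = bmm_same_device(proj_value_H, att_H)
out_H = out_H.view(m_batchsize, width, -1, height).permute(0, 2, 3, 1)
```

If the mismatch comes from a tensor that is created inside forward() without a device argument (for example a mask built with torch.zeros or torch.eye), creating it with device=x.device, where x is the module's input, is usually cleaner than moving it afterwards.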