Conversation
```lua
:add(nn.GPU(nn.Linear(10000,10000), 1))
:add(nn.GPU(nn.Linear(10000,10000), 2))
:add(nn.GPU(nn.Linear(10000,10000), 3))
:add(nn.GPU(nn.Linear(10000,10000), 4, cutorch.getDevice()))
```
I am wondering if this line can run in CPU-only environments.
The cutorch.getDevice() call will not. But it isn't mandatory to use cutorch.getDevice(); this is just an example.
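A CPU-safe way to write that last line (a sketch, not part of the PR; `ok`, `outdevice`, and `mlp` are hypothetical names for illustration) would guard the cutorch call behind a `pcall`:

```lua
-- probe for cutorch without failing on CPU-only machines
local ok, cutorch = pcall(require, 'cutorch')
-- omit the optional outdevice argument (nil) when cutorch is absent
local outdevice = ok and cutorch.getDevice() or nil
mlp:add(nn.GPU(nn.Linear(10000,10000), 4, outdevice))
```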
Force-pushed from b1d68d6 to dba06e5.
I have tested the nn.GPU implementation on a language model task using 4 GPUs and it works.
This is cool, but the name and the order of parameters is a little confusing.

```lua
:add(nn.OnGPU(1, nn.Linear(10000,10000)))
:add(nn.OnGPU(2, nn.Linear(10000,10000)))
:add(nn.OnGPU(3, nn.Linear(10000,10000)))
:add(nn.OnGPU(4, cutorch.getDevice(), nn.Linear(10000,10000)))
```

may be clearer than

```lua
:add(nn.GPU(nn.Linear(10000,10000), 1))
:add(nn.GPU(nn.Linear(10000,10000), 2))
:add(nn.GPU(nn.Linear(10000,10000), 3))
:add(nn.GPU(nn.Linear(10000,10000), 4, cutorch.getDevice()))
```

What do you think?
@iamalbert I prefer nn.GPU, but I like your order. Except for the optional outdevice argument, which I think should still be last (as it is optional):

```lua
:add(nn.GPU(1, nn.Linear(10000,10000)))
:add(nn.GPU(2, nn.Linear(10000,10000)))
:add(nn.GPU(3, nn.Linear(10000,10000)))
:add(nn.GPU(4, nn.Linear(10000,10000), cutorch.getDevice()))
```
The order is …
@szagoruyko Good point. The argument of backwards compatibility always wins (plus I am lazy). I will leave it as it is.
@nicholas-leonard there are a couple of places like …
@szagoruyko how does it break tensor sharing (I don't know anyone that shares outputs)? As for optnet, what does it do that cannot support …
Hi, sorry for the delay in replying. The current underlying requirement in …

```lua
function MySelectModule:updateOutput(input)
   self.output = input[1] -- creates a new tensor which shares the storage
   return self.output
end
```
About this PR, as it's a generic module that supports both …
So I guess this PR is ready to merge then :)
```lua
self.modules[1] = module

if module:type() == 'torch.CudaTensor' then
   self:cuda()
end
```
This :cuda() is not executing in the context of "device"; needs a fix.
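One way to address this (a sketch, assuming the wrapper stores its target device in a `self.device` field) is to run the conversion inside `cutorch.withDevice`, which executes a function with the given device active and restores the previous device afterwards:

```lua
if module:type() == 'torch.CudaTensor' then
   -- run :cuda() with self.device active so the converted
   -- tensors are allocated on the wrapped module's device
   cutorch.withDevice(self.device, function() self:cuda() end)
end
```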
Thanks for your patience, Nicholas!
Just realized, we need to add support for other CUDA data types than float.
He seems to be checking for Cuda*Tensor, is that not sufficient?
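A Cuda*Tensor check along these lines (a sketch; the actual test in the PR may differ, and `isCudaType` is a hypothetical helper name) matches all CUDA tensor types by Lua string pattern instead of comparing against torch.CudaTensor only:

```lua
-- matches torch.CudaTensor, torch.CudaHalfTensor, torch.CudaDoubleTensor, etc.
-- (%. escapes the dot; .* covers the optional datatype infix)
local function isCudaType(module)
   return module:type():match('^torch%.Cuda.*Tensor$') ~= nil
end
```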

This PR adds `nn.GPU`, which can be used to distribute modules across multiple devices.

OMG why is it in nn and not in cunn?

As discussed with @soumith, putting it in nn means that it can be used in CPU-only environments, which is common for production.

The unit tests are located in cunn to avoid a `pcall(function() require 'cunn' end)` in the nn unit tests (see PR torch/cunn#282).