luatorch

nn.DataParallelTable fails for custom layers


I initialize the multi-GPU model like this:

local dpt = nn.DataParallelTable(1, true, true)
         :add(model, gpus)
         :threads(function()
            local cudnn = require 'cudnn'
            cudnn.fastest, cudnn.benchmark = fastest, benchmark
         end)
dpt.gradInput = nil

model = dpt:cuda()

Calling model:parameters() or model:getParameters() then fails with these errors:

FATAL THREAD PANIC: (read) /home/daniel/torch/install/share/lua/5.2/torch/File.lua:343: unknown Torch class <nn.Reorg>
FATAL THREAD PANIC: (read) /home/daniel/torch/install/share/lua/5.2/torch/File.lua:343: unknown Torch class <nn.Reorg>

nn.Reorg is my custom layer, defined in models/Reorg.lua; it only performs simple copy operations on the layer input.
It works fine on the CPU and on a single GPU.


Solution

  • I finally figured it out. Just add one line of code to the threads callback:

    :threads(function()
       require 'models/Reorg'  -- load the custom layer inside each worker thread
       local cudnn = require 'cudnn'
       cudnn.fastest, cudnn.benchmark = fastest, benchmark
    end)
    

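    The worker threads do not load the file on their own, so it has to be required manually. Each thread runs in its own Lua state, so a class registered by a require in the main thread is not visible there, and deserializing the model inside the thread fails with the "unknown Torch class" error.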
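
    For reference, here is the init code from the question with the fix applied, as a minimal sketch. It assumes `model`, `gpus`, `fastest`, and `benchmark` are already defined as in the question, and that models/Reorg.lua can be found on the package path from the working directory. It needs cunn, cudnn, and more than one GPU to run.

```lua
require 'cunn'
require 'models/Reorg'  -- main thread: registers nn.Reorg before the model is built

-- Assumed to exist already, as in the question:
--   model               the network containing nn.Reorg
--   gpus                table of GPU ids, e.g. {1, 2}
--   fastest, benchmark  cudnn settings captured as upvalues by the callback

local dpt = nn.DataParallelTable(1, true, true)
   :add(model, gpus)
   :threads(function()
      -- Runs once in every worker thread. Each thread has a fresh Lua
      -- state, so the custom layer has to be required here as well.
      require 'models/Reorg'
      local cudnn = require 'cudnn'
      cudnn.fastest, cudnn.benchmark = fastest, benchmark
   end)
dpt.gradInput = nil

model = dpt:cuda()
```

    Any other custom module the model uses would presumably need the same treatment: one require inside the threads callback per file.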