
Torch, neural nets - forward function on gmodule object - nngraph class


I am a newbie to Torch and Lua (as anyone who has been following my latest posts could attest :) and I have the following question about the forward function of the gmodule object (nngraph class).

As per the source code (https://github.com/torch/nn/blob/master/Module.lua; the gmodule class inherits from nn.Module), the syntax is:

function Module:forward(input)
   return self:updateOutput(input)
end
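
For a module with a single input, this just means passing one tensor. A minimal sketch (using nn.Linear as a stand-in module, assuming the nn package is installed):

require 'nn'

local m = nn.Linear(10, 5)   -- a toy module: 10 inputs, 5 outputs
local x = torch.randn(10)    -- a single input tensor
local y = m:forward(x)       -- equivalent to m:updateOutput(x)
print(y:size())              -- 5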

However, I have found cases where a table is passed as input, as in:

local lst = clones.rnn[t]:forward{x[{{}, t}], unpack(rnn_state[t-1])}

where:

clones.rnn[t]

is itself a gmodule object. In turn, rnn_state[t-1] is a table with 4 tensors. So in the end, we have something akin to:

result_var = gmodule:forward{[1]=tensor_1,[2]=tensor_2,[3]=tensor_3,...,[5]=tensor_5}

The question is: depending on the network architecture, can you pass input formatted as a table not only to the input layer but also to the hidden layers?

And in that case, do you have to check that you pass exactly one input per layer (with the exception of the output layer)?

Thanks so much


Solution

  • I finally found the answer. The nn.Module class (as well as the gmodule class that inherits from it) has an input and an output.

    However, the input (as well as the output) need not be a single vector; it can be a collection of vectors, depending on the neural net configuration. In this particular case it is a fairly complex recurrent neural net.

    So if the net has more than one input vector, you can do:

    result_var = gmodule:forward{[1]=tensor_1,[2]=tensor_2,[3]=tensor_3,...,[5]=tensor_5}
    

    where each tensor/vector is one of the input vectors. Only one of those vectors is the X vector, i.e. the feature vector; the others can serve as input to other intermediate nodes.

    In turn, result_var (which is the output) can be a single tensor (the prediction) or a collection of tensors, depending on the network configuration.

    If the latter is the case, one of those output tensors is the prediction, and the remainder are usually used as input to the intermediate nodes at the next time step - but that again depends on the net configuration. The sketch below illustrates a net with multiple inputs and multiple outputs.
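
    For instance, here is a minimal sketch of a toy gmodule with two inputs and two outputs (the graph and the layer sizes are invented for illustration, assuming the nn and nngraph packages are installed):

    require 'nn'
    require 'nngraph'

    -- two graph inputs: the feature vector x and the previous hidden state
    local x      = nn.Identity()()
    local prev_h = nn.Identity()()

    -- a toy "RNN cell": the new hidden state depends on both inputs
    local h = nn.Tanh()(nn.CAddTable()({
      nn.Linear(4, 4)(x),
      nn.Linear(4, 4)(prev_h)
    }))
    -- a prediction read off the hidden state
    local y = nn.Linear(4, 2)(h)

    -- gmodule with two inputs and two outputs
    local g = nn.gModule({x, prev_h}, {y, h})

    -- forward takes a table of input tensors and returns a table of output tensors
    local result = g:forward{torch.randn(4), torch.zeros(4)}
    print(result[1])  -- the prediction (a tensor of size 2)
    print(result[2])  -- the new hidden state, fed back at the next time step

    Calling g:forward with a table of two tensors returns a table of two tensors; in a recurrent net, result[2] would be fed back in as prev_h at the next time step, just as rnn_state is above.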