Tags: theano, theano.scan

How to change inputs of the function given in theano.scan()?


I am confused about theano.scan(). I have read the official documentation, but I still feel my understanding is limited. I want to change the inputs of the function passed to theano.scan. For example, I have the following code:

def forward_prop_step(x_t, s_t1_prev, s_t2_prev):
    # i have my code here

[o, s, s2], updates = theano.scan(
            forward_prop_step,
            sequences=x,
            truncate_gradient=self.bptt_truncate,
            outputs_info=[None, 
                          dict(initial=T.zeros(self.hidden_dim)),
                          dict(initial=T.zeros(self.hidden_dim))])

So here theano.scan runs over the sequence x. As far as I understand, forward_prop_step receives x_t as theano.scan goes through the sequence x, but how does forward_prop_step get its second and third parameters? Does theano.scan take them from the second and third entries of outputs_info?

If I want to modify the above code and pass one more parameter, x2, as a sequence to theano.scan, how should I modify it? I want theano.scan to run over two sequences, x and x2, and pass their elements as the first two parameters of forward_prop_step. For example, the prototype of forward_prop_step would be:

def forward_prop_step(x_t, x_f, s_t1_prev, s_t2_prev):
    # i have my code here

How can I change the theano.scan call above to pass both x and x2 as sequences? Can anyone briefly explain, with examples, how to change the parameters of the function given to theano.scan, and also its return values?

A few other questions:

(1) If I pass the n_steps parameter along with the sequences parameter, how does theano.scan execute? Does it then work like a nested (double) for loop?

(2) How does the non_sequences parameter differ from the sequences parameter of theano.scan?

(3) Does theano.scan call the provided function for each element of the sequences parameter? If it does, why is a print statement inside forward_prop_step executed only once, even though the computation inside the function runs several times (over the entire sequence)? How does theano.scan repeatedly call the function provided to it?


Solution

  • Does theano.scan take the 2nd and 3rd parameters from the 2nd and 3rd entries of outputs_info?

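    -> Yes. Every element of outputs_info that is not None marks a recurrent output: scan passes its value from the previous step (or its initial value on the first step) back into the step function. A None entry, like the first one here, marks an output that is not fed back, so the step function receives no parameter for it.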

    If I want to modify the above code and pass one more parameter, x2, as a sequence to theano.scan, how should I modify it?

    -> You just need to add x2 to the list of sequences:

    [o, s, s2], updates = theano.scan(
            forward_prop_step,
            sequences=[x, x2],
            truncate_gradient=self.bptt_truncate,
            outputs_info=[None, 
                          dict(initial=T.zeros(self.hidden_dim)),
                          dict(initial=T.zeros(self.hidden_dim))])
    

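    The step function receives its parameters in this order: the current element of each entry in sequences, then the previous value of each recurrent (non-None) entry in outputs_info, then each variable in non_sequences. Any group that is not specified in the scan call is simply left out.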
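To make that ordering concrete, here is a minimal sketch in plain Python. It is not Theano code: `scan_like` is a hypothetical stand-in that only imitates the calling convention described above, the arithmetic in `forward_prop_step` is a toy placeholder for the real step, and the extra `scale` argument is an invented non-sequence added purely for illustration.

```python
def scan_like(fn, sequences=(), outputs_info=(), non_sequences=()):
    """Hypothetical pure-Python imitation of scan's calling convention.

    Each outputs_info entry is either None (output is not fed back) or an
    initial value (recurrent output, fed back at the next step).
    """
    n_steps = min(len(seq) for seq in sequences)
    states = list(outputs_info)              # latest value of each output
    collected = [[] for _ in outputs_info]   # per-output history

    for t in range(n_steps):
        # 1) current element of every sequence
        seq_args = [seq[t] for seq in sequences]
        # 2) previous value of every recurrent (non-None) output
        prev_args = [s for s, info in zip(states, outputs_info)
                     if info is not None]
        # 3) non-sequences, passed unchanged at every step
        outputs = fn(*seq_args, *prev_args, *non_sequences)
        for i, out in enumerate(outputs):
            collected[i].append(out)
            if outputs_info[i] is not None:
                states[i] = out
    return collected


def forward_prop_step(x_t, x_f, s_t1_prev, s_t2_prev, scale):
    # Toy arithmetic standing in for the real RNN step.
    s_t1 = s_t1_prev + x_t
    s_t2 = s_t2_prev + x_f
    o_t = scale * (s_t1 + s_t2)
    return [o_t, s_t1, s_t2]


x = [1, 2, 3]
x2 = [10, 20, 30]
o, s, s2 = scan_like(
    forward_prop_step,
    sequences=[x, x2],
    outputs_info=[None, 0, 0],   # o is not fed back; s and s2 are
    non_sequences=[2],           # hypothetical constant `scale`
)
print(o, s, s2)
```

The real theano.scan additionally returns an updates dictionary and works on symbolic variables, which this sketch leaves out.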

    (1) If I pass the n_steps parameter along with the sequences parameter, how does theano.scan execute? Does it then work like a nested (double) for loop?

    If n_steps is provided, scan runs for exactly that many iterations. If you iterate over a tensor whose first dimension has 10 elements and n_steps is 4, scan only iterates over the first 4 elements of that tensor. It does not work like a nested loop. (n_steps cannot be larger than the sequences allow: scan raises an error if a sequence does not have enough elements.)
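The effect of n_steps can be pictured with a single plain-Python loop. This is only an illustration of the behaviour described above, not Theano's implementation; `run_steps` and `running_sum` are hypothetical names.

```python
def run_steps(step, sequence, init, n_steps=None):
    """Hypothetical helper: ONE loop over `sequence`, capped at n_steps."""
    steps = len(sequence) if n_steps is None else n_steps
    if steps > len(sequence):
        raise ValueError("sequence has fewer elements than n_steps")
    state = init
    history = []
    for t in range(steps):            # a single loop, not a nested one
        state = step(sequence[t], state)
        history.append(state)
    return history


def running_sum(x_t, acc_prev):
    return acc_prev + x_t


ten_elements = list(range(10))
first_four = run_steps(running_sum, ten_elements, 0, n_steps=4)
everything = run_steps(running_sum, ten_elements, 0)
print(len(first_four), len(everything))
```

With n_steps=4 the step runs 4 times in total, not 4 times per element.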

    (2) How does the non_sequences parameter differ from the sequences parameter of theano.scan?

    non_sequences are not iterated over by scan: the same variables are passed unchanged to the step function at every iteration. Listing them is optional, because scan can usually work out on its own which outside variables the step function uses, but it is recommended because it makes the code clearer. In contrast, sequences specify the variables that scan iterates over as it loops.
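As a rough analogy in plain Python (not Theano code), a non-sequence behaves like a constant handed to every call of the step function, and leaving it out of the list is like letting the step function pick the value up from the enclosing scope. The helper `loop` and the variable `w` below are hypothetical.

```python
def loop(step, sequence, init, non_sequences=()):
    """Hypothetical helper: one pass over `sequence`, carrying one state."""
    state = init
    history = []
    for x_t in sequence:
        # Sequence element first, previous state next, constants last.
        state = step(x_t, state, *non_sequences)
        history.append(state)
    return history


w = 10  # same value at every step, so it is not a sequence


def step_explicit(x_t, acc_prev, weight):
    # `weight` arrives through non_sequences.
    return acc_prev + weight * x_t


def step_implicit(x_t, acc_prev):
    # `w` is picked up from the enclosing scope instead.
    return acc_prev + w * x_t


explicit = loop(step_explicit, [1, 2, 3], 0, non_sequences=[w])
implicit = loop(step_implicit, [1, 2, 3], 0)
print(explicit, implicit)
```

Both variants compute the same values; the explicit one just makes the dependency on `w` visible at the call site.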

    (3) Does theano.scan call the provided function for each element of the sequences parameter? If it does, why is a print statement inside forward_prop_step executed only once, even though the computation inside the function runs several times (over the entire sequence)? How does theano.scan repeatedly call the function provided to it?

    theano.scan iterates over the first dimension of every element of sequences in lockstep, and the step function's computation is applied at each iteration. If you want to print intermediate values inside scan, use theano.printing.Print. The Python print statement executes only once because of how Theano works: scan calls the step function a single time, with symbolic variables, to build a computation graph, and afterwards only that graph is executed with the actual values. Python's print cannot be made part of Theano's computation graph, so you see its output only once.
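The build-once, run-many behaviour can be imitated with a tiny symbolic class in plain Python. This is only a sketch of the idea, not how Theano is implemented: `Sym` and `as_sym` are hypothetical, and the list `calls` is there to count how often the Python body of `step` actually runs.

```python
class Sym:
    """Hypothetical symbolic value: records how to compute, not the result."""

    def __init__(self, evaluate):
        self.evaluate = evaluate  # function: env dict -> number

    def __add__(self, other):
        other = as_sym(other)
        return Sym(lambda env: self.evaluate(env) + other.evaluate(env))

    def __mul__(self, other):
        other = as_sym(other)
        return Sym(lambda env: self.evaluate(env) * other.evaluate(env))


def as_sym(value):
    # Wrap plain numbers as constant nodes.
    return value if isinstance(value, Sym) else Sym(lambda env: value)


calls = []


def step(x_t, s_prev):
    calls.append("step body ran")
    print("step body ran")        # runs only while the graph is built
    return s_prev * 0.5 + x_t


# Graph construction: the step function is called ONCE with symbols.
x_sym = Sym(lambda env: env["x_t"])
s_sym = Sym(lambda env: env["s_prev"])
graph = step(x_sym, s_sym)

# Execution: only the recorded graph runs, once per sequence element.
state = 0.0
states = []
for x_value in [1.0, 2.0, 3.0]:
    state = graph.evaluate({"x_t": x_value, "s_prev": state})
    states.append(state)
print(states)
```

The message appears once, while `states` still holds one value per sequence element. That is the same pattern as a print inside a scan step function.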

    I would suggest taking a deeper look at the theano.scan documentation and the scan tutorial.