I am confused about theano.scan(). I have read the official documentation, but I still feel my understanding is limited. I want to change the inputs of the function passed to theano.scan. For example, I have the following code:
    def forward_prop_step(x_t, s_t1_prev, s_t2_prev):
        # I have my code here

    [o, s, s2], updates = theano.scan(
        forward_prop_step,
        sequences=x,
        truncate_gradient=self.bptt_truncate,
        outputs_info=[None,
                      dict(initial=T.zeros(self.hidden_dim)),
                      dict(initial=T.zeros(self.hidden_dim))])
So here theano.scan runs over the sequence x. As far as I understand, forward_prop_step receives x_t as input when theano.scan goes through the sequence x, but how does forward_prop_step get its second and third parameters? Does theano.scan take the 2nd and 3rd parameters from the 2nd and 3rd values of outputs_info?
If I want to modify the above code and give one more parameter, x2, as a sequence to theano.scan, how should I modify the code? I want theano.scan to run over two sequences, x and x2, and pass their values as the first two parameters of the forward_prop_step method. For example, the prototype of forward_prop_step would be:
    def forward_prop_step(x_t, x_f, s_t1_prev, s_t2_prev):
        # I have my code here
How can I change the theano.scan call above to pass both x and x2 as sequences? Can anyone briefly explain, with examples, how to change the parameters of the function given to theano.scan, and also its return values?
A few other questions:
(1) If I pass the n_steps parameter along with the sequences parameter, how does theano.scan execute? Does theano.scan then work like a nested (double) for loop?
(2) How is the non_sequences parameter different from the sequences parameter of theano.scan?
(3) Does theano.scan call the provided function for each element of the sequence? If it does, why is a print statement that I put inside forward_prop_step executed only once, even though the computation inside the function runs several times (over the entire sequence)? How does theano.scan repeatedly call the function provided to it?
Is theano.scan getting the 2nd and 3rd parameters from the 2nd and 3rd values of outputs_info?
-> Yes. If an element of outputs_info is not None, it is a recurrent output, so its value from the previous step has to be passed to the step function.
If I want to modify the above code and give one more parameter, x2, as a sequence to theano.scan, how should I modify the code?
-> You just need to add x2 to the list of sequences:
    [o, s, s2], updates = theano.scan(
        forward_prop_step,
        sequences=[x, x2],
        truncate_gradient=self.bptt_truncate,
        outputs_info=[None,
                      dict(initial=T.zeros(self.hidden_dim)),
                      dict(initial=T.zeros(self.hidden_dim))])
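The order in which parameters appear in the step function is: the current elements of sequences, then the previous values of the recurrent outputs (the entries of outputs_info that are not None), then non_sequences. Each group is present only if it is specified in the scan call.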
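To make that ordering concrete, here is a minimal pure-Python sketch. It is not Theano code and not how scan is implemented: scan_like is a hypothetical eager stand-in that only mimics the calling convention, and the extra parameter w and the arithmetic inside the step function are made up for illustration.

```python
def scan_like(fn, sequences=(), outputs_info=(), non_sequences=()):
    """Hypothetical eager stand-in for theano.scan's calling convention."""
    # Current value of each recurrent output (entries that are not None).
    state = [init for init in outputs_info if init is not None]
    collected = [[] for _ in outputs_info]
    for seq_items in zip(*sequences):
        # Argument order: sequence slices, recurrent outputs, non-sequences.
        results = fn(*seq_items, *state, *non_sequences)
        for out_list, value in zip(collected, results):
            out_list.append(value)
        # Feed back only the outputs whose outputs_info entry is not None.
        state = [r for r, init in zip(results, outputs_info) if init is not None]
    return collected


def forward_prop_step(x_t, x_f, s_t1_prev, s_t2_prev, w):
    s_t1 = s_t1_prev + w * x_t   # first recurrent state
    s_t2 = s_t2_prev + x_f       # second recurrent state
    o_t = s_t1 + s_t2            # non-recurrent output (outputs_info entry is None)
    return [o_t, s_t1, s_t2]


o, s, s2 = scan_like(
    forward_prop_step,
    sequences=[[1, 2, 3], [10, 20, 30]],   # x and x2
    outputs_info=[None, 0, 0],             # o is not fed back; s and s2 are
    non_sequences=[2],                     # w is passed unchanged at every step
)
print(o, s, s2)  # [12, 36, 72] [2, 6, 12] [10, 30, 60]
```

Note that the step function returns its outputs in the same order as outputs_info, and only the second and third return values come back as s_t1_prev and s_t2_prev on the next step.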
(1) If I pass the n_steps parameter along with the sequences parameter, how does theano.scan execute? Does theano.scan then work like a nested (double) for loop?
If n_steps is provided, scan iterates only that many times. If you are iterating over a tensor whose first dimension has 10 elements and n_steps is 4, scan iterates over only the first 4 'elements' of that tensor. It does not work like a nested loop.
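The effect can be pictured with a small pure-Python analogue. This is a sketch under assumptions, not Theano code: scan_like is a hypothetical helper, and the running-sum step function is only an example.

```python
from itertools import islice


def scan_like(fn, sequence, initial, n_steps=None):
    """Hypothetical eager stand-in: one flat loop, optionally cut short."""
    state = initial
    outputs = []
    # islice(seq, None) yields every item; islice(seq, 4) only the first 4.
    for x_t in islice(sequence, n_steps):
        state = fn(x_t, state)
        outputs.append(state)
    return outputs


# A sequence of 10 elements with n_steps=4: a single loop over 4 elements.
running_sum = scan_like(lambda x_t, acc: acc + x_t, range(10), 0, n_steps=4)
print(running_sum)  # [0, 1, 3, 6]
```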
(2) How is the non_sequences parameter different from the sequences parameter of theano.scan?
non_sequences are not iterated over by scan; the same values are passed to the step function at every iteration. Listing them is optional, since scan can usually work out on its own which variables the step function uses, but listing them explicitly is recommended because it makes the code clearer. In contrast, sequences specifies the variables scan should iterate over as it loops.
(3) Does theano.scan call the provided function for each element of the sequence? If it does, why is a print statement that I put inside forward_prop_step executed only once, even though the computation inside the function runs several times (over the entire sequence)? How does theano.scan repeatedly call the function provided to it?
theano.scan iterates over the first dimension of every element of sequences simultaneously, and at each iteration it evaluates the computation defined by the step function. If you want to print intermediate computations inside scan, you should use theano.printing.Print (see its documentation for details). The print statement is executed only once because of the way Theano works: it first builds a computation graph, calling the Python step function once with symbolic variables, and afterwards only this graph is executed with the actual values. Python's print cannot be made part of Theano's computation graph, and hence you see its output only once.
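The build-once, run-many behaviour can be imitated in plain Python. The sketch below is only an analogy, not Theano's implementation: Sym and variable are hypothetical helpers that record an expression instead of computing it, and the calls list stands in for a print statement so the number of calls can be checked.

```python
class Sym:
    """Hypothetical symbolic value: records an expression instead of computing."""

    def __init__(self, evaluate):
        self.evaluate = evaluate  # function from an environment dict to a number

    def __add__(self, other):
        return Sym(lambda env: self.evaluate(env) + other.evaluate(env))


def variable(name):
    """Placeholder whose value is looked up when the graph is executed."""
    return Sym(lambda env: env[name])


calls = []


def step(x_t, s_prev):
    calls.append("step called")  # stands in for a Python print statement
    return x_t + s_prev


# Graph construction: the step function runs once, on placeholders.
graph = step(variable("x_t"), variable("s_prev"))

# Execution: only the recorded expression is evaluated, once per element.
state = 0
for x in [1, 2, 3, 4]:
    state = graph.evaluate({"x_t": x, "s_prev": state})

print(state, len(calls))  # 10 1
```

The addition runs four times, but the body of step (and so anything like a print inside it) runs only once, during construction.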
I would suggest taking a deeper look at the documentation and at this tutorial.