I have a tensor of shape (batch_size, max_sequence_length, embedding_size) that is padded to the maximum length to store sequences. I also have a tensor of shape (batch_size, max_sequence_length, vocab), for example:
# Batch size many, (batch_size, 4, 8)
[2,4,1,4]
[7,4,2,0]
[6,0,0,0]
# Using EmbedID(ignore_label=0) to get (batch_size, 4, embedding_size)
How can we pass this to, for example, an NStepGRU link in Chainer and obtain the final hidden state of all the sequences, with shape (batch_size, embedding_size)?
NStepGRU accepts a batch of sequences as a list in which each element has shape (sequence_length, embedding_size). Note that padding is not needed here; each element can have a different length.
If you have a tensor x of shape (batch_size, max_sequence_length, embedding_size) and a list lengths holding the length of each sequence, you can pass [x[i, :l] for i, l in enumerate(lengths)] to NStepGRU.
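As a rough sketch of that slicing step, using the token IDs from the question: this assumes 0 is the padding ID and that padding only appears at the end of each row. The name `ids`, the random stand-in for the embedded tensor, and `embedding_size = 5` are illustrative choices, not something from the question.

```python
import numpy as np

# Token IDs from the question; 0 is assumed to be the padding ID.
ids = np.array([[2, 4, 1, 4],
                [7, 4, 2, 0],
                [6, 0, 0, 0]], dtype=np.int32)

# Number of non-padding tokens per row (valid only for trailing padding).
lengths = (ids != 0).sum(axis=1)

embedding_size = 5  # hypothetical size, for illustration only
# Stand-in for the EmbedID output of shape
# (batch_size, max_sequence_length, embedding_size).
x = np.random.rand(ids.shape[0], ids.shape[1], embedding_size).astype(np.float32)

# Drop the padded time steps: one (sequence_length, embedding_size) array per sequence.
xs = [x[i, :l] for i, l in enumerate(lengths)]
shapes = [seq.shape for seq in xs]
```

The resulting `xs` is a plain Python list, so the sequences do not need to share a length.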
NStepGRU returns hs, the final hidden state, and ys, the output of the last layer. Since NStepGRU may contain multiple layers, the final hidden state is provided for each layer; i.e., hs has the shape (num_layers, batch_size, embedding_size), assuming the GRU's output size equals embedding_size. If you are using a single-layer NStepGRU, hs[0] is the final hidden state of shape (batch_size, embedding_size).