Tags: nlp, pytorch, recurrent-neural-network

Bidirectional RNN implementation in PyTorch


Hi, I am trying to understand bidirectional RNNs. Here is my model:

```python
import torch
import torch.nn as nn

class RNN(nn.Module):

    def __init__(self, n_vocab, n_embed, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        # n_vocab is the number of unique words in the dictionary;
        # n_embed is a hyperparameter
        self.embedding = nn.Embedding(n_vocab + 1, n_embed)
        self.rnn = nn.RNN(n_embed, hidden_size, num_layers=1,
                          batch_first=True, bidirectional=True)
        # a bidirectional RNN produces 2 * hidden_size features per step
        self.fc = nn.Linear(hidden_size * 2, output_size)

    def forward(self, x):
        # x: batch_size x seq_length
        batch_size = x.size(0)
        x = self.embedding(x)    # batch_size x seq_length x n_embed
        x, hidden = self.rnn(x)  # batch_size x seq_length x (2 * hidden_size)
        return x, hidden
```

I am returning both the hidden state and the output. While going through tutorials, some say I need to concatenate the hidden states (torch.cat((hidden[-2,:,:], hidden[-1,:,:]), dim = 1)), while others take the last output (x[:,-1,:]), but the two give different results.

What is the correct way of doing this with a bidirectional RNN?


Solution

  • Both ways are used in practice, but they are not the same thing, which is why your results differ. When nn.RNN is bidirectional (as it is in your case), the hidden state it returns has shape (num_layers * num_directions, batch, hidden_size). For your single-layer network, hidden[-2] is the final state of the forward (left-to-right) direction and hidden[-1] is the final state of the backward (right-to-left) direction, so torch.cat((hidden[-2,:,:], hidden[-1,:,:]), dim = 1) concatenates each direction's true final state. The output x, on the other hand, has shape (batch, seq_length, 2 * hidden_size), and x[:,-1,:] concatenates both directions at the last time step; for the backward direction that is its *first* step, where it has only seen the last token. The exact correspondence is x[:,-1,:hidden_size] == hidden[-2] and x[:,0,hidden_size:] == hidden[-1]. For classification you usually want the concatenated final states, i.e. the torch.cat version.
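
A minimal sketch to verify these correspondences with a standalone nn.RNN (the sizes and the seed are arbitrary, chosen only for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch, seq_len, n_embed, hidden_size = 2, 5, 8, 4
rnn = nn.RNN(n_embed, hidden_size, num_layers=1,
             batch_first=True, bidirectional=True)

x = torch.randn(batch, seq_len, n_embed)
output, hidden = rnn(x)
# output: (batch, seq_len, 2 * hidden_size); hidden: (2, batch, hidden_size)

# Forward direction's final state == output at the LAST step (first half).
assert torch.allclose(hidden[-2], output[:, -1, :hidden_size])
# Backward direction's final state == output at the FIRST step (second half).
assert torch.allclose(hidden[-1], output[:, 0, hidden_size:])

# The two common pooling choices therefore differ in general:
h_cat = torch.cat((hidden[-2], hidden[-1]), dim=1)  # final state of each direction
last_step = output[:, -1, :]                        # both directions at the last step
print(torch.allclose(h_cat, last_step))             # False (except by coincidence)
```

In the model above, a classification head would typically use the concatenated final states, e.g. return self.fc(torch.cat((hidden[-2], hidden[-1]), dim=1)), which matches the nn.Linear(hidden_size * 2, output_size) layer.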