I don't understand why the number of input and output dimensions is config.hidden_dim * 2 when applying a fully connected layer in the Encoder class (the last line of the snippet below)?
class Encoder(nn.Module):
    def __init__(self):
        super(Encoder, self).__init__()
        self.embedding = nn.Embedding(config.vocab_size, config.emb_dim)
        init_wt_normal(self.embedding.weight)
        self.lstm = nn.LSTM(
            config.emb_dim, config.hidden_dim, num_layers=1,
            batch_first=True, bidirectional=True)
        init_lstm_wt(self.lstm)
        self.W_h = nn.Linear(
            config.hidden_dim * 2, config.hidden_dim * 2, bias=False)
The code is taken from https://github.com/atulkum/pointer_summarizer/blob/master/training_ptr_gen/model.py. Could someone please explain?
The reason is that the LSTM layer is bidirectional, i.e., there are in fact two LSTMs, each processing the input from one direction. Each of them returns vectors of dimension config.hidden_dim, and the two outputs get concatenated into vectors of dimension 2 * config.hidden_dim. The linear layer W_h therefore has to accept (and here also produce) vectors of that concatenated size.
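Here is a minimal shape-only sketch illustrating this (the values emb_dim=128 and hidden_dim=256 are made up for the example, not taken from the repo's config):

import torch
import torch.nn as nn

emb_dim, hidden_dim = 128, 256  # illustrative values only

lstm = nn.LSTM(emb_dim, hidden_dim, num_layers=1,
               batch_first=True, bidirectional=True)
W_h = nn.Linear(hidden_dim * 2, hidden_dim * 2, bias=False)

x = torch.randn(4, 10, emb_dim)      # (batch, seq_len, emb_dim)
outputs, (h, c) = lstm(x)

# The forward and backward LSTMs each produce hidden_dim features per
# time step; they are concatenated along the last axis, so the encoder
# output is 2 * hidden_dim wide and W_h must match that width.
print(outputs.shape)       # torch.Size([4, 10, 512])
print(W_h(outputs).shape)  # torch.Size([4, 10, 512])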