I am a Korean student studying natural language processing.
When I feed a sentence into an LSTM, I want to pass it in as a 3-dimensional array.
I'm going to split the sentence 나는 집에서 친구와 밥을 먹었다. ("I ate a meal with a friend at home.") into words and morphemes,
like [['나', '는'],['집','에서'],['친구','와'],['밥','을'],['먹','었','다','.']]
and then convert each morpheme to a vector (embedding), like:
[[ [1,0,0,0, ... ], [0,1,0,0, ... ] ], [ [0,0,1,0, ... ], [0,0,0,1, ...] ], [...], ...]
So a single sequence (one sentence) is already a 3-dimensional array: words x morphemes x vector dimension.
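To make the structure concrete, a minimal sketch in plain Python could look like the following. The `vocab` dictionary and the `one_hot` helper are hypothetical names introduced only for illustration, and the vocabulary is built from this single sentence, which a real setup would not do.

```python
# Morpheme-segmented sentence from the example above.
sentence = [['나', '는'], ['집', '에서'], ['친구', '와'], ['밥', '을'],
            ['먹', '었', '다', '.']]

# Hypothetical vocabulary built from this one sentence only; a real one
# would be collected from the whole training corpus.
vocab = {}
for word in sentence:
    for morpheme in word:
        vocab.setdefault(morpheme, len(vocab))


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at position `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


# Nested structure: words x morphemes x vocabulary size.
encoded = [[one_hot(vocab[m], len(vocab)) for m in word] for word in sentence]

print(len(encoded))               # number of words in the sentence
print([len(w) for w in encoded])  # morphemes per word
print(len(encoded[0][0]))         # length of each one-hot vector
```

Note that the middle dimension is ragged: the last word has four morphemes while the others have two, so this nested list cannot be turned into a rectangular tensor without padding or pooling.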
How do I feed this 3-dimensional array into a single LSTM?
Or is this even possible? Sorry if this is a stupid question.
I don't know how to change my model to accept it. Here is my current code:
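    import torch.nn as nn
    import torch.nn.functional as F


    class Tagger(nn.Module):
        def __init__(self,
                     embedding_dim=43,
                     n_layers=4,
                     dropout=0.2,
                     output_dim=11,
                     bidirectional=False):
            super(Tagger, self).__init__()
            # The LSTM keeps the hidden size equal to the input embedding size.
            self.lstm = nn.LSTM(embedding_dim, embedding_dim,
                                num_layers=n_layers, bidirectional=bidirectional)
            # A bidirectional LSTM concatenates both directions, doubling the size.
            self.hidden_dim = (embedding_dim * 2) if bidirectional else embedding_dim
            self.linear = nn.Linear(self.hidden_dim, output_dim)
            self.dropout = nn.Dropout(dropout)

        def forward(self, embedded):
            # embedded: (seq_len, batch, embedding_dim), the default nn.LSTM layout
            lstmed, _ = self.lstm(embedded)
            outputs = self.linear(self.dropout(lstmed))
            # Normalize over the tag dimension, which is the last axis.
            scores = F.log_softmax(outputs, dim=-1)
            return scores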
First of all, please write in English. You are more likely to get an answer that way.
Also, see here; it could be helpful.
One-hot encoding is a popular way to represent unordered categories as a matrix. Search for 'one hot encoding' in Korean on Google; you should be able to find explanations.
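To connect one-hot encoding back to the question: `nn.LSTM` takes one vector per time step, so the morpheme axis has to be collapsed somehow before the LSTM sees it. The sketch below shows one possible way, summing the one-hot vectors of the morphemes inside each word. That pooling choice, the toy `vocab`, the `encode_word` helper, and the hidden size of 8 are all assumptions made for illustration, not something stated in the question.

```python
import torch
import torch.nn as nn

sentence = [['나', '는'], ['집', '에서'], ['친구', '와'], ['밥', '을'],
            ['먹', '었', '다', '.']]

# Hypothetical vocabulary built from this one sentence only.
vocab = {}
for word in sentence:
    for morpheme in word:
        vocab.setdefault(morpheme, len(vocab))
vocab_size = len(vocab)


def encode_word(word):
    """Sum the one-hot vectors of a word's morphemes into a single vector."""
    indices = torch.tensor([vocab[m] for m in word])
    one_hots = nn.functional.one_hot(indices, num_classes=vocab_size).float()
    return one_hots.sum(dim=0)  # shape: (vocab_size,)


# One vector per word, then a batch axis of size 1:
# (seq_len, batch, input_size) is the default layout for nn.LSTM.
inputs = torch.stack([encode_word(w) for w in sentence]).unsqueeze(1)

lstm = nn.LSTM(input_size=vocab_size, hidden_size=8)
outputs, _ = lstm(inputs)

print(inputs.shape)   # one time step per word
print(outputs.shape)  # one hidden vector per word
```

Another option, if one tag per morpheme is wanted instead of one per word, is to flatten the sentence into a single sequence of morphemes and feed that to the LSTM directly.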