I have data that I have been using with a Multi-Layer Perceptron architecture. It looks like this:
X_train_feature.shape
(52594, 16)
X_train_feature[0]
array([1.18867208e-03, 1.00000000e+00, 8.90000000e+01, 8.00000000e+00,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00])
y_train.shape
(52594, 2)
y_train[0].toarray()
array([[0., 1.]])
The first dimension is the number of samples. The second dimension is the number of features for X_train, and the one-hot encoded label for y_train.
I want to use the same data with an LSTM/Bi-LSTM, so I copied some code from the internet and changed the input shape to match the one I used for the MLP:
def define_model():
    model = Sequential()
    model.add(LSTM(20, input_shape=X_train_feature[0].shape, return_sequences=True))
    model.add(TimeDistributed(Dense(1, activation='sigmoid')))
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])  # compile
    print('Total params: ', model.count_params())
    return model
But when I try to create the model, I get an error about the input shape:
model = define_model()
ValueError: Input 0 is incompatible with layer lstm_30: expected ndim=3, found ndim=2
How should I adjust my data so that it works with an LSTM, or do I need to change the architecture configuration? Thank you so much.
An LSTM (unlike a perceptron) is not a feed-forward network: it needs a history to predict the next point. So each input sample to an LSTM should have shape (timesteps, num_features), meaning that each sample is a sequence of timesteps observations, where the cell state is initialized at the first observation of the sequence and carried through the entire sequence.
Therefore, the input tensor should have the shape (num_sequences, seq_length, num_features), where:
num_sequences: the number of samples, i.e. how many sequences you have to train the model on.
seq_length: how long these sequences are. For variable-length sequences, you can supply None.
num_features: how many features a single observation in a given sequence has.
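As a rough illustration of the shape described above, here is a minimal NumPy sketch of how a 2D array like the one in the question could be given the extra dimension. The random array is only a stand-in for X_train_feature, and both reshaping options are assumptions about what a "sequence" means for this data: which one is appropriate (if either) depends on whether the 16 columns are really ordered observations or just independent features.

```python
import numpy as np

# Stand-in for X_train_feature from the question: (num_samples, num_features).
# Random values are used here only to make the sketch runnable.
num_samples, num_features = 52594, 16
X_train_feature = np.random.rand(num_samples, num_features).astype("float32")

# Option A (assumption): treat each row as a sequence of length 1
# with 16 features per timestep -> (num_sequences, 1, 16).
X_seq_a = X_train_feature.reshape(num_samples, 1, num_features)

# Option B (assumption): treat each row as a sequence of 16 timesteps
# with 1 feature per timestep -> (num_sequences, 16, 1).
X_seq_b = X_train_feature.reshape(num_samples, num_features, 1)

print(X_seq_a.shape)
print(X_seq_b.shape)

# The per-sample shape (everything after the first dimension) is what the
# LSTM layer's input_shape refers to, e.g. X_seq_a.shape[1:] is (1, 16).
per_sample_shape = X_seq_a.shape[1:]
```

With either option the array has ndim=3, which is what the error message in the question is asking for. Note that with a sequence length of 1 (option A) the LSTM has no history to work with, so it is unlikely to offer much over the MLP.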