I'm not sure why I'm getting an error with my LSTM neural network. It seems to be related to the input shape.
This is my neural network architecture:
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
model = Sequential()
# Recurrent layer
model.add(LSTM(64, return_sequences=False,
               dropout=0.1, recurrent_dropout=0.1))
# Fully connected layer
model.add(Dense(64, activation='relu'))
# Dropout for regularization
model.add(Dropout(0.5))
# Output layer
model.add(Dense(y_train.nunique(), activation='softmax'))
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
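As far as I understand, declaring input_shape on the first layer makes Keras resolve the shapes at build time, so mismatches surface before fit. A minimal sketch of just that first layer, assuming each sample is one well with 20196 timesteps and 30 features:

from keras.models import Sequential
from keras.layers import LSTM

# Sketch: the same recurrent layer, but with an explicit input_shape,
# assuming 20196 timesteps and 30 features per well.
m = Sequential()
m.add(LSTM(64, return_sequences=False,
           dropout=0.1, recurrent_dropout=0.1,
           input_shape=(20196, 30)))
m.summary()  # prints (None, 64) as the output shape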
This is how I train it:
history = model.fit(X_train_padded, y_train_padded,
                    batch_size=2048, epochs=150,
                    validation_data=(X_test_padded, y_test_padded))
This is the shape of my input data:
print(X_train_padded.shape, X_test_padded.shape, y_train_padded.shape, y_test_padded.shape)
(98, 20196, 30) (98, 4935, 30) (98, 20196, 1) (98, 4935, 1)
This is part of my X_train_padded:
X_train_padded
array([[[ 2.60352379e-01, -1.66420518e-01, -3.12893162e-01, ...,
-1.51210476e-01, -3.56188897e-01, -1.02761131e-01],
[ 1.26103191e+00, -1.66989382e-01, -3.13025807e-01, ...,
6.61329839e+00, -3.56188897e-01, -1.02761131e-01],
[ 1.04418243e+00, -1.66840157e-01, -3.12994596e-01, ...,
-1.51210476e-01, -3.56188897e-01, -1.02761131e-01],
...,
[ 1.27399408e+00, -1.66998426e-01, -3.13025807e-01, ...,
6.61329839e+00, -3.56188897e-01, -1.02761131e-01],
This is the error that I'm getting:
Epoch 1/150
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-52-52422b54faa4> in <module>
----> 1 history = model.fit(X_train_padded, y_train_padded,
2 batch_size=2048, epochs=150,
3 validation_data=(X_test_padded, y_test_padded))
...
ValueError: Shapes (None, 20196) and (None, 12) are incompatible
As I'm using an LSTM layer, I have a 3D input shape. My output layer has 12 nodes (y_train.nunique()) because I have 12 different classes in my data. Given that I have 12 classes, I'm using softmax as the activation function in my output layer and categorical_crossentropy as my loss function.
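My understanding is that categorical_crossentropy expects one-hot labels whose last dimension matches the 12 softmax outputs. A minimal sketch of that conversion with keras.utils.to_categorical, assuming integer-encoded class labels 0-11 (the example labels here are made up):

import numpy as np
from keras.utils import to_categorical

# Hypothetical integer-encoded labels, one per sample.
y_int = np.array([0, 3, 11, 5])
y_onehot = to_categorical(y_int, num_classes=12)
print(y_onehot.shape)  # (4, 12): one 12-way one-hot vector per sample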
EDIT:
Let me try to explain my dataset better:
I'm dealing with geological wells. My samples are different types of sedimentary rock layers, where the features are the rocks' properties (such as gamma ray emission) and the label is the rock type (such as limestone). One of my features is the depth of the layer.
The idea behind using an LSTM in this case is to treat the depth of a well as a sequence, so that the previous sedimentary layer (rock) helps to predict the next sedimentary layer (rock). A rough sketch of that framing is below.
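This is only a sketch of the framing I have in mind, not working code; it assumes one well per sample, a label per depth step, and one-hot encoded labels:

from keras.models import Sequential
from keras.layers import LSTM, Dense

# Per-depth-step classification: return_sequences=True emits one output
# per timestep, and the Dense layer is then applied at every step.
sketch = Sequential()
sketch.add(LSTM(64, return_sequences=True, input_shape=(None, 30)))
sketch.add(Dense(12, activation='softmax'))
sketch.compile(optimizer='adam', loss='categorical_crossentropy')
# Expects X of shape (wells, timesteps, 30) and y of shape
# (wells, timesteps, 12).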
How did I get to my input shape:
I have a total of 98 wells in my dataset. I split the dataset with: X_train_init, X_test_init, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0). The well with the most samples (layers) in the training set has 20196 samples; the wells with fewer samples were padded with zeros so that they also had 20196 samples. Likewise, the well with the most samples in the test set has 4935 samples, and the other test wells were zero-padded up to 4935 samples. After removing the well feature and the depth feature (among other features), I ended up with 30 features in total. My y_train and y_test have only 1 column, which represents the label.
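For reference, the zero-padding step looks roughly like this (a sketch only; wells here is a hypothetical list of per-well arrays of shape (n_layers, 30), and pad_sequences comes from Keras):

import numpy as np
from keras.preprocessing.sequence import pad_sequences

# Hypothetical example: two wells with different numbers of layers.
wells = [np.random.rand(100, 30), np.random.rand(250, 30)]
X_padded = pad_sequences(wells, padding='post', dtype='float32')
print(X_padded.shape)  # (2, 250, 30): every well zero-padded to max length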
I guess my problem is actually getting this dataset to work in an LSTM. Most of the examples I see don't have 98 different time series; they just have one. I'm not really sure how to deal with 98 different time series (wells).
This won't work as is. Apart from the batch size, every other input dimension should be the same between the training and validation data. Also, your input dimensions are mixed up. For example:
print(X_train_padded.shape, # (98, 20196, 30)
X_test_padded.shape, # (98, 4935, 30)
y_train_padded.shape, # (98, 20196, 1)
y_test_padded.shape) # (98, 4935, 1)
From what I see, the first dimension is supposed to represent the total number of samples (in X_train, y_train, X_test, and y_test), but in your case the samples are on the second dimension. The first dimension should be in second place. In other words, the shapes should be:
print(X_train_padded.shape, # (20196, 98, 30)
X_test_padded.shape, # (4935, 98, 30)
y_train_padded.shape, # (20196, 98, 1)
y_test_padded.shape) # (4935, 98, 1)
This will put everything in the right place. You just need to trace back how you arrived at the wrong dimensions and change that part.
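For instance, if the padded arrays are already built as above, a minimal NumPy sketch of the axis swap would be:

import numpy as np

# Swap the first two axes, e.g. (98, 20196, 30) -> (20196, 98, 30).
X_train_padded = np.transpose(X_train_padded, (1, 0, 2))
X_test_padded = np.transpose(X_test_padded, (1, 0, 2))
y_train_padded = np.transpose(y_train_padded, (1, 0, 2))
y_test_padded = np.transpose(y_test_padded, (1, 0, 2))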