python numpy tensorflow keras lstm

Multivariate LSTM input shape


I am trying to build an LSTM model with multiple variables. My training data has shape (1132, 5), including the label, so I have 4 features. My question is how to reshape the data so that both the training and the test data include all of my features. I wrote the following code:

end_len = len(train_scaled)
X_train = []
y_train = []
timesteps = 40

for i in range(timesteps, end_len):
    X_train.append(train_scaled[i - timesteps:i, 0])
    y_train.append(train_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train) 

# train_scaled is the normalized data and has shape (1132, 5)

After running the code above, my X_train has shape (1092, 40). How did it go from 1132 to 1092?

I also did this to get it into a shape compatible with Keras:

# reshape data so Keras can take it namely [samples, timesteps, features].
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
print("X_train --> ", X_train.shape)
print("y_train shape --> ", y_train.shape)

and my X_train became (1092, 40, 1).

It did not pick up my features: the last dimension is only 1.


Solution

  • If your train_scaled has shape (1132, 5), you can split it into features and labels:

     features = train_scaled[:, 0:4]
     labels = train_scaled[:, 4:5]
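Building on that split, here is a minimal sketch of how the windowing loop from the question could keep all four features. It assumes the label sits in the last column (as the split above implies) and that each sample uses the previous 40 rows of features to predict the label at the current row. The helper name make_windows and the random stand-in data are hypothetical, for illustration only. It also shows where 1092 comes from: the loop runs from timesteps to end_len, so it produces 1132 - 40 = 1092 windows.

```python
import numpy as np


def make_windows(data, timesteps, n_features):
    """Build sliding windows over `data`.

    Returns X with shape (samples, timesteps, n_features)
    and y with shape (samples,).
    Assumes the label is stored in column index `n_features`.
    """
    X, y = [], []
    for i in range(timesteps, len(data)):
        # Slice every feature column, not just column 0.
        X.append(data[i - timesteps:i, :n_features])
        # Label for the row that follows the window.
        y.append(data[i, n_features])
    return np.array(X), np.array(y)


# Stand-in for train_scaled: random values with the shape from the question.
rng = np.random.default_rng(0)
train_scaled = rng.random((1132, 5))

X_train, y_train = make_windows(train_scaled, timesteps=40, n_features=4)
print("X_train --> ", X_train.shape)        # (1092, 40, 4)
print("y_train shape --> ", y_train.shape)  # (1092,)
```

Because each window is already a 2D slice of shape (40, 4), the stacked array comes out as [samples, timesteps, features] and no extra np.reshape call is needed. The original loop sliced only column 0, which is why the last dimension ended up as 1.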