I am trying to train this model:
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

length = 60
n_features = X_train_s.shape[1]
batch_size = 1

early_stop = EarlyStopping(monitor='val_accuracy', mode='max', verbose=1, patience=5)

generator = TimeseriesGenerator(data=X_train_s,
                                targets=Y_train[['TARGET_KEEP_LONG',
                                                 'TARGET_KEEP_SHORT',
                                                 'TARGET_STAY_FLAT']],
                                length=length,
                                batch_size=batch_size)

RNN_model = Sequential()
RNN_model.add(LSTM(180, activation='relu', input_shape=(length, n_features)))
RNN_model.add(Dense(3))
RNN_model.compile(optimizer='adam', loss='binary_crossentropy')

validation_generator = TimeseriesGenerator(data=X_test_s,
                                           targets=Y_test[['TARGET_KEEP_LONG',
                                                           'TARGET_KEEP_SHORT',
                                                           'TARGET_STAY_FLAT']],
                                           length=length,
                                           batch_size=batch_size)

RNN_model.fit(generator,
              epochs=20,
              validation_data=validation_generator,
              callbacks=[early_stop])
I get the error "KeyError: 60", where 60 is actually the value of the variable length (if I change length, the error changes accordingly).
The shapes of the data sets are:
X_test_s.shape
(114125, 89)
and the same for X_train_s.shape, so n_features == 89.
It was exhausting to track down the cause because of the poor and misleading error message. The problem was the form of the target data set: TimeseriesGenerator does not accept pandas DataFrames, only NumPy arrays. Therefore this
generator = TimeseriesGenerator(data=X_train_s,
                                targets=Y_train[['TARGET_KEEP_LONG', 'TARGET_KEEP_SHORT', 'TARGET_STAY_FLAT']],
                                length=length,
                                batch_size=batch_size)
should have been written as
generator = TimeseriesGenerator(X_train_s,
                                Y_train[['TARGET_KEEP_LONG', 'TARGET_KEEP_SHORT', 'TARGET_STAY_FLAT']].to_numpy(),
                                length=length,
                                batch_size=batch_size)
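The mechanism behind the cryptic message appears to be that TimeseriesGenerator pulls each target with a plain integer index (something like targets[row]). On a NumPy array, or on a Series with its default RangeIndex, that selects a row; on a DataFrame it is a column lookup, so the first requested row index (which equals length, i.e. 60) surfaces as "KeyError: 60". A minimal sketch with made-up data illustrating the difference:

import numpy as np
import pandas as pd

# Toy stand-in for Y_train (hypothetical values, same three target columns)
df = pd.DataFrame({'TARGET_KEEP_LONG': np.zeros(100),
                   'TARGET_KEEP_SHORT': np.zeros(100),
                   'TARGET_STAY_FLAT': np.ones(100)})

df.to_numpy()[60]            # works: the row at position 60
df['TARGET_KEEP_LONG'][60]   # works: a Series with a RangeIndex maps 60 to a row
df[60]                       # KeyError: 60 -- pandas looks for a column named 60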
In the case of just one target, it was enough to write
generator = TimeseriesGenerator(data = X_train_s, targets = Y_train['TARGET_KEEP_LONG'], length = length, batch_size = batch_size)
i.e. just one level of square brackets, not two.
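Even then, an explicit conversion avoids relying on the Series index lining up with row positions; a hedged variant of the same call:

generator = TimeseriesGenerator(X_train_s,
                                Y_train['TARGET_KEEP_LONG'].to_numpy(),
                                length=length,
                                batch_size=batch_size)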