I am training a many-to-many model using an LSTM RNN. I want to apply cross-validation to improve the result quality, but I cannot use the 'metrics.mean_squared_error' function directly because the system is multivariate and my targets are 3D arrays. Should I write a cross-validation function manually, or can I use this function with 3D arrays?
Here are the shapes of my train and test data:
X_train1.shape, y_train1.shape, X_test1.shape, y_test1.shape
((118000, 50, 9), (118000, 1, 9), (51950, 50, 9), (51950, 1, 9))
And here is the code I have tried:
import numpy as np
import pandas as pd
from sklearn import metrics
from sklearn.model_selection import KFold
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
# Cross-Validate
kf = KFold(5, shuffle=True, random_state=42) # Use for KFold classification
oos_y = []
oos_pred = []
fold = 0
for train, test in kf.split(trainX):
    fold+=1
    print(f"Fold #{fold}")
    x_train = X_train1
    y_train = y_train1
    x_test = X_test1
    y_test = y_test1
    model = Sequential()
    model.add(LSTM(128, activation='relu', input_shape=(X_train1.shape[1], X_train1.shape[2]), return_sequences=True))
    model.add(LSTM(64, activation='relu', return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(y_train1.shape[2]))
    model.compile(optimizer='adam', loss='mse', metrics='mae')
    model.summary()
    history = model.fit(X_train1, y_train1, epochs=1, batch_size=16, validation_split=0.1, verbose=1)
    pred = model.predict(x_test)
    oos_y.append(y_test)
    oos_pred.append(pred)
    # Measure this fold's RMSE
    score = np.sqrt(metrics.mean_squared_error(pred,y_test))
    print(f"Fold score (RMSE): {score}")
# Build the oos prediction list and calculate the error.
oos_y = np.concatenate(oos_y)
oos_pred = np.concatenate(oos_pred)
score = np.sqrt(metrics.mean_squared_error(oos_pred,oos_y))
print(f"Final, out of sample score (RMSE): {score}")
# Write the cross-validated prediction
oos_y = pd.DataFrame(oos_y)
oos_pred = pd.DataFrame(oos_pred)
oosDF = pd.concat( [df, oos_y, oos_pred],axis=1 )
#oosDF.to_csv(filename_write,index=False)
If the y_true and y_pred shapes are (51950, 1, 9), reshape them into (51950, 9) and compute the RMSE with:
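from sklearn.metrics import mean_squared_error
rmse = mean_squared_error(
    y_true.reshape(-1, 1*9),
    y_pred.reshape(-1, 1*9),
    # False returns RMSE instead of MSE. Newer scikit-learn releases
    # deprecate this argument in favour of root_mean_squared_error.
    squared=False,
)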
Or:
ex, d1, d2 = y_true.shape
y_true = y_true.reshape(ex, d1*d2)
y_pred = y_pred.reshape(ex, d1*d2)  # predictions need the same 2D shape
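To show how the reshape fits into a cross-validation loop, here is a minimal sketch, not a definitive implementation. It assumes small synthetic arrays with the same layout as in the question, and it replaces the LSTM with a hypothetical stand-in, `persistence_predict`, so that it runs without TensorFlow. The helper `flat_rmse` is also an illustrative name. Unlike the loop in the question, it slices the data with the indices that `KFold.split` returns, and it takes `np.sqrt` of the MSE instead of passing `squared=False`.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold


def flat_rmse(y_true, y_pred):
    """RMSE over all samples and outputs, after flattening both arrays to 2D."""
    n = y_true.shape[0]
    return np.sqrt(mean_squared_error(y_true.reshape(n, -1), y_pred.reshape(n, -1)))


def persistence_predict(x):
    """Hypothetical stand-in for model.predict: repeat the last time step."""
    return x[:, -1, :]


# Synthetic data (an assumption): X is (samples, timesteps, features),
# y is (samples, 1, features), mirroring the shapes in the question.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 50, 9))
y = X[:, -1:, :] + 0.1 * rng.normal(size=(200, 1, 9))

kf = KFold(n_splits=5, shuffle=True, random_state=42)
oos_y, oos_pred = [], []

for fold, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    # Index along the sample axis; this works for 3D arrays as well.
    x_tr, y_tr = X[train_idx], y[train_idx]
    x_te, y_te = X[test_idx], y[test_idx]

    # A real run would build and fit a fresh LSTM on (x_tr, y_tr) here.
    pred = persistence_predict(x_te)  # shape (n_test, 9)

    oos_y.append(y_te)
    oos_pred.append(pred)
    print(f"Fold #{fold} RMSE: {flat_rmse(y_te, pred):.4f}")

# Pool the held-out targets and predictions from all folds.
oos_y = np.concatenate(oos_y)
oos_pred = np.concatenate(oos_pred)
final_rmse = flat_rmse(oos_y, oos_pred)
print(f"Out-of-sample RMSE: {final_rmse:.4f}")
```

With an actual Keras model, the stand-in call would be replaced by building, fitting and calling `model.predict(x_te)` inside the loop, so that each fold starts from freshly initialised weights.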