Tags: python, keras, cross-validation, data-augmentation

How to visualize the fit of a cross-validated model


How do I write code to visualize how accuracy and loss develop over training when using cross validation? Normally I assign the return value of the fit function to a variable named 'history' when training the model, but in the cross-validation case the validation curves are not available. I assume this is because I am not passing validation_data to the fit function (below).

import numpy as np
from sklearn.model_selection import StratifiedKFold
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
cvscores = []
for train, test in kfold.split(x_train, y_train):
    model = Sequential()
    model.add(layers.Conv2D(32, (4, 4), activation='relu', input_shape=(224, 224, 3)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (4, 4), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(1, activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    history = model.fit(x_train[train], y_train[train], epochs=15, batch_size=64)

    scores = model.evaluate(x_train[test], y_train[test], verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
    cvscores.append(scores[1] * 100)
print("%.2f%% (+/- %.2f%%)" % (np.mean(cvscores), np.std(cvscores)))

Normally I would use code like the following, but since fit is not given any validation data here, I am not sure how to approach it.

import matplotlib.pyplot as plt

# Depending on the Keras version, the keys may be 'accuracy'/'val_accuracy'
# rather than 'acc'/'val_acc'.
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()



Solution

  • You can log everything with TensorBoard. Generally, you make the following splits: train, validation, and test, and you evaluate the final model on that third split. You can also compute metrics with sklearn. Most of the time people don't cross-validate their DNN models because it takes too long; however, once you have the per-fold models, it is nice to plot the distribution of their metrics with a boxplot.
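
    Below is a minimal sketch of one way to put this together; it is my own illustration, not code from the answer, and it assumes x_train, y_train, and seed are defined as in the question. It passes each fold's held-out split to fit as validation_data, so history then contains the val_* curves and the TensorBoard callback has per-fold logs to write, collects the per-fold histories for plotting, and ends with the boxplot of per-fold scores mentioned above. Reusing the evaluation split as validation data is a shortcut; for a strict train/validation/test protocol you would carve a separate validation subset out of each fold's training data.

    # Hypothetical sketch, assuming x_train, y_train, and seed exist as in the question.
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.model_selection import StratifiedKFold
    from tensorflow.keras import layers
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.callbacks import TensorBoard

    kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    histories, cvscores = [], []

    for fold, (train, test) in enumerate(kfold.split(x_train, y_train)):
        model = Sequential([
            layers.Conv2D(32, (4, 4), activation='relu', input_shape=(224, 224, 3)),
            layers.MaxPooling2D((2, 2)),
            layers.Conv2D(64, (4, 4), activation='relu'),
            layers.MaxPooling2D((2, 2)),
            layers.Flatten(),
            layers.Dense(64, activation='relu'),
            layers.Dropout(0.5),
            layers.Dense(1, activation='sigmoid'),
        ])
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

        # Passing the held-out fold as validation_data makes fit record val_* metrics;
        # the TensorBoard callback also writes them to logs/fold_<n> for the TensorBoard UI.
        history = model.fit(
            x_train[train], y_train[train],
            validation_data=(x_train[test], y_train[test]),
            epochs=15, batch_size=64,
            callbacks=[TensorBoard(log_dir=f'logs/fold_{fold}')],
        )
        histories.append(history)

        scores = model.evaluate(x_train[test], y_train[test], verbose=0)
        cvscores.append(scores[1] * 100)

    # Overlay the per-fold validation accuracy curves on one figure.
    for fold, history in enumerate(histories):
        # Older Keras versions use the key 'val_acc' instead of 'val_accuracy'.
        plt.plot(history.history['val_accuracy'], label=f'fold {fold}')
    plt.title('validation accuracy per fold')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.legend()
    plt.show()

    # Distribution of the final per-fold scores, as suggested in the answer.
    plt.boxplot(cvscores)
    plt.ylabel('accuracy (%)')
    plt.title('cross-validation accuracy')
    plt.show()

    Alternatively, you can launch TensorBoard (tensorboard --logdir logs) and compare the per-fold runs interactively instead of plotting them with matplotlib.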