I have been trying to classify autism with a CNN model. The best accuracies reported in papers so far are around 70-73%, and my model has been getting around 65-70% with different parameters. I finally found a hyperparameter combination that gives 70%+ accuracy on a held-out test set (around 10% of the dataset, with 10% used for validation and 80% for training). I then decided to run a 10-fold cross-validation and watch each epoch with verbose=1. The first fold gave around 68-76% validation accuracy per epoch (25 epochs in total) and a score of 72%. However, from the second set of 25 epochs onward, the validation accuracy is around 98-100% and the training accuracy stays at 1.000. The third fold is similar, with 100% popping up. Is this normal? I haven't worked with this before; the code I used is a template for CNN k-fold cross-validation.
from sklearn.model_selection import KFold
import numpy as np
# data should be of shape (838, 392, 392, num_channels)
data = conn_matrices
# labels should be of shape (838,)
labels = y
# Initialize 10-fold cross-validation
kf = KFold(n_splits=10, shuffle=True, random_state=42)
# Create lists to store the results of each fold
fold_accuracies = []
# Perform cross-validation and store the results
for train_index, test_index in kf.split(data):
    X_train, X_test = data[train_index], data[test_index]
    y_train, y_test = labels[train_index], labels[test_index]
    # Define and compile your Keras-based CNN model
    # Replace 'your_cnn_model' with your actual model
    your_cnn_model = model
    # Train the model on the training data
    your_cnn_model.fit(X_train, y_train, epochs=25,
                       batch_size=32, validation_data=(X_test, y_test), verbose=1)
    # Evaluate the model on the test data
    accuracy = your_cnn_model.evaluate(X_test, y_test)[1]
    fold_accuracies.append(accuracy)
# Print the accuracy of each fold
for i, accuracy in enumerate(fold_accuracies):
print(f"Fold {i+1} Accuracy: {accuracy:.4f}")
# Calculate and print the mean accuracy and standard deviation of the results
mean_accuracy = np.mean(fold_accuracies)
std_deviation = np.std(fold_accuracies)
print(f"Mean Accuracy: {mean_accuracy:.4f}")
print(f"Standard Deviation: {std_deviation:.4f}")
I expected each run to have similar accuracies, around 70% up to a maximum of 76-77%.
You are giving the model the test data during training. That validation data typically ends up steering model parameters/hyperparameters (which epoch to stop at, which settings to keep), so of course the model overfits to it and gives over-optimistic scores when it is then evaluated on the same data it already knows:
# Train the model on the training data
your_cnn_model.fit(X_train, y_train, epochs=25, batch_size=32, validation_data=(X_test, y_test), verbose=1)
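On top of that, the line your_cnn_model = model reuses the same model object in every fold, so weights trained on earlier folds carry over, and later test folds contain samples the model has already trained on. That is exactly why accuracy jumps to ~100% from the second fold onward. Here is a minimal sketch of the fold loop without those leaks, assuming a hypothetical build_model() factory that returns a freshly compiled network each time:

from sklearn.model_selection import KFold
import numpy as np

kf = KFold(n_splits=10, shuffle=True, random_state=42)
fold_accuracies = []
for train_index, test_index in kf.split(data):
    X_train, X_test = data[train_index], data[test_index]
    y_train, y_test = labels[train_index], labels[test_index]
    # Build a fresh, freshly compiled model every fold so weights
    # from earlier folds cannot leak into the current test fold
    cnn = build_model()  # hypothetical factory returning a compiled Keras CNN
    # Carve the validation split out of the training data only;
    # the test fold stays unseen until evaluate()
    cnn.fit(X_train, y_train, epochs=25, batch_size=32,
            validation_split=0.1, verbose=1)
    accuracy = cnn.evaluate(X_test, y_test, verbose=0)[1]
    fold_accuracies.append(accuracy)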
You need to use nested cross-validation to tune hyperparameters without contaminating the test estimate: https://scikit-learn.org/stable/auto_examples/model_selection/plot_nested_cross_validation_iris.html
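A rough sketch of what nested CV looks like for this setup (the learning_rates grid and the build_model(lr) factory below are placeholders for whatever hyperparameters you are actually tuning): the inner loop selects hyperparameters using only the outer training folds, and the outer loop scores that choice on data that never influenced it.

from sklearn.model_selection import KFold
import numpy as np

outer = KFold(n_splits=10, shuffle=True, random_state=42)
inner = KFold(n_splits=3, shuffle=True, random_state=42)
learning_rates = [1e-3, 1e-4]  # placeholder hyperparameter grid

outer_scores = []
for tr_idx, te_idx in outer.split(data):
    X_tr, X_te = data[tr_idx], data[te_idx]
    y_tr, y_te = labels[tr_idx], labels[te_idx]
    # Inner loop: choose the hyperparameter using the outer training data only
    best_lr, best_score = None, -np.inf
    for lr in learning_rates:
        inner_scores = []
        for in_tr, in_val in inner.split(X_tr):
            m = build_model(lr)  # hypothetical factory, fresh model per fit
            m.fit(X_tr[in_tr], y_tr[in_tr], epochs=25, batch_size=32, verbose=0)
            inner_scores.append(m.evaluate(X_tr[in_val], y_tr[in_val], verbose=0)[1])
        if np.mean(inner_scores) > best_score:
            best_lr, best_score = lr, np.mean(inner_scores)
    # Retrain on the full outer training fold with the winning hyperparameter,
    # then score once on the untouched outer test fold
    m = build_model(best_lr)
    m.fit(X_tr, y_tr, epochs=25, batch_size=32, verbose=0)
    outer_scores.append(m.evaluate(X_te, y_te, verbose=0)[1])

print(f"Nested CV accuracy: {np.mean(outer_scores):.4f} +/- {np.std(outer_scores):.4f}")

This is slow (every hyperparameter is refit per inner fold), but it is the only way the outer accuracy estimate stays honest: nothing about the winning configuration was chosen by looking at the outer test fold.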