I am pretty new to machine learning, and I am trying to use Google Colab with TensorFlow/Keras to train an image classification model using transfer learning (ResNet50).
I started by loading the images as datasets, using the following code:
data_root = '/tmp/OCT2017'
batch_size = 32
img_height = 160
img_width = 160
data_train = tf.keras.preprocessing.image_dataset_from_directory(
    data_root + '/train',
    labels='inferred',
    image_size=(img_height, img_width),
    batch_size=batch_size)
For small testing datasets, this worked pretty well, and I got both good accuracy and good predictions. But while trying to use larger datasets, all the RAM provided by Colab was consumed, so I switched to generators, using:
data_generator = tf.keras.preprocessing.image.ImageDataGenerator()
data_train_gen = data_generator.flow_from_directory(
    data_root + '/train',
    target_size=(img_height, img_width),
    class_mode='sparse',
    batch_size=batch_size,
    shuffle=False)
and trained the model using:
base_learning_rate = 0.0001
model.compile(optimizer=tf.keras.optimizers.Adam(lr=base_learning_rate),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
with tf.device('/device:GPU:0'):
    epochs = 10
    history = model.fit(
        data_train_gen,
        validation_data=data_val_gen,
        epochs=epochs,
        callbacks=[csv_logger]
    )
I got good accuracy using this setup:
model.evaluate(data_test)
31/31 [==============================] - 3s 93ms/step - loss: 0.0925 - accuracy: 0.9742
[0.09248838573694229, 0.9741735458374023]
However, when I asked for predictions in order to build a confusion matrix, I got awful results:
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
y_pred = model.predict(data_test)
predicted_categories = tf.argmax(y_pred, axis=1)
true_categories = tf.concat([y for x, y in data_test_gen], axis=0)
cm = confusion_matrix(predicted_categories, true_categories)
heatmap = sns.heatmap(cm, annot=True, cmap='YlGn', xticklabels=['CNV','DME','DRUSEN','NORMAL'],yticklabels=['CNV','DME','DRUSEN','NORMAL'])
plt.xlabel("True Labels")
plt.ylabel("Predictions")
plt.show()
The predictions were around 40% correct. The confusion matrix appeared completely random:
classification_report(true_categories, predicted_categories, target_names=class_names, output_dict=True)
{'CNV': {'f1-score': 0.256198347107438, 'precision': 0.256198347107438, 'recall': 0.256198347107438, 'support': 242},
 'DME': {'f1-score': 0.23236514522821577, 'precision': 0.23333333333333334, 'recall': 0.23140495867768596, 'support': 242},
 'DRUSEN': {'f1-score': 0.25311203319502074, 'precision': 0.25416666666666665, 'recall': 0.25206611570247933, 'support': 242},
 'NORMAL': {'f1-score': 0.2827868852459016, 'precision': 0.2804878048780488, 'recall': 0.28512396694214875, 'support': 242},
 'accuracy': 0.256198347107438,
 'macro avg': {'f1-score': 0.256115602694144, 'precision': 0.25604653799637167, 'recall': 0.256198347107438, 'support': 968},
 'weighted avg': {'f1-score': 0.256115602694144, 'precision': 0.2560465379963717, 'recall': 0.256198347107438, 'support': 968}}
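You do not show your test generator, but make sure it was created with shuffle=False. Otherwise the generator yields the samples in a different order on every pass, so the predictions from model.predict and the labels collected in a separate loop no longer line up, even though model.evaluate still reports good accuracy. Also, what is the difference between data_test and data_test_gen? Your code predicts on one and reads the labels from the other. If your test generator is called data_test, you can get the true labels from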
true_categories = data_test.labels
Then use these in the confusion matrix and classification report.
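To see why the ordering matters, here is a minimal, dependency-free sketch. It is not Keras code: `ToyGenerator` and `predict` are hypothetical stand-ins that I made up for illustration, and the toy "model" is perfect by construction, so any drop in accuracy comes purely from predictions and labels being read in different orders.

```python
import random


class ToyGenerator:
    """Hypothetical stand-in for a directory iterator (not the real Keras API)."""

    def __init__(self, labels, batch_size, shuffle, seed=0):
        self.labels = list(labels)      # labels in fixed file order
        self.batch_size = batch_size
        self.shuffle = shuffle
        self._rng = random.Random(seed)

    def batches(self):
        """Yield (x, y) batches; every call is one full pass over the data."""
        order = list(range(len(self.labels)))
        if self.shuffle:
            self._rng.shuffle(order)    # a new order on every pass
        for start in range(0, len(order), self.batch_size):
            idx = order[start:start + self.batch_size]
            # The "image" is simply its label, so a perfect model is trivial.
            yield [self.labels[i] for i in idx], [self.labels[i] for i in idx]


def predict(gen):
    """Perfect toy classifier: returns the class of every sample it sees."""
    return [x for xs, _ in gen.batches() for x in xs]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


labels = [i % 4 for i in range(968)]    # 4 balanced classes, 242 samples each

# Shuffling generator: predictions and labels come from two different passes.
shuffled = ToyGenerator(labels, batch_size=32, shuffle=True)
y_pred = predict(shuffled)                                  # pass 1
y_true = [y for _, ys in shuffled.batches() for y in ys]    # pass 2, new order
acc_two_passes = accuracy(y_true, y_pred)

# Non-shuffling generator: labels read from the attribute, in file order.
ordered = ToyGenerator(labels, batch_size=32, shuffle=False)
acc_ordered = accuracy(ordered.labels, predict(ordered))

print(f"shuffled, two passes:   {acc_two_passes:.3f}")
print(f"shuffle=False + labels: {acc_ordered:.3f}")
```

With four balanced classes the mismatched case should land near chance level, which is the same pattern as a classification report around 0.25 next to a high evaluate accuracy. Within a single pass each batch still pairs every image with its own label, which is why evaluate is unaffected.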