Tags: python, tensorflow, machine-learning, keras

Disagreement in confusion matrix and accuracy when using data generator


I was training a model with the following code:

epochs = 100
model_history = model.fit(
    train_generator,
    epochs=epochs,
    validation_data=test_generator,
    callbacks=[model_es, model_rlr, model_mcp],
)

After training, when I evaluated the model with the following code, I got an accuracy of 98.9%:

model.evaluate(test_generator)

41/41 [==============================] - 3s 68ms/step - loss: 0.0396 - accuracy: 0.9893 [0.039571091532707214, 0.9893211126327515]

To analyse the results, I tried to obtain a confusion matrix for the test_generator using the following code:

import numpy as np
from sklearn.metrics import confusion_matrix

y_pred = model.predict(test_generator)
y_pred = np.argmax(y_pred, axis=1)
print(confusion_matrix(test_generator.classes, y_pred))

However, the output is

[[ 68  66  93  73]
 [ 64  65  93  84]
 [ 91 102 126  86]
 [ 69  75  96  60]]

which strongly disagrees with the result of model.evaluate.

Can anyone help me obtain the actual confusion matrix for the model?

[Plot: history of model accuracy]

Entire code: https://colab.research.google.com/drive/1wpoPjnSoCqVaA--N04dcUG6A5NEVcufk?usp=sharing
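For context, the bookkeeping that `confusion_matrix` performs on the predicted probabilities can be sketched in plain NumPy. The arrays below are illustrative stand-ins for `test_generator.classes` and the output of `model.predict`; they are not the actual data from the question:

```python
import numpy as np

def confusion_from_probs(y_true, probs):
    # y_true: (n,) integer class labels, in dataset order
    # probs:  (n, n_classes) predicted probabilities, one row per sample
    y_pred = probs.argmax(axis=1)      # same as np.argmax(probs, axis=1)
    n = probs.shape[1]
    cm = np.zeros((n, n), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                  # rows: true class, columns: predicted class
    return cm

# Toy example: 3 samples, 4 classes
probs = np.array([[0.9, 0.05, 0.03, 0.02],   # argmax -> class 0
                  [0.1, 0.70, 0.10, 0.10],   # argmax -> class 1
                  [0.2, 0.20, 0.50, 0.10]])  # argmax -> class 2
y_true = np.array([0, 1, 3])
cm = confusion_from_probs(y_true, probs)
print(cm)
```

The diagonal counts correct predictions, so this comparison is only meaningful when `y_true` and `probs` are in the same sample order.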


Solution

  • From your code, change:

    test_generator=train_datagen.flow_from_directory(
        locat_testing,
        class_mode='binary',
        color_mode='grayscale',
        batch_size=32,
        target_size=(img_size,img_size)
    )
    

    To include the shuffle parameter:

    test_generator=train_datagen.flow_from_directory(
        locat_testing,
        class_mode='binary',
        color_mode='grayscale',
        batch_size=32,
        target_size=(img_size,img_size),
        shuffle=False
    )
    

    By default, `flow_from_directory` shuffles the samples on every pass, so the rows returned by `model.predict` no longer line up with `test_generator.classes`, and the confusion matrix looks like random guessing. With `shuffle=False`, the two arrays stay in the same order and the confusion matrix will reflect the accuracy reported by `model.evaluate`.
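The effect of the shuffling can be reproduced without Keras at all. The sketch below assumes a hypothetical perfect model, so the predictions equal the labels; permuting the prediction order (as a shuffling generator does) drops the apparent agreement with the unshuffled label array to chance level:

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes = 4
labels = rng.integers(0, n_classes, size=1000)  # stand-in for generator.classes

# A "perfect" model: its prediction for each sample equals the true label...
preds = labels.copy()

# ...but a shuffling generator feeds samples to model.predict in a permuted
# order, so the prediction array no longer lines up with `labels`.
perm = rng.permutation(labels.size)
preds_shuffled = preds[perm]

acc_aligned = float((preds == labels).mean())            # 1.0 by construction
acc_misaligned = float((preds_shuffled == labels).mean())
print(acc_aligned, acc_misaligned)  # misaligned accuracy falls to roughly 1/n_classes
```

This is exactly why the 4x4 matrix in the question has its counts spread almost uniformly over all cells despite a 98.9% evaluation accuracy.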