python conv-neural-network multiclass-classification

MultiClass Classification Confusion Matrix

I'm building a CNN classification model with classes = [Pneumonia, Healthy, TB]. I have already written the code to build the model and it went pretty well, but the confusion matrix looks a little weird to me. I got a pretty good result on the testing data, with an accuracy of around 85%. I use this code to make the confusion matrix:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.metrics import confusion_matrix

def confussion_matrix(test_true, test_pred, test_class):
    # Rows are the true classes, columns are the predicted classes
    cm = confusion_matrix(test_true, test_pred)
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=test_class)
    fig, ax = plt.subplots(figsize=(15,15))
    disp.plot(ax=ax,cmap=plt.cm.Blues)
    plt.show()

# Class probabilities for every test image, in the order the generator yields them
testing_pred_raw = model.predict(testing_generator)
# Index of the most probable class for each image
testing_pred = np.argmax(testing_pred_raw, axis=1)
# True labels as stored by the generator
testing_true = testing_generator.classes
testing_class = ['Pneumonia', 'Sehat', 'TB']
confussion_matrix(testing_true, testing_pred, testing_class)

Note: Sehat = Healthy

I got the confusion matrix, but the spread is pretty weird (it does not even look like 85% accuracy). Here's the result:

[image: confusion matrix plot]

Is this result correct (maybe I am reading the confusion matrix incorrectly), or is there something in my code that should be modified?

So far I have only tried the code shown above.
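One way to check whether a confusion matrix agrees with a reported accuracy is to compute the accuracy from the matrix itself: the correct predictions sit on the diagonal, so accuracy is the diagonal sum divided by the total count. The snippet below is only a sketch; the matrix `cm` holds made-up counts for illustration, not the values from this question.

```python
import numpy as np

# Hypothetical 3x3 confusion matrix with made-up counts.
# Rows are the true classes, columns are the predicted classes
# (the same layout sklearn's confusion_matrix returns).
cm = np.array([
    [50,  3,  2],
    [ 4, 45,  1],
    [ 2,  3, 40],
])

# Overall accuracy: correct predictions (diagonal) over all predictions
accuracy = np.trace(cm) / cm.sum()

# Recall per class: correct predictions for a class over all true samples of it
recall_per_class = np.diag(cm) / cm.sum(axis=1)

print(f"accuracy = {accuracy:.3f}")
print("recall per class =", np.round(recall_per_class, 3))
```

If the accuracy computed this way is far below what `model.evaluate` reports on the same data, the predictions and the true labels passed to the matrix are probably not in the same order.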

Solution

So it seems the problem is not in the data or the CNN model. My problem can be fixed by just setting shuffle to False on my test and validation generators.

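# Import path assumed (not shown in the original post)
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def data_generators(TRAINING_DIR, VALIDATION_DIR, TESTING_DIR):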

  training_datagen = ImageDataGenerator(rescale = 1./255,
                                        horizontal_flip = True)

  training_generator = training_datagen.flow_from_directory(TRAINING_DIR,
                                                    batch_size = 32,
                                                    class_mode = 'categorical',
                                                    target_size = (128, 128))
  
  validation_datagen = ImageDataGenerator(rescale = 1./255,
                                          horizontal_flip = True)
  
  validation_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                    batch_size = 32,
                                                    class_mode = 'categorical',
                                                    shuffle = False,
                                                    target_size = (128, 128))
  
  testing_datagen = ImageDataGenerator(rescale = 1./255)
    
  testing_generator = testing_datagen.flow_from_directory(TESTING_DIR,
                                                    batch_size = 32,
                                                    class_mode = 'categorical',
                                                    shuffle = False,
                                                    target_size = (128,128))

  return training_generator, validation_generator, testing_generator

But I don't know why the prediction is still good if I set shuffle = True on the validation data. The important part is setting shuffle = False on the testing data.
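A likely explanation: `generator.classes` always lists the labels in directory order, while a shuffled generator feeds the images to `model.predict` in a random order, so the two arrays no longer line up. During `fit` and `evaluate`, Keras uses the labels that arrive with each batch, which is why validation metrics stay correct even with shuffling. The sketch below simulates this with plain NumPy. `perfect_model`, `classes` and `feed_order` are hypothetical stand-ins, not Keras objects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical test set: 300 samples, 3 classes, listed in directory order
# (all of class 0, then class 1, then class 2), like `generator.classes`.
classes = np.repeat([0, 1, 2], 100)

def perfect_model(labels_in_feed_order):
    """Stand-in for a model that is always right about whatever it is fed."""
    return labels_in_feed_order.copy()

# shuffle=False: samples are fed in directory order, so everything lines up
pred_ordered = perfect_model(classes)
acc_ordered = np.mean(pred_ordered == classes)

# shuffle=True: samples are fed in a random order, but `classes` does not change
feed_order = rng.permutation(len(classes))
pred_shuffled = perfect_model(classes[feed_order])

# Comparing against directory-order labels: misaligned, looks close to random
acc_vs_classes = np.mean(pred_shuffled == classes)

# Comparing against the labels that travelled with the batches: still correct
acc_vs_batch_labels = np.mean(pred_shuffled == classes[feed_order])

print(acc_ordered, acc_vs_classes, acc_vs_batch_labels)
```

Even a model that never makes a mistake appears to perform near chance level when its shuffled predictions are compared against directory-order labels. That matches a confusion matrix that looks much worse than the reported accuracy.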