
Can't visualize plotted Confusion Matrix


I am new to ML and learning the fundamentals. I am working on the Dog-vision dataset (https://www.kaggle.com/c/dog-breed-identification) and trying to plot a confusion matrix, but I can't figure out what I am doing wrong. I need help!

My true_label looks like this

true_l[:10]
array([26, 96,  8, 15,  3, 10, 62, 82, 92, 16])

And predicted_label looks like this

predicted_l[:10]
array([26, 96,  8, 15,  3, 10, 62, 82, 92, 16])

The two arrays are almost the same, but not every element matches.

Then I converted them into a pandas DataFrame with this code:

import pandas as pd
from sklearn.metrics import confusion_matrix
# Class indices 0..98 used as row and column labels
classes = list(range(99))

cf_matrix = confusion_matrix(true_l, predicted_l)
cf_matrix_df = pd.DataFrame(cf_matrix, index=classes, columns=classes)
cf_matrix_df

The output looks like this: [image]

Then I tried to plot the confusion matrix from this DataFrame, but it is not plotted correctly. Here are the code and the output of my confusion matrix:

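import matplotlib.pyplot as plt
import seaborn as sns

figure = plt.figure(figsize=(8, 8))
sns.heatmap(cf_matrix_df, annot=True, cmap=plt.cm.Blues)
plt.ylabel('True label')
plt.xlabel('Predicted label')
plt.tight_layout()  # called after the labels are set so they are included in the layout
plt.show()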

Output: [image]

If you need more information, please have a look at my notebook: https://colab.research.google.com/drive/1SoXJJNTnGx39uZHizAut-HuMtKhQQolk?usp=sharing


Solution

  • The annot=True argument writes the data value into every cell, and with this many classes the numbers overlap and hide the heatmap. Remove that argument to get a readable visualization:

    sns.heatmap(cf_matrix_df, cmap=plt.cm.Blues)
    

    UPDATE: Increasing the figure size (the figsize argument of plt.figure) will also make the visualization clearer.
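
    Putting both suggestions together, a minimal sketch might look like the following. The labels here are synthetic stand-ins for true_l and predicted_l (random data with 99 classes, which is an assumption based on the question's range(0, 99)), and the 20x20 figure size and the output file name are only illustrative choices, not tuned values.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Synthetic stand-ins for true_l and predicted_l (assumption: 99 integer classes).
rng = np.random.default_rng(0)
n_classes = 99
true_l = rng.integers(0, n_classes, size=2000)
predicted_l = true_l.copy()
wrong = rng.random(true_l.size) < 0.2  # overwrite roughly 20% of the predictions
predicted_l[wrong] = rng.integers(0, n_classes, size=int(wrong.sum()))

classes = list(range(n_classes))
# Passing labels= keeps the matrix 99x99 even if some class never appears.
cf_matrix = confusion_matrix(true_l, predicted_l, labels=classes)
cf_matrix_df = pd.DataFrame(cf_matrix, index=classes, columns=classes)

# Larger figure and no annot=True, so cells are not covered by overlapping numbers.
fig, ax = plt.subplots(figsize=(20, 20))
sns.heatmap(cf_matrix_df, cmap="Blues", ax=ax)
ax.set_ylabel("True label")
ax.set_xlabel("Predicted label")
fig.tight_layout()
fig.savefig("confusion_matrix.png")  # hypothetical file name
```

    In a Colab or Jupyter notebook the figure is displayed at the end of the cell, so the savefig call can be dropped or replaced with plt.show().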