
Calculating a Confusion Matrix from an Array of Arrays


I am using the transformers and datasets libraries to train a multi-class NLP model on a very specific dataset, and I need to see how the model performs for each label, so I'd like to calculate the confusion matrix. I have 4 labels. My result.prediction looks like

array([[ -6.906 ,  -8.11  , -10.29  ,   6.242 ],
       [ -4.51  ,   3.705 ,  -9.76  ,  -7.49  ],
       [ -6.734 ,   3.36  , -10.27  ,  -6.883 ],
       ...,
       [  8.41  ,  -9.43  ,  -9.45  ,  -8.6   ],
       [  1.3125,  -3.094 , -11.016 ,  -9.31  ],
       [ -7.152 ,  -8.5   ,  -9.13  ,   6.766 ]], dtype=float16)

Here, a positive value means the model predicts 1 for that label; otherwise it predicts 0. Next, my result.label_ids looks like

array([[0., 0., 0., 1.],
       [1., 0., 0., 0.],
       [0., 0., 0., 1.],
       ...,
       [1., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 0., 0., 1.]], dtype=float32)

As you can see, each row is an array of 4 values, with 1 for the true label and 0 for the other labels.

In general, I've been using the following function to calculate the confusion matrix, but it doesn't work in this case because it expects 1-dimensional arrays.

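import numpy as np

def compute_confusion_matrix(labels, true, pred):
  # labels: list of class labels; true, pred: 1-D sequences of integer class indices
  K = len(labels)  # Number of classes
  result = np.zeros((K, K), dtype=int)

  # Rows index the true class, columns the predicted class
  for i in range(len(true)):
    result[true[i]][pred[i]] += 1

  return result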

If possible, I'd like to modify this function to suit the case above. At the very least, I'd like to understand how to compute a confusion matrix for results that come as multi-dimensional arrays.


Solution

  • First of all I would like to thank Nicola Fanelli for the idea.

    Both the function I gave above and sklearn.metrics.confusion_matrix() expect one-dimensional sequences of true and predicted values. After the prediction step, I tried to retrieve my true and predicted values in order to calculate a confusion matrix. The results I was getting were in the following form

    array([[0., 0., 0., 1.],
           [1., 0., 0., 0.],
           [0., 0., 0., 1.],
           ...,
           [1., 0., 0., 0.],
           [1., 0., 0., 0.],
           [0., 0., 0., 1.]], dtype=float32)
    

    Here the idea is to retrieve the positional index of the value 1. When I tried the approach suggested by Nicola Fanelli, the resulting arrays were shorter than the original ones and their sizes didn't match, so the confusion matrix could not be calculated. To be honest, I couldn't find the reason behind it, but I'll investigate that more later.
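
One possible cause of such a size mismatch, offered only as an assumption since the suggested approach is not shown here: if the indices are extracted by thresholding the raw scores, a row with no positive entry contributes no index at all. The scores below are made up for illustration; the sketch compares that kind of extraction with a per-row argmax.

```python
import numpy as np

# Made-up scores for illustration: the second row has no positive entry.
logits = np.array([
    [-6.9, -8.1, -10.3, 6.2],
    [-4.5, -3.7, -9.8, -7.5],
    [-2.1, 3.4, -10.3, -6.9],
])

# Thresholding at 0 yields one index per positive entry, not one per row.
rows, cols = np.where(logits > 0)

# argmax yields exactly one class index per row.
per_row = logits.argmax(axis=1)

print(len(cols), len(per_row))
```

With data like this, the thresholded result is shorter than the number of rows, while argmax keeps one entry per row.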

    So, I used a different technique to implement the same idea: I used np.argmax() and appended the resulting positions to a new list. Here is the code sample for the true values

    true = []
    for i in range(len(result.label_ids)):
        n = np.array(result.label_ids[i])
        true.append(np.argmax(n))
    

    This way I got the results in the desired format without the sizes changing.

    Even though this works for my problem, I am still open to more elegant approaches.
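
A more compact variant of the same idea might look like the sketch below. It assumes that each row of the labels is one-hot and that the predicted class is the highest-scoring column. The helper name `confusion_from_arrays` and its `num_classes` parameter are hypothetical, and the sample rows are the ones visible in the question (the elided rows are left out).

```python
import numpy as np

def confusion_from_arrays(label_ids, predictions, num_classes):
    """Hypothetical helper: build a K x K confusion matrix from 2-D arrays."""
    true = np.asarray(label_ids).argmax(axis=1)    # one-hot rows -> class index
    pred = np.asarray(predictions).argmax(axis=1)  # score rows -> class index
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly when the same (true, pred) pair repeats.
    np.add.at(matrix, (true, pred), 1)
    return matrix

# Rows visible in the question; the elided rows are not reconstructed.
label_ids = np.array([[0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 0, 1],
                      [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]], dtype=np.float32)
predictions = np.array([[-6.906, -8.11, -10.29, 6.242],
                        [-4.51, 3.705, -9.76, -7.49],
                        [-6.734, 3.36, -10.27, -6.883],
                        [8.41, -9.43, -9.45, -8.6],
                        [1.3125, -3.094, -11.016, -9.31],
                        [-7.152, -8.5, -9.13, 6.766]], dtype=np.float16)

matrix = confusion_from_arrays(label_ids, predictions, num_classes=4)
print(matrix)
```

Rows of the result index the true class and columns the predicted class. The same 1-D `true` and `pred` arrays could also be passed to sklearn.metrics.confusion_matrix() instead of building the matrix by hand.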