python, machine-learning, confusion-matrix

Constructing a confusion matrix from data without sklearn


I am trying to construct a confusion matrix without using the sklearn library. I am having trouble correctly forming the confusion matrix. Here's my code:

import numpy as np

def comp_confmat():
    currentDataClass = [1,3,3,2,5,5,3,2,1,4,3,2,1,1,2]    
    predictedClass = [1,2,3,4,2,3,3,2,1,2,3,1,5,1,1]
    cm = []
    classes = int(max(currentDataClass) - min(currentDataClass)) + 1 #find number of classes

    for c1 in range(1,classes+1):#for every true class
        counts = []
        for c2 in range(1,classes+1):#for every predicted class
            count = 0
            for p in range(len(currentDataClass)):
                if currentDataClass[p] == predictedClass[p]:
                    count += 1
            counts.append(count)
        cm.append(counts)
    print(np.reshape(cm,(classes,classes)))

However this returns:

[[7 7 7 7 7]
[7 7 7 7 7]
[7 7 7 7 7]
[7 7 7 7 7]
[7 7 7 7 7]]

I don't understand why every iteration results in 7 when I am resetting the count each time and the loops run through different values.

This is what I should be getting (using sklearn's confusion_matrix function):

[[3 0 0 0 1]
[2 1 0 1 0]
[0 1 3 0 0]
[0 1 0 0 0]
[0 1 1 0 0]]

Solution

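  • Your innermost loop never uses c1 or c2. It counts every position where the true and predicted labels agree, which is 7 for this data, so all 25 cells get the same value. A position should only count toward cell (c1, c2) when currentDataClass[p] == c1 and predictedClass[p] == c2.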

    Here's another way, using nested list comprehensions:

    currentDataClass = [1,3,3,2,5,5,3,2,1,4,3,2,1,1,2]    
    predictedClass = [1,2,3,4,2,3,3,2,1,2,3,1,5,1,1]
    
    classes = int(max(currentDataClass) - min(currentDataClass)) + 1 #find number of classes
    
    counts = [[sum([(currentDataClass[i] == true_class) and (predictedClass[i] == pred_class) 
                    for i in range(len(currentDataClass))])
               for pred_class in range(1, classes + 1)] 
               for true_class in range(1, classes + 1)]
    counts    
    
    [[3, 0, 0, 0, 1],
     [2, 1, 0, 1, 0],
     [0, 1, 3, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 1, 1, 0, 0]]
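
For comparison, here is a minimal sketch of the original loop version with only the condition in the innermost loop changed. Taking the two lists as parameters, returning the matrix instead of printing it, and deriving the label range from both lists are my own choices rather than part of the question. It still assumes the labels are consecutive integers.

```python
def comp_confmat(actual, predicted):
    """Return a confusion matrix as a list of rows.

    Rows are true classes, columns are predicted classes.
    Assumption: labels are consecutive integers.
    """
    # Cover labels that appear in either list, not only the true labels.
    lowest = min(min(actual), min(predicted))
    highest = max(max(actual), max(predicted))
    labels = range(lowest, highest + 1)

    cm = []
    for c1 in labels:  # for every true class
        counts = []
        for c2 in labels:  # for every predicted class
            count = 0
            for p in range(len(actual)):
                # Count only positions that belong to cell (c1, c2).
                if actual[p] == c1 and predicted[p] == c2:
                    count += 1
            counts.append(count)
        cm.append(counts)
    return cm


currentDataClass = [1, 3, 3, 2, 5, 5, 3, 2, 1, 4, 3, 2, 1, 1, 2]
predictedClass = [1, 2, 3, 4, 2, 3, 3, 2, 1, 2, 3, 1, 5, 1, 1]

cm = comp_confmat(currentDataClass, predictedClass)
for row in cm:
    print(row)
```

For the data in the question this produces the same rows as the list comprehension above. The diagonal sums to 7, which is the agreement count the original code wrote into every cell.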