
How can I compare two True/False columns and get a confusion matrix in Python?


I have two sets of True/False results from tests run on the same column. Test1 has the wrong results and Test2 has the correct results. Is there Python code that can compare the two and produce a confusion matrix (true positives, false positives, false negatives, and true negatives)?

For example:

Test1
a  True
b  True
c  False
d  False
e  True
f  True
g  True



Test2
a  True
b  True
c  True
d  True
e  True
f  True
g  False

Solution

  • You can do this with NumPy.

    I will ignore the fact that the tests have letter labels and just use arrays instead:

    import numpy as np

    # Data from the question: Test1 holds the responses being checked,
    # Test2 holds the correct results (ground truth), rows a to g.
    responses = np.array([True, True, False, False, True, True, True])
    ground_truth = np.array([True, True, True, True, True, True, False])

    # Element-wise boolean masks marking which rows fall into each category
    true_positives = np.logical_and(responses, ground_truth)
    true_negatives = np.logical_and(np.logical_not(responses), np.logical_not(ground_truth))
    false_positives = np.logical_and(responses, np.logical_not(ground_truth))
    false_negatives = np.logical_and(np.logical_not(responses), ground_truth)

    # Count the True entries in each mask
    num_true_positives = np.count_nonzero(true_positives)
    num_true_negatives = np.count_nonzero(true_negatives)
    num_false_positives = np.count_nonzero(false_positives)
    num_false_negatives = np.count_nonzero(false_negatives)

    # Layout used here: [[TP, FP], [TN, FN]]
    confusion_matrix = np.array([
        [num_true_positives, num_false_positives],
        [num_true_negatives, num_false_negatives]
    ])
    

    I'm not sure that's the conventional layout for a confusion matrix (scikit-learn, for example, returns [[TN, FP], [FN, TP]] for the labels False, True), but you can rearrange it in your own code.

    P.S.:

    You can also use sklearn: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html
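
For reference, the sklearn route might look roughly like the sketch below. It assumes Test1 holds the results being evaluated and Test2 holds the correct answers, as described in the question; the variable names test1 and test2 are only illustrative.

```python
from sklearn.metrics import confusion_matrix

# Data from the question, rows a to g.
# Assumption: Test1 is the prediction under evaluation, Test2 is the ground truth.
test1 = [True, True, False, False, True, True, True]
test2 = [True, True, True, True, True, True, False]

# scikit-learn expects (y_true, y_pred). Passing labels explicitly fixes the
# row/column order to False, True and guarantees a 2x2 result even if one
# class happens to be missing from the data.
matrix = confusion_matrix(test2, test1, labels=[False, True])

# Rows are actual values, columns are predicted values,
# so flattening the matrix yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = matrix.ravel()

print(matrix)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
```

For the question's data this gives 4 true positives, 1 false positive, 2 false negatives, and 0 true negatives. If the two tests are stored as pandas columns rather than lists, the same call should work with the columns passed in directly, provided their rows are aligned.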