
Calculating True/False Positive and True/False Negative Values from Matrix in R


For subsequent discussion, I am using the keras package in R.

Given a confusion matrix created as follows:

# Get confusion matrix for predictions
classes <- model %>% predict_classes(test, batch_size=128)
ct <- table(test.target, classes)
cm <- as.matrix(ct)

Printing ct gives the following confusion matrix:

           classes
test.target   0   1   2
          0 805 192   0
          1  74 862   0
          2   2   0 477

How can I calculate the True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN) values?

For clarification, I calculate the True Positive (TP) value by getting the diagonal of the matrix:

tp <- diag(cm)

However, my attempt at calculating the FP values gives me negative numbers (which I guess can't be right, correct?):

# Get false positive counts (FP)
fp <- c()
for(i in seq_len(ncol(ct))) {
  fp <- append(fp, sum(cm[,i])-cm[i,i])
}

EDIT: The dput(cm) is as follows:

structure(c(805L, 74L, 2L, 192L, 862L, 0L, 0L, 0L, 477L), .Dim = c(3L, 
3L), .Dimnames = list(test.target = c("0", "1", "2"), classes = c("0", 
"1", "2")), class = "table")

Solution

  • This issue has been dealt with several times on Stack Overflow (e.g. here and here and here), but as far as I could find never in the context of R, so I think it's safe not to count it as a duplicate.

    The true positives are, as you state, the diagonal elements. The false positives that you had trouble with are as follows: false positives for class i are the sum of cells in column i but not row i.

    False negatives are defined analogously: false negatives for class i are the sum of cells in row i but not column i.

    Then the true negatives for class i are the sum of all cells that are in neither row i nor column i.
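
    To make these definitions concrete, here is a small sketch that applies them literally to a single class using negative indexing. It rebuilds the matrix from the dput output in the question; the names i, tp_i, fp_i, fn_i and tn_i are illustrative and not part of the original post.

    ```r
    # Confusion matrix from the question
    # (rows = test.target, i.e. actual; columns = classes, i.e. predicted)
    cm <- structure(c(805L, 74L, 2L, 192L, 862L, 0L, 0L, 0L, 477L),
                    .Dim = c(3L, 3L),
                    .Dimnames = list(test.target = c("0", "1", "2"),
                                     classes = c("0", "1", "2")),
                    class = "table")

    i <- 1  # position of the class of interest; position 1 is class "0"

    tp_i <- cm[i, i]         # diagonal cell                        -> 805
    fp_i <- sum(cm[-i, i])   # column i, excluding row i            -> 76
    fn_i <- sum(cm[i, -i])   # row i, excluding column i            -> 192
    tn_i <- sum(cm[-i, -i])  # everything outside row i and col i   -> 1339
    ```

    The same counts can be obtained for all classes at once with row and column sums.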

    We can calculate as follows:

    true_positives  <- diag(cm)
    true_positives
    #>   0   1   2
    #> 805 862 477
    false_positives <- colSums(cm) - true_positives
    false_positives
    #>   0   1   2
    #>  76 192   0
    false_negatives <- rowSums(cm) - true_positives
    false_negatives
    #>   0   1   2
    #> 192  74   2
    true_negatives  <- sum(cm) - true_positives - false_positives - false_negatives
    true_negatives
    #>    0    1    2
    #> 1339 1284 1933
    

    You could even make a function to reuse for later:

    # Per-class TP/FP/TN/FN counts from a square confusion matrix
    # (rows = actual classes, columns = predicted classes)
    multi_class_rates <- function(confusion_matrix) {
        true_positives  <- diag(confusion_matrix)
        false_positives <- colSums(confusion_matrix) - true_positives
        false_negatives <- rowSums(confusion_matrix) - true_positives
        true_negatives  <- sum(confusion_matrix) - true_positives -
            false_positives - false_negatives
        return(data.frame(true_positives, false_positives, true_negatives,
                          false_negatives, row.names = names(true_positives)))
    }

    multi_class_rates(cm)
    #>   true_positives false_positives true_negatives false_negatives
    #> 0            805              76           1339             192
    #> 1            862             192           1284              74
    #> 2            477               0           1933               2
    

    (You might want to store the class as a regular column of the data frame rather than as row names.)
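
    A variant that keeps the class label in an ordinary column might look like the sketch below. The function name multi_class_counts and the column name class are illustrative choices, not part of the original answer. The sketch assumes a square matrix with actual classes in rows and predicted classes in columns, in the same order.

    ```r
    # Confusion matrix from the question (rows = actual, columns = predicted)
    cm <- structure(c(805L, 74L, 2L, 192L, 862L, 0L, 0L, 0L, 477L),
                    .Dim = c(3L, 3L),
                    .Dimnames = list(test.target = c("0", "1", "2"),
                                     classes = c("0", "1", "2")),
                    class = "table")

    # Hypothetical helper: same arithmetic as above, but the class label
    # is returned as a column instead of as row names.
    multi_class_counts <- function(confusion_matrix) {
        stopifnot(nrow(confusion_matrix) == ncol(confusion_matrix))
        tp <- unname(diag(confusion_matrix))
        fp <- unname(colSums(confusion_matrix)) - tp
        fn <- unname(rowSums(confusion_matrix)) - tp
        tn <- sum(confusion_matrix) - tp - fp - fn
        data.frame(class           = rownames(confusion_matrix),
                   true_positives  = tp,
                   false_positives = fp,
                   true_negatives  = tn,
                   false_negatives = fn,
                   row.names       = NULL)
    }

    counts <- multi_class_counts(cm)
    ```

    For the matrix in the question, counts holds one row per class with the same four counts as the table above, which makes it easier to filter or join on the class column later.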