Create a confusion matrix from a dataframe


I have a data frame called Conf_mat with two columns: the predicted value and the reference value for each object. The data frame contains 20 objects.

 dput(Conf_mat)
structure(list(Predicted = c(100, 200, 200, 100, 100, 200, 200, 
200, 100, 200, 500, 100, 100, 100, 100, 100, 100, 100, 500, 200
), Reference = c(600, 200, 200, 200, 200, 200, 200, 200, 500, 
500, 500, 200, 200, 200, 200, 200, 200, 200, 200, 200)), .Names = c("Predicted", 
"Reference"), row.names = c(NA, 20L), class = "data.frame")

I want to create a confusion matrix from this table with the structure below, filled in with the counts from the Conf_mat data frame. This will allow me to compute an accuracy assessment of my classification. Thanks for your help.

    100 200 300 400 500 600
100  NA  NA  NA  NA  NA  NA
200  NA  NA  NA  NA  NA  NA
300  NA  NA  NA  NA  NA  NA
400  NA  NA  NA  NA  NA  NA
500  NA  NA  NA  NA  NA  NA
600  NA  NA  NA  NA  NA  NA

Solution

  • 1) Try the following:

    table(Conf_mat)
    

    2) If you want to force levels 100, 200, ..., 600 to appear:

    conf_mat_tab <- table(lapply(Conf_mat, factor, levels = seq(100, 600, 100)))
    

    3) To get accuracy statistics as well, pass that table to confusionMatrix() from the caret package:

    library(caret)
    confusionMatrix(conf_mat_tab) # conf_mat_tab from (2)
    

    which gives:

    Confusion Matrix and Statistics
    
             Reference
    Predicted 100 200 300 400 500 600
          100   0   9   0   0   1   1
          200   0   6   0   0   1   0
          300   0   0   0   0   0   0
          400   0   0   0   0   0   0
          500   0   1   0   0   1   0
          600   0   0   0   0   0   0
    
    Overall Statistics
    
                   Accuracy : 0.35            
                     95% CI : (0.1539, 0.5922)
        No Information Rate : 0.8             
        P-Value [Acc > NIR] : 1               
    
                      Kappa : 0.078           
     Mcnemar's Test P-Value : NA              
    
    Statistics by Class:
    
                         Class: 100 Class: 200 Class: 300 Class: 400 Class: 500 Class: 600
    Sensitivity                  NA     0.3750         NA         NA     0.3333       0.00
    Specificity                0.45     0.7500          1          1     0.9412       1.00
    Pos Pred Value               NA     0.8571         NA         NA     0.5000        NaN
    Neg Pred Value               NA     0.2308         NA         NA     0.8889       0.95
    Prevalence                 0.00     0.8000          0          0     0.1500       0.05
    Detection Rate             0.00     0.3000          0          0     0.0500       0.00
    Detection Prevalence       0.55     0.3500          0          0     0.1000       0.00
    Balanced Accuracy            NA     0.5625         NA         NA     0.6373       0.50
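
As a cross-check that does not need caret, the overall accuracy and Cohen's kappa can be computed directly from the table in base R. This is only a minimal sketch: the data frame is rebuilt from the dput() output in the question, and the names n, accuracy, expected and kappa_hat are illustrative, not part of the original answer.

```r
# Rebuild the example data from the question's dput() output
Conf_mat <- data.frame(
  Predicted = c(100, 200, 200, 100, 100, 200, 200, 200, 100, 200,
                500, 100, 100, 100, 100, 100, 100, 100, 500, 200),
  Reference = c(600, 200, 200, 200, 200, 200, 200, 200, 500, 500,
                500, 200, 200, 200, 200, 200, 200, 200, 200, 200)
)

# Same construction as in (2): force all six class levels to appear
conf_mat_tab <- table(lapply(Conf_mat, factor, levels = seq(100, 600, 100)))

# Overall accuracy: correctly classified objects (the diagonal) over all objects
n <- sum(conf_mat_tab)
accuracy <- sum(diag(conf_mat_tab)) / n

# Cohen's kappa: observed agreement corrected for the agreement
# expected by chance, estimated from the row and column totals
expected <- sum(rowSums(conf_mat_tab) * colSums(conf_mat_tab)) / n^2
kappa_hat <- (accuracy - expected) / (1 - expected)

round(c(accuracy = accuracy, kappa = kappa_hat), 3)
```

For this data the two values come out as 0.35 and about 0.078, which agrees with the Accuracy and Kappa lines in the caret output above.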