
Does RWeka provide Weka's "Detailed Accuracy By Class" output?


In Weka 3.8.3 (a machine learning platform), the results of an analysis using the JRip classifier are in the following form.

=== Summary ===

Correctly Classified Instances         158               25.2396 %
Incorrectly Classified Instances       468               74.7604 %
Kappa statistic                          0.0004
Mean absolute error                      0.3743
Root mean squared error                  0.4365
Relative absolute error                 99.7998 %
Root relative squared error            100.7977 %
Total Number of Instances              626    

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 0.166     0.162      0.255     0.166     0.201      0.504    A
                 0         0          0         0         0          0.464    B
                 0.006     0.009      0.2       0.006     0.012      0.526    C
                 0.829     0.829      0.252     0.829     0.387      0.499    D
Weighted Avg.    0.252     0.252      0.177     0.252     0.151      0.498

=== Confusion Matrix ===

   a   b   c   d   <-- classified as
  26   0   1 130 |   a = A
  31   0   1 123 |   b = B
  20   0   1 135 |   c = C
  25   0   2 131 |   d = D

With RWeka 0.4-40 (the R interface to Weka), the same kind of analysis produces results in the following form.

=== Summary ===

Correctly Classified Instances         203               32.4281 %
Incorrectly Classified Instances       423               67.5719 %
Kappa statistic                          0.0966
Mean absolute error                      0.3605
Root mean squared error                  0.4246
Relative absolute error                 96.1482 %
Root relative squared error             98.0552 %
Total Number of Instances              626     

=== Confusion Matrix ===

   a   b   c   d   <-- classified as
  41   0   3 113 |   a = A
  12   0   5 138 |   b = B
   7   0  23 126 |   c = C
   9   0  10 139 |   d = D

Where is the "Detailed Accuracy By Class" section (the second section of the original Weka output)? I've tried

library(RWeka)
library(caret)
TrainData <- p[,2:211]
TrainClasses <- p[,215]
jripFit <- train(TrainData,TrainClasses,method='JRip')
jripFit
summary(jripFit)
str(jripFit)

but it's nowhere to be found. Here p is my data.frame, and its 215th column holds the class labels.


Solution

  • There's no need to use the caret package.

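    library(RWeka)
    # Fit JRip directly on the predictor columns plus the class column
    jripFit <- JRip(myClass ~ ., data = p[, c(2:211, 215)])
    # class = TRUE adds the "Detailed Accuracy By Class" section to the summary
    summary(jripFit, class = TRUE)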
    

    where myClass is the name of the 215th column of p, which must be a factor (while columns 2 to 211 are numeric).

    Another point of interest: on my machine, the JRip function was 663 (!) times faster than the train function, presumably because train refits the model many times for resampling and parameter tuning.
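
If only the confusion matrix is at hand, most of the per-class figures can also be recomputed from it in base R. The following is a minimal sketch, not part of RWeka: per_class_accuracy is a hypothetical helper name, the matrix is copied from the Weka output in the question, and reporting 0 when a denominator is zero is an assumption made to mirror what that output shows for class B. ROC Area is left out because it cannot be derived from the confusion matrix alone.

```r
# Confusion matrix copied from the Weka output in the question
# (rows = actual class, columns = predicted class).
classes <- c("A", "B", "C", "D")
cm <- matrix(c(26, 0, 1, 130,
               31, 0, 1, 123,
               20, 0, 1, 135,
               25, 0, 2, 131),
             nrow = 4, byrow = TRUE,
             dimnames = list(actual = classes, predicted = classes))

# Hypothetical helper (not an RWeka function): per-class statistics
# computed one-vs-rest from a square confusion matrix.
per_class_accuracy <- function(cm) {
  tp <- diag(cm)              # correctly predicted instances of each class
  fn <- rowSums(cm) - tp      # instances of the class predicted as something else
  fp <- colSums(cm) - tp      # other instances predicted as the class
  tn <- sum(cm) - tp - fn - fp

  # Assumption: report 0 instead of NaN when a denominator is zero.
  ratio <- function(num, den) ifelse(den == 0, 0, num / den)

  recall    <- ratio(tp, tp + fn)
  fp_rate   <- ratio(fp, fp + tn)
  precision <- ratio(tp, tp + fp)
  f_measure <- ratio(2 * precision * recall, precision + recall)

  out <- cbind("TP Rate"   = recall,
               "FP Rate"   = fp_rate,
               "Precision" = precision,
               "Recall"    = recall,
               "F-Measure" = f_measure)
  rownames(out) <- rownames(cm)
  out
}

stats <- per_class_accuracy(cm)

# Average weighted by the number of actual instances in each class.
weighted <- colSums(stats * (rowSums(cm) / sum(cm)))

print(round(rbind(stats, "Weighted Avg." = weighted), 3))
```

Applied to the first confusion matrix in the question, the rounded values should agree with the corresponding columns of Weka's "Detailed Accuracy By Class" table, which makes this a quick sanity check on the output of summary(jripFit, class = TRUE).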