How to read the classifier confusion matrix in WEKA


Sorry, I am new to WEKA and just learning.

In my decision tree (J48) classifier output, there is a confusion matrix:

  a    b   <----- classified as
130    8     a = functional
 15  150     b = non-functional

  • How do I read this matrix? What's the difference between a & b?
  • Also, can anyone explain to me what domain values are?

Solution

  • I'd put it this way:

    The confusion matrix is Weka's report of how well this J48 model performs: what it gets right and what it gets wrong.

    In your data, the target variable was either "functional" or "non-functional". The legend on the right side of the matrix tells you that "a" stands for functional and "b" stands for non-functional.

    The columns tell you how your model classified your samples - it's what the model predicted:

    • The first column contains all the samples which your model thinks are "a" - 145 of them, total
    • The second column contains all the samples which your model thinks are "b" - 158 of them

    The rows, on the other hand, represent reality:

    • The first row contains all the samples which really are "a" - 138 of them, total
    • The second row contains all the samples which really are "b" - 165 of them
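
If you want to check those totals yourself, here is a minimal Python sketch. It is not Weka code; the numbers are simply typed in from the matrix in the question, and the variable names are made up for illustration.

```python
# Confusion matrix copied by hand from the question.
# Rows = what the samples really are, columns = what the model predicted.
matrix = [
    [130, 8],    # really a (functional)
    [15, 150],   # really b (non-functional)
]

# Row totals: how many samples really belong to each class.
row_totals = [sum(row) for row in matrix]

# Column totals: how many samples the model assigned to each class.
col_totals = [sum(col) for col in zip(*matrix)]

print("really a / really b:", row_totals)        # [138, 165]
print("predicted a / predicted b:", col_totals)  # [145, 158]
```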

    Knowing the columns and rows, you can dig into the details:

    • Top left, 130, are things your model thinks are "a" which really are "a" <- these were correct
    • Bottom left, 15, are things your model thinks are "a" but which are really "b" <- one kind of error
    • Top right, 8, are things your model thinks are "b" but which really are "a" <- another kind of error
    • Bottom right, 150, are things your model thinks are "b" which really are "b" <- these were also correct

    So top-left and bottom-right of the matrix are showing things your model gets right.

    Bottom-left and top-right of the matrix are showing where your model is confused.
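
To turn that into a single number, you can add up the diagonal and divide by the total. The sketch below does this in plain Python with the values from the question typed in by hand; it is only an illustration of the arithmetic, not how Weka computes its summary internally.

```python
# Confusion matrix copied by hand from the question
# (rows = actual class, columns = predicted class).
matrix = [
    [130, 8],
    [15, 150],
]

# Every sample in the matrix.
total = sum(sum(row) for row in matrix)

# The diagonal (top-left, bottom-right) holds the correct predictions.
correct = sum(matrix[i][i] for i in range(len(matrix)))

# Everything off the diagonal is a mistake.
wrong = total - correct

accuracy = correct / total

print(f"{correct} correct, {wrong} wrong, accuracy {accuracy:.2%}")
# prints: 280 correct, 23 wrong, accuracy 92.41%
```

The 23 mistakes are the 15 from the bottom-left cell plus the 8 from the top-right cell.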