I am trying to learn about the WEKA J48 decision tree using the cardiology-weka.arff dataset.
I ran the classifier and got the output below:
Test mode:10-fold cross-validation
=== Classifier model (full training set) ===
J48 pruned tree
------------------
thal = Rev
| chest-pain-type = Asymptomatic: Sick (79.0/7.0)
| chest-pain-type = AbnormalAngina
| | #colored-vessels = 0
| | | peak <= 0.1: Healthy (4.0)
| | | peak > 0.1: Sick (3.0/1.0)
| | #colored-vessels = 1: Sick (2.0)
| | #colored-vessels = 2: Healthy (0.0)
| | #colored-vessels = 3: Healthy (0.0)
| chest-pain-type = Angina
| | cholesterol <= 229: Healthy (3.0)
| | cholesterol > 229
| | | age <= 48: Sick (2.0)
| | | age > 48: Healthy (3.0/1.0)
| chest-pain-type = NoTang
| | slope = Flat
| | | #colored-vessels = 0
| | | | blood-pressure <= 122: Healthy (3.0)
| | | | blood-pressure > 122: Sick (3.0)
| | | #colored-vessels = 1: Sick (5.0)
| | | #colored-vessels = 2: Sick (0.0)
| | | #colored-vessels = 3: Sick (3.0/1.0)
| | slope = Up: Healthy (7.0/1.0)
| | slope = Down: Healthy (1.0)
thal = Normal
| #colored-vessels = 0: Healthy (118.0/12.0)
| #colored-vessels = 1
| | sex = Male
| | | chest-pain-type = Asymptomatic: Sick (9.0)
| | | chest-pain-type = AbnormalAngina: Sick (2.0/1.0)
| | | chest-pain-type = Angina: Healthy (3.0/1.0)
| | | chest-pain-type = NoTang: Healthy (2.0)
| | sex = Female: Healthy (13.0/1.0)
| #colored-vessels = 2
| | angina = TRUE: Sick (3.0)
| | angina = FALSE
| | | age <= 62
| | | | age <= 53: Healthy (2.0)
| | | | age > 53: Sick (4.0)
| | | age > 62: Healthy (5.0)
| #colored-vessels = 3: Sick (6.0/1.0)
thal = Fix
| #colored-vessels = 0
| | angina = TRUE: Sick (3.0/1.0)
| | angina = FALSE: Healthy (5.0)
| #colored-vessels = 1: Sick (4.0)
| #colored-vessels = 2: Sick (4.0)
| #colored-vessels = 3: Sick (2.0)
Number of Leaves : 32
Size of the tree : 49
Time taken to build model: 0.03 seconds
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances 222 73.2673 %
Incorrectly Classified Instances 81 26.7327 %
Kappa statistic 0.4601
Mean absolute error 0.3067
Root mean squared error 0.4661
Relative absolute error 61.8185 %
Root relative squared error 93.5807 %
Total Number of Instances 303
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure ROC Area Class
0.696 0.236 0.711 0.696 0.703 0.756 Sick
0.764 0.304 0.75 0.764 0.757 0.756 Healthy
Weighted Avg. 0.733 0.273 0.732 0.733 0.732 0.756
=== Confusion Matrix ===
a b <-- classified as
96 42 | a = Sick
39 126 | b = Healthy
The questions were:
So far I can only interpret the confusion matrix in terms of the correctly classified instances. Any help would be appreciated.
The topmost node is "thal", which has three distinct levels (Rev, Normal and Fix).
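Each indented line under a level is a further test, and a line ending in a class name is a leaf. In the leaf notation, "Sick (79.0/7.0)" means 79 training instances reached that leaf and 7 of them were misclassified; a single number such as "(4.0)" means no errors at that leaf. As a rough illustration only, the short "thal = Fix" branch of the printed tree can be written as nested rules, together with a small parser for the leaf counts. The function names here are made up for the example and are not part of WEKA.

```python
import re

# Matches the tail of a J48 leaf line, e.g. ": Sick (79.0/7.0)" or ": Healthy (4.0)".
LEAF = re.compile(r": (\w+) \((\d+(?:\.\d+)?)(?:/(\d+(?:\.\d+)?))?\)$")


def parse_leaf(line):
    """Return (label, instances_reaching_leaf, misclassified), or None for a non-leaf line."""
    match = LEAF.search(line.strip())
    if match is None:
        return None
    label, total, errors = match.groups()
    return label, float(total), float(errors) if errors else 0.0


def classify_thal_fix(colored_vessels, angina):
    """Hand-written copy of the 'thal = Fix' branch of the printed tree (hypothetical helper)."""
    if colored_vessels == 0:
        # angina = TRUE: Sick (3.0/1.0), angina = FALSE: Healthy (5.0)
        return "Sick" if angina else "Healthy"
    # #colored-vessels = 1, 2 or 3 all end in a Sick leaf
    return "Sick"


print(parse_leaf("| chest-pain-type = Asymptomatic: Sick (79.0/7.0)"))
print(classify_thal_fix(colored_vessels=0, angina=False))
```

The other two branches read the same way, just with more nesting.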
You can draw the tree as a diagram within WEKA by using "Visualize tree". In the result list, right-click the item that says "J48 - 20151206 10:33" (or something similar) and choose it from the menu. Try it for yourself, or search for my answer in which I have provided screenshots of how to do this.
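You can constrain the size of the tree through the pruning options in the J48 configuration dialog, for example by lowering the confidence factor or raising the minimum number of instances per leaf.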
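As for the rest of the evaluation output, the figures in the "Summary" and "Detailed Accuracy By Class" sections can be recomputed from the confusion matrix, which may help when interpreting them. The sketch below copies the four counts from the output above (rows are the actual class, columns the predicted class); the variable and function names are only for illustration.

```python
# Confusion matrix from the output: matrix[actual][predicted]
matrix = {
    "Sick": {"Sick": 96, "Healthy": 42},
    "Healthy": {"Sick": 39, "Healthy": 126},
}
total = sum(sum(row.values()) for row in matrix.values())  # 303 instances


def class_metrics(cls):
    """TP rate (recall), FP rate, precision and F-measure for one class."""
    tp = matrix[cls][cls]
    fn = sum(n for pred, n in matrix[cls].items() if pred != cls)
    fp = sum(row[cls] for actual, row in matrix.items() if actual != cls)
    tn = total - tp - fn - fp
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "tp_rate": recall,
        "fp_rate": fp / (fp + tn),
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Accuracy: the diagonal of the matrix divided by all instances.
accuracy = sum(matrix[c][c] for c in matrix) / total

# Kappa: agreement beyond what the row and column totals would give by chance.
expected = sum(
    sum(matrix[c].values()) * sum(row[c] for row in matrix.values())
    for c in matrix
) / total ** 2
kappa = (accuracy - expected) / (1 - expected)

print(f"accuracy = {accuracy:.4%}, kappa = {kappa:.4f}")
for cls in matrix:
    print(cls, {name: round(value, 3) for name, value in class_metrics(cls).items()})
```

For example, the TP rate for Sick is 96 / (96 + 42), and its precision is 96 / (96 + 39), which match the 0.696 and 0.711 in the table.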