weka, decision-tree, j48

Same decision tree, different results


I am working on a machine learning application and use Weka to test and compare classification algorithms. After testing in Weka, I decided to use the J48 decision tree. I parsed the pruned tree that Weka produced and implemented it as if-then rules in C. However, when I run my program on the same data I used as input for Weka, the results do not match Weka's confusion matrix. In Weka's test options I selected "Use training set", and I used the resulting decision tree. Here are the confusion matrix and my results:

=== Confusion Matrix ===

    a    b    c    d    e    f    g   <-- classified as
  178    1    0    1   13    2    7 |    a = InstantMessaging
    4   29   11    1   14   46   25 |    b = Mail
    1    3 1051    4   32  921   54 |    c = Music
    4    0   14 9596   10    4   10 |    d = P2P
   10    1   46    6  607  263   59 |    e = SocialMedia
    4    1  230    2   44 7619   63 |    f = VideoStream
    5    0   57    1   57  167 1016 |    g = WebBrowsing

Results from my program:

 "instantMessaging" => 210,
             "mail" => 33,
            "music" => 4933,
              "p2p" => 9886,
      "socialMedia" => 1220,
      "videoStream" => 4958,
      "webBrowsing" => 1054,
            "total" => 22294,

Although everything is the same (decision tree, data, feature values, functions, etc.), why do I get different results? Is it possible that Weka produces or displays an incorrect decision tree?
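
One way to line the two outputs up, assuming the per-class numbers from the program are counts of predicted labels (they add up to the same 22294 instances as the matrix): Weka's confusion matrix has actual classes in rows and predicted classes in columns, so the comparable figures are the column sums. A minimal C sketch follows; the names `weka_confusion`, `predicted_count`, and `total_instances` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CLASSES 7

/* Confusion matrix copied from the Weka output above.
 * Rows are the actual class, columns are the class Weka predicted.
 * Order: InstantMessaging, Mail, Music, P2P, SocialMedia,
 *        VideoStream, WebBrowsing. */
static const long weka_confusion[NUM_CLASSES][NUM_CLASSES] = {
    {178, 1,    0,    1,  13,    2,    7},
    {  4, 29,  11,    1,  14,   46,   25},
    {  1,  3, 1051,   4,  32,  921,   54},
    {  4,  0,  14, 9596,  10,    4,   10},
    { 10,  1,  46,    6, 607,  263,   59},
    {  4,  1, 230,    2,  44, 7619,   63},
    {  5,  0,  57,    1,  57,  167, 1016},
};

/* Sum one column: how many instances were assigned to class `col`. */
long predicted_count(const long m[NUM_CLASSES][NUM_CLASSES], size_t col)
{
    assert(col < NUM_CLASSES);
    long sum = 0;
    for (size_t row = 0; row < NUM_CLASSES; row++)
        sum += m[row][col];
    return sum;
}

/* Sum every cell: total number of classified instances. */
long total_instances(const long m[NUM_CLASSES][NUM_CLASSES])
{
    long sum = 0;
    for (size_t col = 0; col < NUM_CLASSES; col++)
        sum += predicted_count(m, col);
    return sum;
}
```

Under that assumption, Weka's predicted counts are 206, 35, 1409, 9611, 777, 9022, and 1234, against 210, 33, 4933, 9886, 1220, 4958, and 1054 from the program. The largest gaps are in music and videoStream.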


Solution

  • After digging deeper, I found the answer. The problem was caused by a change to one of the functions that computes a feature. Because that function had changed, the feature values my program computed no longer matched the values in the ARFF file. All results make sense now.
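
A mismatch like this can be caught by comparing, instance by instance, the feature values the program computes against the values stored in the ARFF file. The following is a minimal sketch, not the code from the original program: the function name `first_feature_mismatch`, the use of `double` for every feature, and the tolerance parameter are all assumptions for illustration, and parsing the ARFF file is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Compare one instance's feature vector as stored in the ARFF file
 * (arff_row) with the vector recomputed by the program (computed_row).
 * Returns the index of the first feature that differs by more than
 * `tol`, or -1 if all `n` features agree.
 * Hypothetical helper: missing values ("?" in ARFF) would need
 * separate handling before calling this. */
int first_feature_mismatch(const double *arff_row,
                           const double *computed_row,
                           size_t n, double tol)
{
    assert(arff_row != NULL && computed_row != NULL);
    for (size_t i = 0; i < n; i++) {
        double diff = arff_row[i] - computed_row[i];
        if (diff < 0)
            diff = -diff;          /* absolute difference */
        if (diff > tol)
            return (int)i;         /* first diverging feature */
    }
    return -1;
}
```

Running such a check over every instance before classifying would point directly at the feature whose function had changed, instead of leaving the discrepancy to show up only in the final class counts.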