java, xml, json, classification, weka

Exporting Weka DecisionTree from Java API to XML or JSON


I have been tasked with enhancing an existing Weka system in Java by adding an export of the decision tree for consumption by offline components (ideally in JSON format, but XML could work too).

Let me warn you that I'm quite new to Weka :)

I haven’t found a way to get direct access to the J48’s root tree (it appears to be private within the class). Are you aware of a way to get at it? If not, the closest I’ve found is a bit hackish: use J48.toString() to dump the tree as a string, parse that back into a tree structure, and then convert that into a JSON string (yuck).

It seems that this use case is not unusual, so I'm wondering whether any of you have already solved this problem. Any direction or suggestion is appreciated.

Thanks!


Solution

  • The graph() method in ClassifierTree (which J48.graph() exposes) returns the decision tree as a String in Graphviz "dot" format.

    If we take this example, then the code

    J48 g = (J48) models[0]; 
    System.out.println(g.graph());
    

    will print:

    digraph J48Tree {
    N0 [label="outlook" ]
    N0->N1 [label="= sunny"]
    N1 [label="humidity" ]
    N1->N2 [label="<= 75"]
    N2 [label="yes (2.0)" shape=box style=filled ]
    N1->N3 [label="> 75"]
    N3 [label="no (3.0)" shape=box style=filled ]
    N0->N4 [label="= overcast"]
    N4 [label="yes (4.0)" shape=box style=filled ]
    N0->N5 [label="= rainy"]
    N5 [label="yes (4.0/1.0)" shape=box style=filled ]
    }
    

    corresponding to this tree:

    [image: rendering of the decision tree above]
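Since the DOT text above already encodes the whole tree, one possible route to JSON is to parse that text directly. The following is only a minimal sketch and not part of Weka: the class name DotToJson and the JSON field names (label, condition, children) are made up for illustration, and it assumes the layout shown above (one node or edge per line, the root declared first, labels without embedded double quotes).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: converts the DOT text printed by graph() into nested JSON. */
final class DotToJson {

    // Edge lines, e.g.  N0->N1 [label="= sunny"]
    private static final Pattern EDGE =
            Pattern.compile("(\\w+)->(\\w+)\\s*\\[label=\"(.*?)\".*\\]");
    // Node lines, e.g.  N2 [label="yes (2.0)" shape=box style=filled ]
    private static final Pattern NODE =
            Pattern.compile("(\\w+)\\s*\\[label=\"(.*?)\".*\\]");

    static String toJson(String dot) {
        Map<String, String> labels = new LinkedHashMap<>();         // node id -> label
        Map<String, List<String[]>> edges = new LinkedHashMap<>();  // parent id -> {child id, condition}
        for (String raw : dot.split("\\R")) {
            String line = raw.trim();
            Matcher e = EDGE.matcher(line);
            if (e.matches()) {
                edges.computeIfAbsent(e.group(1), k -> new ArrayList<>())
                     .add(new String[] { e.group(2), e.group(3) });
                continue;
            }
            Matcher n = NODE.matcher(line);
            if (n.matches()) {
                labels.put(n.group(1), n.group(2));
            }
        }
        if (labels.isEmpty()) {
            throw new IllegalArgumentException("no nodes found in DOT input");
        }
        // Assumption: the first declared node (N0 above) is the root.
        String rootId = labels.keySet().iterator().next();
        StringBuilder out = new StringBuilder();
        write(rootId, null, labels, edges, out);
        return out.toString();
    }

    // Emits one node, then recurses into its children in declaration order.
    private static void write(String id, String condition, Map<String, String> labels,
                              Map<String, List<String[]>> edges, StringBuilder out) {
        out.append('{');
        if (condition != null) {
            out.append("\"condition\":").append(quote(condition)).append(',');
        }
        out.append("\"label\":").append(quote(labels.getOrDefault(id, "")));
        List<String[]> kids = edges.get(id);
        if (kids != null) {
            out.append(",\"children\":[");
            for (int i = 0; i < kids.size(); i++) {
                if (i > 0) {
                    out.append(',');
                }
                write(kids.get(i)[0], kids.get(i)[1], labels, edges, out);
            }
            out.append(']');
        }
        out.append('}');
    }

    // Minimal JSON string escaping: backslashes and double quotes only.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

With the J48 instance from the snippet above, the call would be DotToJson.toJson(g.graph()). Leaves come out as objects without a children field, and each non-root node carries the branch test that leads to it in condition. This is more fragile than walking the tree objects themselves, since it depends on the exact text layout of the DOT output.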

    To answer your question,

    "I haven’t found a way to get direct access to the J48’s Root Tree (appears to be private w/in the class). Are you aware of a way to get at it?"

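    The root is held in the field m_root, which J48 declares as protected, so a subclass can expose it. You could extend J48 as shown below and declare your classifier as a MyJ48 instead of a J48: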

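    import weka.classifiers.trees.J48;
    import weka.classifiers.trees.j48.ClassifierTree;

    class MyJ48 extends J48 {

        // Expose the root of the learned tree (null until the classifier has been built).
        public ClassifierTree getGraph() {
            return m_root;
        }

    }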
    

    That lets you access the ClassifierTree through the getGraph() method. From there, you can mimic the graph() method of the ClassifierTree class (see here), which walks the tree recursively, to generate your JSON.

    I hope it helps.