Tags: graphviz, lightgbm

plot_tree() fails when I use LightGBM


Calling plot_tree() fails when I use LightGBM.

I have uploaded the traceback as an image because I cannot download my code from the server.

How can I solve this problem?

[Image: screenshot of the traceback]


Solution

  • I've faced a similar issue in the past. If your aim is to investigate which features each tree splits on, and at which values, you can use an alternative approach:

    After instantiating a LightGBM classifier and fitting it to the training data, you can call the trees_to_dataframe() method on the classifier's booster. This returns the tree structure in a relatively easy-to-read format. It doesn't create a plot, but the resulting dataframe can be used to investigate each tree in more detail.

    Here is an example using the load_breast_cancer dataset from sklearn:

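    import pandas as pd
    from lightgbm import LGBMClassifier
    from sklearn.datasets import load_breast_cancer
    
    # Load the data into a DataFrame so the trees report feature names
    data = load_breast_cancer()
    X, y = data['data'], data['target']
    X = pd.DataFrame(X, columns=data['feature_names'])
    
    clf = LGBMClassifier()
    clf.fit(X, y)
    
    # One row per node (splits and leaves) across all fitted trees
    clf.booster_.trees_to_dataframe()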
    

    Output:

          tree_index  node_depth  node_index  left_child  right_child  parent_index  split_feature         split_gain    threshold  decision_type  missing_direction  missing_type  value      weight     count
    0     0           1           0-S0        0-S1        0-S3         None          worst_area            3.925050e+02  868.2000   <=             left               None          0.521150   0.000000   569
    1     0           2           0-S1        0-S4        0-S2         0-S0          worst_concave_points  6.169360e+01  0.1358     <=             left               None          0.641339   89.298200  382
    2     0           3           0-S4        0-S5        0-L5         0-S1          area_error            1.586540e+00  34.7300    <=             left               None          0.672849   78.077500  334
    3     0           4           0-S5        0-S8        0-S6         0-S4          worst_texture         7.523920e-01  30.0800    <=             left               None          0.676446   73.402200  314
    4     0           5           0-S8        0-L0        0-L9         0-S5          mean_radius           2.842170e-14  13.4950    <=             left               None          0.680533   63.116600  270
    ...   ...         ...         ...         ...         ...          ...           ...                   ...           ...        ...            ...                ...           ...        ...        ...
    5589  99          4           99-S24      99-L18      99-S27       99-S17        mean_texture          2.184820e-09  15.0450    <=             left               None          -0.100054  0.013308   123
    5590  99          5           99-L18      None        None         99-S24        None                  NaN           NaN        None           None               None          -0.100125  0.003284   3
    5591  99          5           99-S27      99-L25      99-L28       99-S24        worst_radius          6.243950e-10  16.3200    <=             left               None          -0.100031  0.010024   120
    5592  99          6           99-L25      None        None         99-S27        None                  NaN           NaN        None           None               None          -0.100063  0.003757   9
    5593  99          6           99-L28      None        None         99-S27        None                  NaN           NaN        None           None               None          -0.100012  0.006267   111
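
Since the dataframe has one row per node, a bit of pandas is enough to see which features the trees split on. The following is only a sketch: summarize_splits is a hypothetical helper name, not part of LightGBM, and the small sample frame just copies a few rows and columns from the output above so the snippet runs without a fitted model.

```python
import pandas as pd


def summarize_splits(trees_df: pd.DataFrame) -> pd.DataFrame:
    """Count splits and sum split gain per feature (hypothetical helper)."""
    # Leaf rows have no split_feature, so keep only internal (split) nodes.
    splits = trees_df[trees_df["split_feature"].notna()]
    return (
        splits.groupby("split_feature")["split_gain"]
        .agg(n_splits="count", total_gain="sum")
        .sort_values("total_gain", ascending=False)
    )


# A few rows copied from the output above, reduced to the columns used here.
sample = pd.DataFrame(
    {
        "tree_index": [0, 0, 99, 99],
        "node_index": ["0-S0", "0-S1", "99-S27", "99-L25"],
        "split_feature": ["worst_area", "worst_concave_points", "worst_radius", None],
        "split_gain": [3.925050e+02, 6.169360e+01, 6.243950e-10, float("nan")],
        "threshold": [868.2, 0.1358, 16.32, float("nan")],
    }
)

# Per-feature summary, largest total gain first.
summary = summarize_splits(sample)
print(summary)

# Split nodes of a single tree, here the first one (tree_index == 0).
first_tree = sample[(sample["tree_index"] == 0) & sample["split_feature"].notna()]
print(first_tree[["node_index", "split_feature", "threshold"]])
```

With a fitted model, the same helper should work on the full frame, e.g. summarize_splits(clf.booster_.trees_to_dataframe()).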