python random-forest graphviz decision-tree

Export / Plot Random Forest Decision Tree / 'RandomForestClassifier' object has no attribute 'tree_'


Good evening, everyone.

The goal of this post is to plot a single decision tree from a fitted random forest. Every approach I have tried ends with the same error: 'RandomForestClassifier' object has no attribute 'tree_'

I would really appreciate any help, code examples, ideas, or links that would help me solve this.

The following code is how I was able to plot a regular decision tree:

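import graphviz
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

# X_train, Y_train and df come from earlier preprocessing (not shown)
clf_SMOTE1 = DecisionTreeClassifier(criterion='entropy', max_depth=4, min_samples_leaf=7)
clf_SMOTE1 = clf_SMOTE1.fit(X_train, Y_train)

a = df.columns[6:]  # feature names
dot_data = tree.export_graphviz(clf_SMOTE1, out_file=None, filled=False, feature_names=a)
graphviz.Source(dot_data)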

Below are the attempts I have tried without success.

Option 1:

from sklearn.ensemble import RandomForestClassifier

clf_SMOTE2 = RandomForestClassifier(criterion='entropy', bootstrap=True, max_depth=4, min_samples_leaf=7)
clf_SMOTE2 = clf_SMOTE2.fit(X_train, Y_train)

a = df.columns[6:]
# This is the call that raises the error quoted above
dot_data_2 = tree.export_graphviz(clf_SMOTE2, out_file=None, feature_names=a, precision=2, filled=False)
graphviz.Source(dot_data_2)

Option 2:

clf_SMOTE2 = RandomForestClassifier(criterion='entropy', bootstrap=True, max_depth=4, min_samples_leaf=7)
clf_SMOTE2 = clf_SMOTE2.fit(X_train, Y_train)

a = df.columns[6:]
dot_data_2 = tree.plot_tree(clf_SMOTE2, model.estimators_[5], feature_names=a)
graphviz.Source(dot_data_2)

Solution

  • From the help page:

    A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset

    So you cannot apply export_graphviz to a RandomForestClassifier object directly. Instead, access one of the fitted decision trees stored in its estimators_ attribute:

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.datasets import make_classification
    from sklearn.tree import export_graphviz
    import pandas as pd

    # Toy data with named columns, standing in for the real training set
    X, y = make_classification()
    X = pd.DataFrame(X, columns=['col' + str(i) for i in range(X.shape[1])])

    clf = RandomForestClassifier(criterion='entropy', bootstrap=True,
                                 max_depth=4, min_samples_leaf=7)
    clf = clf.fit(X, y)
    

    Here we access just the first of the forest's 100 decision trees (100 is the default value of n_estimators):

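    import graphviz

    len(clf.estimators_)  # 100

    # Export a single tree of the forest; a plain list of feature names
    # works regardless of the scikit-learn version
    dot_data_2 = export_graphviz(clf.estimators_[0], out_file=None,
                                 feature_names=list(X.columns), precision=2, filled=False)
    graphviz.Source(dot_data_2)  # renders the tree, as in the question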
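
Option 2 from the question can be fixed the same way: plot_tree also expects a single decision tree as its first argument, so it should receive one element of estimators_ rather than the forest. The following is a minimal sketch, not code from the original answer. It assumes matplotlib is installed, uses synthetic data from make_classification in place of the asker's X_train and Y_train, and the output file name forest_tree_5.png is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import plot_tree

# Synthetic data standing in for the asker's X_train / Y_train (assumption).
X, y = make_classification(random_state=0)
X = pd.DataFrame(X, columns=["col" + str(i) for i in range(X.shape[1])])

clf = RandomForestClassifier(criterion="entropy", bootstrap=True,
                             max_depth=4, min_samples_leaf=7, random_state=0)
clf.fit(X, y)

# The forest itself has no tree_ attribute; each fitted member tree does.
forest_has_tree = hasattr(clf, "tree_")
member_has_tree = hasattr(clf.estimators_[5], "tree_")

# Draw one member tree (index 5, as in Option 2) with matplotlib.
fig, ax = plt.subplots(figsize=(14, 7))
annotations = plot_tree(clf.estimators_[5], feature_names=list(X.columns),
                        precision=2, filled=False, ax=ax)
fig.savefig("forest_tree_5.png")  # hypothetical output file name
plt.close(fig)
```

Since plot_tree draws with matplotlib, its result does not need to be passed to graphviz.Source. To inspect several trees, loop over clf.estimators_ and save one figure per tree.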