Tags: pandas, anaconda, jupyter-notebook, graphviz, decision-tree

How to show Feature Names in Graphviz?


I'm building a tree in Graphviz, but I can't get the feature names to show up. I have defined a list with the feature names like so:

    names = list(df.columns.values)

Which prints:

    ['Gender',
     'SuperStrength',
     'Mask',
     'Cape',
     'Tie',
     'Bald',
     'Pointy Ears',
     'Smokes']

So the list is being created. Later I build the tree like so:

    export_graphviz(tree, out_file=ddata, filled=True, rounded=True, special_characters=False, impurity=False, feature_names=names)

But the final image still labels the features as X[0], X[1], and so on instead of using their names.

How can I get the actual feature names to show up? (Cape instead of X[3], etc.)


Solution

  • I can only imagine this has to do with passing the names as a list built from the array of values (df.columns.values). It works fine if you pass the columns directly:

    export_graphviz(tree, out_file=ddata, filled=True, rounded=True, special_characters=False, impurity=False, feature_names=df.columns)
    

    If needed, you can also slice the columns:

    export_graphviz(tree, out_file=ddata, filled=True, rounded=True, special_characters=False, impurity=False, feature_names=df.columns[5:])
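
To check whether the names actually reach the exporter without rendering an image, a minimal sketch along these lines may help. It assumes scikit-learn and pandas are installed. The toy DataFrame, the labels, and the variable names are made up for illustration and only reuse three of the column names from the question. Passing out_file=None makes export_graphviz return the DOT source as a string, so it can be inspected directly. The sketch passes list(df.columns); passing df.columns itself, as suggested above, should behave the same on versions that accept it.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Hypothetical toy data; column names borrowed from the question.
df = pd.DataFrame({
    "Mask": [1, 0, 1, 0, 1, 0],
    "Cape": [1, 1, 0, 0, 1, 0],
    "Bald": [0, 1, 0, 1, 0, 0],
})
# Made-up labels that coincide with the "Cape" column,
# so the tree's only useful split is on "Cape".
target = [1, 1, 0, 0, 1, 0]

tree = DecisionTreeClassifier(random_state=0).fit(df, target)

# out_file=None returns the DOT source as a string instead of writing a file.
dot_source = export_graphviz(
    tree,
    out_file=None,
    filled=True,
    rounded=True,
    special_characters=False,
    impurity=False,
    feature_names=list(df.columns),  # same columns, same order as in training
)

# The node labels should mention "Cape" rather than an index placeholder.
print(dot_source)
```

If the printed DOT source shows the column names but the rendered image does not, the image may have been produced from an earlier export. When slicing, as in df.columns[5:], the slice should contain exactly the columns the tree was trained on, in the same order.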