It is possible to visualize decision trees using pydotplus from PyPI, but it has issues on my machine (it says it was not built with libexpat, and thus it only shows a number on each node instead of a table with some information), so I'd like to use an alternative. I already tried using networkx, but it requires pygraphviz to read .dot files and make a networkx graph out of them. When I tried to install it using pip, that also failed.
So now I am looking for an alternative way of visualizing decision trees, which can be installed using pip or anaconda.
Which alternatives exist?
SciPy version: 0.17.0
digraph Tree {
node [shape=box, style="filled", color="black"] ;
0 [label="grade.B <= 0.5\ngini = 0.5\nsamples = 37224\nvalue = [18476, 18748]", fillcolor="#399de504"] ;
1 [label="grade.C <= 0.5\ngini = 0.4973\nsamples = 32094\nvalue = [17218, 14876]", fillcolor="#e5813923"] ;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
2 [label="gini = 0.4829\nsamples = 21728\nvalue = [12875, 8853]", fillcolor="#e5813950"] ;
1 -> 2 ;
3 [label="gini = 0.4869\nsamples = 10366\nvalue = [4343, 6023]", fillcolor="#399de547"] ;
1 -> 3 ;
4 [label="grade.A <= 14.8301\ngini = 0.3702\nsamples = 5130\nvalue = [1258, 3872]", fillcolor="#399de5ac"] ;
0 -> 4 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
5 [label="gini = 0.3555\nsamples = 4987\nvalue = [1153, 3834]", fillcolor="#399de5b2"] ;
4 -> 5 ;
6 [label="gini = 0.3902\nsamples = 143\nvalue = [105, 38]", fillcolor="#e58139a3"] ;
4 -> 6 ;
}
I programmed this in a Jupyter notebook, but it has a bug: the SVG is not colored if you try to display it using:

I found a work-around here:
from IPython.display import HTML
svg = None
with open('dtree.svg') as svg_file:
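    # The original line was cut off after "svg ="; reading the file and
    # passing the markup to HTML() is the assumed completion.
    svg = svg_file.read()
HTML(svg)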
It's not the sexiest solution, but I use the Graphviz CLI (it's called dot), called via subprocess. I'm on a Mac, so I installed it with Homebrew, but you can download binaries for other platforms from their downloads page. Here's an example using the Titanic dataset:
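import pandas as pd
import subprocess
import seaborn.apionly as sns  # on newer seaborn versions: import seaborn as sns
# On scikit-learn >= 0.20, use: from sklearn.impute import SimpleImputer
from sklearn.preprocessing import Imputer
from sklearn.tree import DecisionTreeClassifier, export_graphviz
raw_data = sns.load_dataset('titanic')
predictors = ['pclass','sex','age','sibsp','parch','fare','embarked','alone','adult_male']
categorical = ['sex','embarked']
numeric = [c for c in predictors if c not in categorical]
# Assumption: the definition of target was lost; 'survived' is the label column of this dataset.
target = 'survived'
# One-hot encode the categorical columns, then fill missing values.
encoded_data = pd.get_dummies(raw_data[predictors], columns=categorical)
imputer = Imputer()
X = imputer.fit_transform(encoded_data).astype('float32')
Y = raw_data[target].astype('float32')
# The text between the constructor call and ", Y)" was lost; a fit on X and Y is assumed.
model = DecisionTreeClassifier(min_samples_leaf=10, max_depth=3).fit(X, Y)
# The .dot filename and every export argument except impurity=False were lost; these are assumed.
dot_file = 'tree.dot'
export_graphviz(model, out_file=dot_file, feature_names=list(encoded_data.columns), impurity=False)
# Render the .dot file to a PDF with the Graphviz CLI.
subprocess.check_call(['dot', '-Tpdf', dot_file, '-o', 'tree.pdf'])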
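If you call dot from several places, it may help to wrap the subprocess call in a small helper that first checks whether the Graphviz CLI is on the PATH. The following is only a sketch of that idea: the function names build_dot_command and render_dot are made up for illustration, and the file names in the usage note are placeholders, not something from the example above.

```python
import shutil
import subprocess


def build_dot_command(dot_path, out_path, fmt="pdf"):
    """Return the argument list for the Graphviz dot CLI (hypothetical helper)."""
    # -T selects the output format, -o the output file.
    return ["dot", "-T" + fmt, dot_path, "-o", out_path]


def render_dot(dot_path, out_path, fmt="pdf"):
    """Render dot_path to out_path; return False if dot is not installed."""
    if shutil.which("dot") is None:
        # Graphviz is not on the PATH, so there is nothing to call.
        return False
    # check=True raises CalledProcessError if dot exits with an error.
    subprocess.run(build_dot_command(dot_path, out_path, fmt), check=True)
    return True
```

For example, render_dot('tree.dot', 'tree.svg', fmt='svg') would write an SVG instead of a PDF, which is probably the more convenient format if you want to show the tree inside a notebook.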