scikit-learn, visualization, google-colaboratory, decision-tree, dtreeviz

Visualise a decision tree in Colaboratory


What is the best way to visualise a decision tree in Google Colab? The visualisations from 'dtreeviz' (e.g. on GitHub) are really neat, but when I run something like

!pip install dtreeviz

and

from sklearn.datasets import *
from sklearn import tree
from dtreeviz.trees import *

followed by

classifier = tree.DecisionTreeClassifier(max_depth=4)
cancer = load_breast_cancer()
classifier.fit(cancer.data, cancer.target)
viz = dtreeviz(classifier,
              cancer.data,
              cancer.target,
              target_name='cancer',
              feature_names=cancer.feature_names, 
              class_names=["malignant", "benign"],
              fancy=False )  

viz.view()

I get

ExecutableNotFound: failed to execute ['dot', '-Tsvg', '-o', '/tmp/DTreeViz_62.svg', '/tmp/DTreeViz_62'], make sure the Graphviz executables are on your systems' PATH

Could this have something to do with Colab running via my Google Drive?

Any help appreciated!


Solution

    Short answer

    • Make sure graphviz is installed via !apt-get install graphviz
    • You can get the created SVG via viz.svg()
    • Wrap the output in IPython's HTML and then call display to show it in your notebook

      from IPython.display import display, HTML
      display(HTML(viz.svg()))
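
    Before rendering, it can help to confirm that the Graphviz dot binary is actually visible to the notebook's Python process, since its absence is what triggers the ExecutableNotFound error in the question. The following is a minimal sketch using only the standard library; the helper name graphviz_available is hypothetical and not part of dtreeviz or graphviz.

```python
import shutil


def graphviz_available(executable="dot"):
    """Return True if the given Graphviz executable is found on PATH."""
    return shutil.which(executable) is not None


if not graphviz_available():
    # In Colab, run this in a cell first: !apt-get install graphviz
    print("Graphviz 'dot' not found on PATH; install graphviz before calling dtreeviz.")
```

    If the check fails after installing, restarting the runtime may be worth trying, although this sketch does not cover that case.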
      

    Longer answer

    • dtreeviz's view() writes an SVG file to your temp directory
    • This file is passed to the graphviz library, which opens it in a way that depends on your OS
    • Google Colab is recognized as Linux, so graphviz tries to open the SVG file with the default viewing application
    • That last step goes nowhere unless you are running the notebook locally (the Google server probably ends up with either a couple of open SVG images or some error messages)
    • The code from the short answer instead takes the SVG markup as a string and displays it directly in the notebook
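
    The two paths described above can be sketched as a small helper that renders inline when running inside Colab and falls back to view() elsewhere. This is only an illustration: show_tree and running_in_colab are hypothetical names, and checking sys.modules for google.colab is a common heuristic rather than a dtreeviz feature. It assumes viz is the object returned by dtreeviz(...) in the question, exposing svg() and view().

```python
import sys


def running_in_colab():
    """Heuristic: the google.colab package is loaded inside Colab runtimes."""
    return "google.colab" in sys.modules


def show_tree(viz):
    """Render a dtreeviz result inline in Colab, or via the OS viewer locally."""
    if running_in_colab():
        # Imported lazily so the helper also works where IPython is absent.
        from IPython.display import HTML, display
        display(HTML(viz.svg()))  # embed the SVG markup in the cell output
        return "inline"
    viz.view()  # hands the SVG file to the default viewer application
    return "external"
```

    Calling show_tree(viz) in place of viz.view() keeps the same notebook usable both locally and on Colab.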