
HDBSCAN dendrogram with Plotly, Python


Creating a dendrogram with plotly.figure_factory.create_dendrogram has been discussed before. I decided to use HDBSCAN as the clustering algorithm and would like to visualize the clusters with Plotly.

import hdbscan

clusterer = hdbscan.HDBSCAN(
    algorithm='best',
    alpha=1.0,
    approx_min_span_tree=False,
    gen_min_span_tree=True,
    metric='hamming',
    min_cluster_size=2,
    min_samples=10,
    allow_single_cluster=True,
    p=None)
# df_matrix: the feature matrix, one row per sample
clusters = clusterer.fit_predict(df_matrix)

How can I extract a dendrogram out of the code above? Thanks!


Solution

  • You can use

    clusterer.condensed_tree_.to_pandas()

    to export the condensed tree as a pandas DataFrame with one row per parent-child link (columns parent, child, lambda_val, child_size). More methods are available, including ones for plotting. See the docs: https://hdbscan.readthedocs.io/en/latest/advanced_hdbscan.html