python cluster-analysis seaborn hierarchical-clustering

Seaborn clustermap: what's the main argument, observations or distances?


What is the data argument for seaborn's clustermap?

Does it take a matrix in which each cell is the distance between two vectors of the original observation matrix? Or does clustermap calculate the distances itself, so that I need to pass the observation matrix?

In the first case, what is the metric argument for? Is it there to indicate which metric was used to calculate the distances?


Solution

  • Look at the examples on the page you linked: clustermap expects a data table of observations, plus a metric to use for computing the distances between them.

    As the documentation of clustermap states, the metric argument is handed to scipy.spatial.distance.pdist, which computes the pairwise distances.

    I do not see an option to use a precomputed distance matrix, although it may be possible to pass a custom function that does a matrix lookup.
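
    If you already have a distance matrix, one workaround that avoids a custom metric is to build the linkage yourself with scipy and pass it through clustermap's row_linkage and col_linkage parameters. The following is a minimal sketch rather than a definitive recipe: the toy data, the helper names (linkage_from_distances, clustermap_from_observations, clustermap_from_distances) and the choice of "average" linkage with the "cityblock" metric are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform


def linkage_from_distances(dist_square, method="average"):
    """Hypothetical helper: linkage matrix from a square distance matrix."""
    # linkage() wants the condensed (upper-triangle) form. A square 2-D
    # array would be interpreted as observations, not as distances.
    condensed = squareform(np.asarray(dist_square), checks=True)
    return linkage(condensed, method=method)


def clustermap_from_observations(data, metric="euclidean", method="average"):
    """Usual case: pass observations and let clustermap compute distances."""
    import seaborn as sns  # imported here so the helper above needs only scipy

    return sns.clustermap(data, metric=metric, method=method)


def clustermap_from_distances(dist_df, method="average"):
    """Precomputed case: cluster on the given distances, plot them as a heatmap."""
    import seaborn as sns

    Z = linkage_from_distances(dist_df.values, method=method)
    # The same linkage is used for rows and columns because the matrix is symmetric.
    return sns.clustermap(dist_df, row_linkage=Z, col_linkage=Z)


# Toy observation matrix (made-up data): 6 observations, 4 features.
rng = np.random.default_rng(0)
obs = pd.DataFrame(rng.normal(size=(6, 4)), index=list("abcdef"))

# A precomputed square distance matrix, as the question describes.
dist = pd.DataFrame(
    squareform(pdist(obs.values, metric="cityblock")),
    index=obs.index,
    columns=obs.index,
)

# Both routes should produce the same hierarchy.
Z_pre = linkage_from_distances(dist.values)
Z_direct = linkage(obs.values, method="average", metric="cityblock")
```

    Calling clustermap_from_observations(obs, metric="cityblock") or clustermap_from_distances(dist) then draws the figure. Note that passing dist straight to clustermap without the linkage arguments would treat each row of distances as an observation vector and cluster on distances between those rows, which is usually not what is intended.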