Tags: matlab, plot, cluster-analysis, hierarchical-clustering, dendrogram

How to display separate disconnected trees in MATLAB when doing hierarchical clustering and producing dendrograms?


I am working with MATLAB, and I have an adjacency matrix:

mat =

 0     1     0     0     0     0
 1     0     0     0     1     0
 0     0     0     1     0     0
 0     0     1     0     0     1
 0     1     0     0     0     0
 0     0     0     1     0     0

which is not fully connected. Nodes {1,2,5} are connected, and {3,4,6} are connected (the edges are directed).

I would like to see the separate clusters in a dendrogram on a single plot. Since there is no path from one cluster to the other, I would like to see separate trees with separate roots for each cluster. I am using the commands:

mat=zeros(6,6)
mat(1,2)=1;mat(2,1)=1;mat(5,2)=1;mat(2,5)=1;
mat(6,4)=1;mat(4,6)=1;mat(3,4)=1;mat(4,3)=1;
Y=pdist(mat)
squareform(Y)
Z=linkage(Y)
figure()
dendrogram(Z)

These commands are the ones recommended in Hierarchical Clustering. The result is shown in the attached dendrogram image. Apart from the labels not making sense, the whole tree is connected, and I cannot figure out how to get several disconnected trees that reflect the disconnected nature of the data. I would like to avoid multiple plots, as I wish to work with larger datasets that may have many disjoint clusters.
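
A note on why the tree comes out fully connected: by default, pdist treats each row of mat as an observation and returns Euclidean distances between rows, so it compares neighbour lists rather than measuring whether two nodes are linked. A minimal sketch using the matrix from the question (the variable name D is only for illustration):

```matlab
% Adjacency matrix from the question.
mat = zeros(6);
mat(1,2) = 1; mat(2,1) = 1; mat(2,5) = 1; mat(5,2) = 1;
mat(3,4) = 1; mat(4,3) = 1; mat(4,6) = 1; mat(6,4) = 1;

% 6-by-6 matrix of Euclidean distances between the ROWS of mat.
D = squareform(pdist(mat));

d15 = D(1,5);   % rows 1 and 5 are identical, so the distance is 0
d12 = D(1,2);   % rows 1 and 2 differ in three entries, so sqrt(3)
```

Nodes 1 and 5 are not adjacent, yet they end up at distance zero because they share the same single neighbour, which is why the leaf grouping in the plot does not follow the edges.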


Solution

  • I see this was asked a while ago, but in case you're still interested, here's something to try:

    First extract the values above the diagonal from the adjacency matrix, row by row, like so:

    >> matY = [];
    >> for n = 1:5
    for m = n+1:6
    matY = [matY mat(n,m)];
    end
    end
    >> matY
    
    matY =
    
      Columns 1 through 13
    
         1     0     0     0     0     0     0     1     0     1     0     0     0
    
      Columns 14 through 15
    
         1     0
    

    Now you have something that looks like the Y vector pdist would have produced. But the values here are the opposite of what you probably want; the unconnected vertices have a "distance" of zero and the connected ones are one apart. Let's fix that:

    >> matY(matY == 0) = 10
    
    matY =
    
      Columns 1 through 13
    
         1    10    10    10    10    10    10     1    10     1    10    10    10
    
      Columns 14 through 15
    
         1    10
    

    Better. Now we can compute a regular cluster tree, which will represent connected vertices as "close together" and non-connected ones as "far apart". With the default single linkage, vertices 1, 2 and 5 merge at height 1, as do 3, 4 and 6, and the two groups join only at height 10:

    >> Z = linkage(matY);
    >> dendrogram(Z)
    

    The resulting diagram:

    [image: adjacency dendrogram]

    Hope this is a decent approximation of what you're looking for.
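
    A loop-free variant, as a sketch rather than a definitive recipe: assuming mat is symmetric with a zero diagonal, squareform can build the condensed distance vector directly. The cutoff of 5 is an arbitrary value chosen to sit between the two distances (1 and 10), and the names D, y, Z and T are only for illustration. Cutting the tree at that height recovers the groups, and the 'ColorThreshold' option colors each subtree below the cutoff separately on a single plot.

    ```matlab
    % Adjacency matrix from the question (symmetric, zero diagonal).
    mat = zeros(6);
    mat(1,2) = 1; mat(2,1) = 1; mat(2,5) = 1; mat(5,2) = 1;
    mat(3,4) = 1; mat(4,3) = 1; mat(4,6) = 1; mat(6,4) = 1;

    % Connected pairs get distance 1, all other pairs 10 (an arbitrary
    % "far apart" value); squareform needs a zero diagonal.
    D = 10 - 9*mat;
    D(logical(eye(size(D)))) = 0;
    y = squareform(D);            % condensed vector in pdist order

    Z = linkage(y, 'single');     % single linkage is the default method

    % Cluster labels obtained by cutting the tree at the assumed cutoff.
    T = cluster(Z, 'cutoff', 5, 'criterion', 'distance');

    % One plot, with each subtree below the cutoff in its own color.
    figure
    dendrogram(Z, 'ColorThreshold', 5);
    ```

    The tree is still joined at the top (at height 10), so this only separates the groups visually; T gives the group membership if you need it programmatically.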