Wrong dendrogram generated


I have 31 data points, but the dendrogram is missing one of them. Here is my code:

A = csvread('similarityNoGrpS2.csv',1,1) % 31x31 double
Z = linkage(A, 'average') % 30x3 double
H = dendrogram(Z,'Orientation','left','ColorThreshold','default') %29x1 line

My input file can be found here.

Here is my dendrogram:

[image: dendrogram output]

According to Z, (24,30) and (27,31) should each be clustered together, but in the dendrogram picture there is no 31, and 27 is clustered with 30, which is wrong!

Can anyone help me with this?

P.S. I'm using MATLAB R2016a.


Solution

  • You need to modify the last line of your code to this:

    H = dendrogram(Z, 0, 'Orientation', 'left', 'ColorThreshold', 'default');
    

    which for the given data gives:

    [image: dendrogram with all 31 leaves]


    Explanation

    Your original data set (A) has 31 points, which is more than 30, but you did not specify a value for P. The documentation states:

    If you do not specify P then dendrogram uses 30 as the maximum number of leaf nodes. To display the complete tree, set P equal to 0.

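    With 31 points, the default limit collapses the lowest branch so that two points share a single leaf, which is why only 30 leaves appear. To show all of them, pass P = 0 using this syntax: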

    dendrogram(tree,P,Name,Value)
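
To see the effect of P without the original CSV, here is a minimal sketch that uses random stand-in data (the matrix X, the seed, and all variable names are assumptions for illustration, not the asker's data). It relies on the additional outputs of dendrogram, T (the leaf node assigned to each observation) and outperm (the order of the leaves in the plot), and requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in data: 31 random observations with 4 features each.
rng(0);                          % fix the seed so the run is repeatable
X = rand(31, 4);
Z = linkage(X, 'average');       % 30x3 merge table for 31 observations

% Default P = 30: the tree is limited to 30 leaf nodes.
figure;
[H30, T30, perm30] = dendrogram(Z, 'Orientation', 'left');
fprintf('default: %d leaves, %d lines\n', numel(perm30), numel(H30));

% List every leaf node that holds more than one observation.
counts = accumarray(T30, 1);     % number of observations per leaf node
shared = find(counts > 1);
for k = shared'
    fprintf('leaf %d holds observations %s\n', k, mat2str(find(T30 == k)'));
end

% P = 0: the complete tree, one leaf per observation.
figure;
[H0, T0, perm0] = dendrogram(Z, 0, 'Orientation', 'left');
fprintf('P = 0:   %d leaves, %d lines\n', numel(perm0), numel(H0));
```

In the collapsed plot, the tick labels should be read as leaf node numbers rather than original row indices, so find(T30 == k) is the way to recover which observations sit behind leaf k. That would explain why a label such as 31 never appears when only 30 leaves are drawn.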