I am using a hierarchical clustering algorithm to cluster my dataset into different numbers of clusters. For instance:
a= [1;2;3;4;5;20;21;22;28;29]
Z=linkage(a,'ward')
[clusterIndexes]=cluster(Z,'maxclust',2)
This snippet splits the data into two clusters. The first holds 1,2,3,4,5; let's call it cluster A. The second holds 20,21,22,28,29; let's call it cluster B.
When I run the following script to cluster the data into 3 clusters:
a= [1;2;3;4;5;20;21;22;28;29]
Z=linkage(a,'ward')
[clusterIndexes]=cluster(Z,'maxclust',3)
it gives me three clusters: (1,2,3,4,5) = cluster X, (20,21,22) = cluster Y, and (28,29) = cluster Z.
How can I demonstrate programmatically that cluster B was split into cluster Y and cluster Z?
Sorry for the naive question; I am very new to MATLAB.
You can use setxor to identify any differences between two sets. Use union to merge clusterY and clusterZ before comparing the result to clusterB. In the example below the result is empty, so the two sets contain exactly the same numbers. If an element appeared in only one of them, setxor would return it.
clusterB = [20 21 22 28 29];
clusterY = [20 21 22];
clusterZ = [28 29];
setxor(clusterB, union(clusterY, clusterZ))
ans =
1×0 empty double row vector
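If a yes/no answer is more convenient than reading the output, the same check can be wrapped in isempty. This is only a minimal sketch; the variable names coversB, disjoint and isExactSplit are my own, and the extra intersect test is an addition that confirms clusterY and clusterZ do not overlap.

```matlab
clusterB = [20 21 22 28 29];
clusterY = [20 21 22];
clusterZ = [28 29];

% True when Y and Z together contain exactly the elements of B
coversB = isempty(setxor(clusterB, union(clusterY, clusterZ)));

% True when Y and Z share no elements
disjoint = isempty(intersect(clusterY, clusterZ));

% Hypothetical flag: B was split into Y and Z with nothing lost or added
isExactSplit = coversB && disjoint
```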
Suppose, for example, that clusterB had an additional number. You can see the result below.
clusterB = [5 20 21 22 28 29];
setxor(clusterB, union(clusterY, [clusterZ]))
ans =
5
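The vectors above were typed in by hand. To work directly from the output of cluster, one option is to compare the two index vectors and report which cluster of the 2-cluster solution maps onto more than one cluster of the 3-cluster solution. The following is a rough sketch rather than a definitive method: it assumes the Statistics and Machine Learning Toolbox is available (as in the question), that both cuts come from the same linkage tree Z, and the names idx2, idx3 and children are my own. The cluster numbers MATLAB assigns are arbitrary, so the code reports whatever labels it finds.

```matlab
a = [1;2;3;4;5;20;21;22;28;29];
Z = linkage(a, 'ward');

% Cut the same tree at two levels
idx2 = cluster(Z, 'maxclust', 2);   % labels for the 2-cluster solution
idx3 = cluster(Z, 'maxclust', 3);   % labels for the 3-cluster solution

% For each cluster in the 2-cluster solution, list the 3-cluster labels
% that its members received
for k = unique(idx2)'
    children = unique(idx3(idx2 == k));
    if numel(children) > 1
        fprintf('Cluster %d (members %s) splits into clusters %s\n', ...
            k, mat2str(a(idx2 == k)'), mat2str(children'));
    end
end
```

Any cluster for which children has more than one element is one that was split when going from 2 to 3 clusters.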