Tags: machine-learning, cluster-analysis, data-mining, hierarchical-clustering, elki

ELKI hierarchical clustering - "mrg_" Cluster object


I'm using ELKI's SimplifiedHierarchyExtraction with AnderbergHierarchicalClustering, LatLngDistanceFunction and minClSize = 100.

I noticed that besides the "clu_" clusters there are also 2-3 "mrg_" clusters that contain some DBIDs, but fewer than minClSize of them.

My question is: what is the best way to handle these "mrg_" clusters?

  • passing their DBIDs to one of their "clu_" children?
  • taking them as a cluster although they are under the minClSize?
  • simply ignoring them?

Solution

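  • This is a hierarchical result.

    The members of a cluster include all objects of its child clusters.

    So a mrg_ cluster consists of some (potentially 0) new objects of its own, plus all the objects in its child clusters. In particular, it can have more than one child cluster (that is why it is called a merge). This is also why the DBIDs stored directly on a mrg_ cluster can number fewer than minClSize: they are only the objects added at the merge, not the full membership.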
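
To illustrate the point above, here is a minimal sketch in plain Java of collecting the full membership of a merge cluster by walking its children. It deliberately does not use the ELKI API: `Node`, `ownIds`, `children` and `allMembers` are hypothetical names made up for this example, and the integer ids merely stand in for DBIDs. With ELKI you would walk the cluster hierarchy of the clustering result in the same way.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MergeClusterDemo {
    /** Hypothetical stand-in for a cluster in a hierarchy (not an ELKI class). */
    static final class Node {
        final String name;
        // Objects stored directly on this cluster (for a merge: only the new ones).
        final Set<Integer> ownIds = new LinkedHashSet<>();
        final List<Node> children = new ArrayList<>();

        Node(String name, int... ids) {
            this.name = name;
            for (int id : ids) {
                ownIds.add(id);
            }
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Returns the node's own ids plus the ids of all its descendants. */
    static Set<Integer> allMembers(Node node) {
        Set<Integer> out = new LinkedHashSet<>(node.ownIds);
        for (Node child : node.children) {
            out.addAll(allMembers(child));
        }
        return out;
    }

    public static void main(String[] args) {
        Node a = new Node("clu_a", 1, 2, 3);
        Node b = new Node("clu_b", 4, 5);
        // The merge cluster holds one new object of its own and two children.
        Node merge = new Node("mrg_ab", 6).add(a).add(b);

        System.out.println(merge.name + " own=" + merge.ownIds.size()
            + " total=" + allMembers(merge).size());
    }
}
```

In this toy hierarchy the merge cluster stores only one object itself but has six members in total, so a size check against minClSize should be made on the collected set rather than on the directly stored ids.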