c++ c++14 cluster-analysis k-means point-cloud-library

pcl::EuclideanClusterExtraction vs pcl::Kmeans

The PCL tutorial shows how to segment a plane and extract Euclidean clusters from a point cloud. Now that I have run the pcl::EuclideanClusterExtraction algorithm, I need the centroid (the mean position) of each cluster.

With pcl::EuclideanClusterExtraction I have to calculate the centroids myself with for loops. While searching, I found pcl::Kmeans, which directly provides a function get_centroids() to get the centroids of the clusters: https://pointclouds.org/documentation/classpcl_1_1_kmeans.html#a8788bd4098ea370e018119fc516a5eb4
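For reference, the loop-based centroid computation can be sketched roughly as follows. This is a minimal sketch in plain C++, not real PCL code: Point3, ClusterIndices and clusterCentroids are hypothetical stand-ins for pcl::PointXYZ, the index lists stored in pcl::PointIndices, and whatever helper you would write yourself. PCL also ships pcl::compute3DCentroid in <pcl/common/centroid.h>, which may replace the inner loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for pcl::PointXYZ.
struct Point3 {
    double x, y, z;
};

// Each inner vector holds the point indices of one cluster, mirroring
// the `indices` member of pcl::PointIndices.
using ClusterIndices = std::vector<std::vector<int>>;

// Returns one centroid (arithmetic mean of the member points) per cluster.
std::vector<Point3> clusterCentroids(const std::vector<Point3>& cloud,
                                     const ClusterIndices& clusters) {
    std::vector<Point3> centroids;
    centroids.reserve(clusters.size());
    for (const auto& indices : clusters) {
        assert(!indices.empty());  // an empty cluster has no centroid
        Point3 sum{0.0, 0.0, 0.0};
        for (int idx : indices) {
            const Point3& p = cloud[static_cast<std::size_t>(idx)];
            sum.x += p.x;
            sum.y += p.y;
            sum.z += p.z;
        }
        const double n = static_cast<double>(indices.size());
        centroids.push_back({sum.x / n, sum.y / n, sum.z / n});
    }
    return centroids;
}
```

Calling clusterCentroids(cloud, clusters) with the index lists produced by the extraction step yields one mean position per cluster, in the same order as the clusters.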

Now I'm a little confused. What is the real difference in application between pcl::EuclideanClusterExtraction and pcl::Kmeans? After analysing the source code, pcl::EuclideanClusterExtraction gives us clusters based on three parameters (the cluster tolerance and the minimum and maximum cluster size). pcl::Kmeans seems to be meant for cases where we decide in advance how many clusters we want to generate, judging by the arguments of the constructor Kmeans(unsigned int num_points, unsigned int num_dimensions).

Is that true? Are there any other cases?

Solution

These are two very different algorithms:

k-means clustering iteratively finds roughly spherical clusters of points (usually high-dimensional), where cluster affinity is based on the distance to the cluster center. From a math point of view: it chooses centroids that minimize the sum of squared distances from the cluster points to their centroid, and, as mentioned, each point belongs to the cluster with the nearest centroid.
cluster_extraction (pcl::EuclideanClusterExtraction) is a greedy region-growing algorithm based on nearest neighbors. Cluster affinity is based on the distance to any point of the cluster (the cluster tolerance parameter).
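To make the region-growing idea concrete, here is a minimal sketch in plain C++. It is not PCL's implementation: Point2 and regionGrow are hypothetical names, the neighbor search is brute force (a k-d tree would normally be used), and the minimum and maximum cluster size filters are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

struct Point2 {
    double x, y;
};

// Greedy region growing: a point joins a cluster if it lies within
// `tolerance` of ANY point that is already part of that cluster.
std::vector<std::vector<std::size_t>> regionGrow(const std::vector<Point2>& pts,
                                                 double tolerance) {
    assert(tolerance > 0.0);
    const double tol2 = tolerance * tolerance;
    std::vector<bool> visited(pts.size(), false);
    std::vector<std::vector<std::size_t>> clusters;

    for (std::size_t seed = 0; seed < pts.size(); ++seed) {
        if (visited[seed]) continue;  // already claimed by an earlier cluster
        visited[seed] = true;

        std::vector<std::size_t> cluster;
        std::queue<std::size_t> frontier;
        frontier.push(seed);

        while (!frontier.empty()) {
            const std::size_t cur = frontier.front();
            frontier.pop();
            cluster.push_back(cur);
            // Brute-force O(n) scan for unvisited neighbors of `cur`.
            for (std::size_t j = 0; j < pts.size(); ++j) {
                if (visited[j]) continue;
                const double dx = pts[j].x - pts[cur].x;
                const double dy = pts[j].y - pts[cur].y;
                if (dx * dx + dy * dy <= tol2) {
                    visited[j] = true;
                    frontier.push(j);
                }
            }
        }
        clusters.push_back(cluster);
    }
    return clusters;
}
```

As an illustration, take two parallel rows of ten points each, with spacing 1 inside a row and a gap of 2 between the rows. A tolerance of 1.5 chains each row into its own cluster, a tolerance of 2.5 merges everything into one cluster, and a tolerance of 0.5 leaves every point on its own. Note that the number of clusters is an output here, not an input.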

Cluster extraction with a tolerance larger than the distance among the blue dots and among the black dots, but smaller than the distance between the black and the blue dots:

k-means with k=2
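The k-means behavior can be reproduced with a small sketch of the classic assign-then-update iteration (Lloyd's algorithm) in plain C++. This is an illustration under stated assumptions, not pcl::Kmeans itself: Point2, sqDist and kMeans are hypothetical names, and the initial centroids are passed in by the caller so the run is deterministic, whereas real implementations usually pick them randomly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point2 {
    double x, y;
};

static double sqDist(const Point2& a, const Point2& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// `centroids` holds the initial guesses on entry (k = centroids.size())
// and the final centroids on return. Returns the cluster label per point.
std::vector<std::size_t> kMeans(const std::vector<Point2>& pts,
                                std::vector<Point2>& centroids,
                                int maxIterations = 100) {
    assert(!centroids.empty());
    std::vector<std::size_t> label(pts.size(), 0);

    for (int it = 0; it < maxIterations; ++it) {
        bool changed = (it == 0);  // always run at least one update step
        // Assignment step: each point goes to its nearest centroid.
        for (std::size_t i = 0; i < pts.size(); ++i) {
            std::size_t best = 0;
            for (std::size_t c = 1; c < centroids.size(); ++c) {
                if (sqDist(pts[i], centroids[c]) < sqDist(pts[i], centroids[best])) {
                    best = c;
                }
            }
            if (best != label[i]) {
                label[i] = best;
                changed = true;
            }
        }
        if (!changed) break;  // assignments are stable: converged

        // Update step: move each centroid to the mean of its points.
        std::vector<Point2> sum(centroids.size(), Point2{0.0, 0.0});
        std::vector<std::size_t> count(centroids.size(), 0);
        for (std::size_t i = 0; i < pts.size(); ++i) {
            sum[label[i]].x += pts[i].x;
            sum[label[i]].y += pts[i].y;
            ++count[label[i]];
        }
        for (std::size_t c = 0; c < centroids.size(); ++c) {
            if (count[c] == 0) continue;  // keep the centroid of an empty cluster
            const double n = static_cast<double>(count[c]);
            centroids[c] = {sum[c].x / n, sum[c].y / n};
        }
    }
    return label;
}
```

As an illustration, take two parallel rows of ten points each (x = 0..9 at y = 0 and at y = 2). Starting from the initial centroids (0, 0) and (9, 2), the sketch converges to the centroids (2, 1) and (7, 1): it cuts both rows into a left and a right half instead of separating the rows, because that split gives more compact clusters. Other starting points can converge elsewhere; starting at the two row centers, for example, keeps the rows apart.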