I have the following points in 3D space:
I need to group the points, according to D_max
and d_max
:
D_max = max dimension of each group
d_max = max distance of points inside each group
Like this:
The shape of the group in the above image looks like a box, but the shape can be anything which would be the output of the grouping algorithm.
I'm using Python and visualize the results with Blender. I'm considering using the scipy.spatial.KDTree and calling its query API, however, I'm not sure if that's the right tool for the job at hand. I'm worried that there might be a better tool which I'm not aware of. I'm curious to know if there is any other tool/library/algorithm which can help me.
As @CoMartel pointed out, there is DBSCAN and also HDBSCAN clustering modules which look like a good fit for this type of problems. However, as pointed out by @Paul they lack the option for max size of the cluster which correlates to my D_max
parameter. I'm not sure how to add a max cluster size feature to DBSCAN and HDBSCAN clustering.
Thanks to @Anony-Mousse I watched Agglomerative Clustering: how it works and Hierarchical Clustering 3: single-link vs. complete-link and I'm studying Comparing Python Clustering Algorithms, I feel like it's getting more clear how these algorithms work.
As requested, my comment as an answer :
You could use DBSCAN(http://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html) or HDBSCAN.
Both these algorithm allow to group each point according to d_max (maximum distance between 2 points of the same dataset), but they don't take the maximum cluster size. The only way to limit the maximum size of a cluster is by reducing the eps
parameter, which control the max distance between 2 points of the same cluster.