I have applied DBSCAN to perform clustering on a dataset consisting of X, Y and Z coordinates of each point in a point cloud. I want to plot only the clusters which have less than 100 points. This is what I have so far:
clustering = DBSCAN(eps=0.1, min_samples=20, metric='euclidean').fit(only_xy)
plt.scatter(only_xy[:, 0], only_xy[:, 1],
c=clustering.labels_, cmap='rainbow')
clusters = clustering.components_
#Store the labels
labels = clustering.labels_
#Then get the frequency count of the non-negative labels
counts = np.bincount(labels[labels>=0])
print(counts)
Output:
[1278 564 208 47 36 30 191 54 24 18 40 915 26 20
24 527 56 677 63 57 61 1544 512 21 45 187 39 132
48 55 160 46 28 18 55 48 35 92 29 88 53 55
24 52 114 49 34 34 38 52 38 53 69]
So I have found the number of points in each cluster, but I'm not sure how to select only the clusters which have less than 100 points.
You may find indexes of the labels where you have counts less than 100:
ls, cs = np.unique(labels,return_counts=True)
dic = dict(zip(ls,cs))
idx = [i for i,label in enumerate(labels) if dic[label] <100 and label >= 0]
Then you may apply resulting index to your DBSCAN results and labels like (more or less):
plt.scatter(only_xy[idx, 0], only_xy[idx, 1],
c=clustering.labels_[idx], cmap='rainbow')