
How to tag data point to a cluster?


I have run a DBSCAN clustering and plotted the result in R Markdown.

This is my code currently:

dbscan.8=fpc::dbscan(current.matrix, eps=2, MinPts=log(33359)) #list generated

fviz_cluster(dbscan.8, data=current.matrix, stand=FALSE, ellipse=FALSE, 
             show.clust.cent=FALSE, geom="point", palette="jco", 
             ggtheme=theme_classic()) # Plot the clusters

How do I add a new column to the original data frame (current.matrix) that contains the cluster each row belongs to?

Thank you!


Solution

  • Using an example dataset:

    library(factoextra)
    library(fpc)
    
    dat = data.frame(scale(iris[,-5]))
    
    clus = fpc::dbscan(dat, eps=1.5)
    

    Plot the clustering:

    viz = fviz_cluster(clus, data=dat, 
    stand=FALSE, ellipse=FALSE, show.clust.cent=FALSE, geom="point",
    palette="jco")
    
    print(viz)
    


    The cluster information is already stored in the object returned by fviz_cluster:

    head(viz$data)
      name         x          y    coord cluster
    1    1 -2.257141 -0.4784238 5.323576       1
    2    2 -2.074013  0.6718827 4.752956       1
    3    3 -2.356335  0.3407664 5.668437       1
    4    4 -2.291707  0.5953999 5.606421       1
    5    5 -2.381863 -0.6446757 6.088877       1
    6    6 -2.068701 -1.4842053 6.482388       1
    

    The cluster labels are also stored in the dbscan object, as $cluster. So you can do:

    dat$cluster = viz$data$cluster
    

    or:

    dat$cluster = clus$cluster
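
As a minimal sketch of the second option, the labels can also be attached to the original, unscaled data, since clus$cluster has one entry per input row, in the same row order. The names tagged and is_noise below are only illustrative and not part of any package. fpc::dbscan codes noise points as 0, so it can help to flag those rows separately rather than treat 0 as a real cluster:

```r
library(fpc)

# Scaled numeric columns of iris, as in the example above
dat <- data.frame(scale(iris[, -5]))

# eps = 1.5 as above; MinPts is left at its default
clus <- fpc::dbscan(dat, eps = 1.5)

# Attach the labels to a copy of the original (unscaled) data.
# This relies on the label vector being in the same row order as the input.
tagged <- iris
tagged$cluster <- clus$cluster

# fpc::dbscan codes noise points as cluster 0; flag them explicitly
tagged$is_noise <- tagged$cluster == 0

# Number of rows per cluster label
print(table(tagged$cluster))
```

The same pattern should carry over to the question's data, for example by assigning dbscan.8$cluster to a new column of a data frame built from current.matrix, provided the rows have not been reordered or filtered between clustering and tagging.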