r k-means dbscan

View the names of clustered objects


I have a sample of data on cities, and I am clustering them on some parameters. I'm having trouble representing the clusters visually. I first used clusplot, but I cannot understand why the scales change: even when plotting only 2 components, with data ranging from -1 to 1, the axes range from -4 to 4 and from -2 to 2, as you can see in example 1.

[Image 1: clusplot output]

So I used hullplot from the dbscan package instead, but unlike clusplot, that plot does not show the names of the cities in its output (see 2). Could someone suggest how to add these names to the chart?

[Image 2: hullplot output]


Solution

  • I would use the ggplot2 and ggrepel packages for this. I borrowed the code that builds the convex hull from this question.

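    set.seed(175)
    library(ggplot2)
    library(ggrepel) # Or first install.packages("ggrepel")
    
    # Work on a copy and keep the row names as a column to use as labels
    cars <- mtcars
    cars$name <- row.names(mtcars)
    
    # Make the clusters (computed on the original numeric columns only,
    # so the block can be re-run without the cluster column getting in the way)
    cars$cluster <- as.factor(kmeans(mtcars, 3)$cluster)
    
    # Get the convex hull for the axes you want to plot (needs plyr installed)
    hull_df <- plyr::ddply(cars, "cluster", function(dta) {
        hull <- chull(dta$mpg, dta$disp)
        dta[c(hull, hull[1]), ]  # repeat the first point to close the polygon
    })
    
    ggplot(cars, aes(mpg, disp, color = cluster, fill = cluster)) + 
      geom_point() + 
      geom_polygon(data = hull_df, alpha = 0.5) + 
      geom_text_repel(aes(label = name), show.legend = FALSE)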
    

    Result: [Image: scatter plot of mpg against disp, with a shaded convex hull per cluster and each point labelled with its row name]
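
If you would rather stay with hullplot, labels can probably be added with base graphics, since hullplot appears to draw with the base plotting system. The sketch below is only one way to do it and rests on some assumptions: mtcars stands in for the city data, the four columns and the choice of three clusters are arbitrary, and the projection is done explicitly with prcomp so that the label coordinates match what is plotted. It also prints the ranges before and after projection, which may explain the scale change in the question: according to its documentation, clusplot draws the points on principal component (or multidimensional scaling) axes, so the axes show component scores rather than the original variables.

```r
library(dbscan)  # assumed to be installed; provides hullplot()

set.seed(175)

# Stand-in for the city data: a few scaled numeric columns, row names as labels
x <- scale(mtcars[, c("mpg", "disp", "hp", "wt")])
km <- kmeans(x, centers = 3)

# Project onto the first two principal components ourselves,
# so the coordinates used for the labels are the ones being plotted
scores <- prcomp(x)$x[, 1:2]

# Component scores are not on the scale of the input variables
range(x)
apply(scores, 2, range)

# Draw the hulls, then add the names on top with base graphics
hullplot(scores, km$cluster, main = "Clusters on PC1 / PC2")
text(scores, labels = rownames(scores), pos = 3, cex = 0.6)
```

With many points the labels will overlap, because text does not repel them the way geom_text_repel does; a smaller cex or labelling only a subset of rows may help.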