r plot ggplot2 vegan

How to get ordispider-like clusters in ggplot with NMDS?


I have just successfully managed to plot an ordisurf model on top of my non-metric multidimensional scaling (NMDS) plot. The code came from this site: https://oliviarata.wordpress.com/2014/07/17/ordinations-in-ggplot2-v2-ordisurf/

However, my problem is that I am having a hard time figuring out how to plot cluster graphics using ggplot. I've looked around, and the closest I came was this: R - add centroids to scatter plot

There, the answer involved creating centroids and extending lines from them to the points, but this was not done with an NMDS object, so I'm still puzzled.

I used vegan to run my NMDS and ggplot2 for plotting. I would add my data, but it's composed of two very large community and environment datasets. The NMDS and the subsequent ordisurf function require the full data to run.


Solution

  • Here is one way to do this, which should make its way into my ggvegan package at some point.

    library('vegan')
    library('ggplot2')
    

    For this example I'm going to use the Dutch dune meadow data set that ships with vegan

    data(dune, dune.env)
    

    and I'll use the Management variable in dune.env as my cluster membership vector. Notice it is coded as a factor; you should ensure that whatever cluster membership vector you use is coded likewise.
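
    If your cluster membership comes from a clustering routine rather than a ready-made factor, it will often be an integer vector, which ggplot2 would map to a continuous colour scale. The sketch below assumes hierarchical clustering with hclust() and cutree() on made-up toy data (the question does not say how the clusters were obtained) and shows the conversion to a factor:

```r
# Toy data: 6 samples in two obvious groups (a stand-in for real data;
# the values and the choice of hclust() are assumptions for illustration)
x <- rbind(c(1, 1), c(1, 2), c(2, 1),
           c(10, 10), c(10, 11), c(11, 10))

hc   <- hclust(dist(x), method = 'average')  # hierarchical clustering
memb <- cutree(hc, k = 2)                    # integer vector of cluster ids
memb <- factor(memb)                         # code the membership as a factor

is.factor(memb)
table(memb)                                  # samples per cluster
```

    The resulting memb can then take the place of dune.env$Management in the steps that follow.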

    First the example ordination

    ord <- metaMDS(dune)
    

    Next, extract the NMDS scores

    scrs <- scores(ord, display = 'sites')
    

    To facilitate computing centroids, I add Management as a variable to the data frame of scores

    scrs <- cbind(as.data.frame(scrs), Management = dune.env$Management)
    

    Now we compute the group centroids, which are the mean coordinate on each axis, groupwise:

    cent <- aggregate(cbind(NMDS1, NMDS2) ~ Management, data = scrs, FUN = mean)
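
    To see what the aggregate() call does, here is a small sketch on a made-up data frame (scrs_toy and its values are invented for illustration, not taken from the dune data). Each group's centroid is just the mean of its NMDS1 and NMDS2 values:

```r
# Toy scores: two groups of two samples each (invented values)
scrs_toy <- data.frame(NMDS1 = c(0, 2, 10, 14),
                       NMDS2 = c(1, 3, -2, -4),
                       Management = factor(c('A', 'A', 'B', 'B')))

# Groupwise mean on each axis, one row per level of Management
cent_toy <- aggregate(cbind(NMDS1, NMDS2) ~ Management,
                      data = scrs_toy, FUN = mean)
cent_toy
# Group A centroid is (1, 2); group B centroid is (12, -3)
```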
    

    To draw the spider, we need to use geom_segment(), which requires coordinates to draw the segment from and to. Our "to" coordinates, the xend and yend aesthetics, will be the centroids, so we need to replicate the group centroid for each observation in the group. We do this by joining the centroids onto the scores via merge():

    segs <- merge(scrs, setNames(cent, c('Management','oNMDS1','oNMDS2')),
                  by = 'Management', sort = FALSE)
    

    Notice that I rename the columns in cent so these don't get confused with the columns of the same names in scrs; we want these centroid variables to have different names.
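
    A small sketch of why the renaming helps, again on invented toy data (scrs_toy and cent_toy are illustrative names only). Without renaming, merge() keeps both sets of columns but tells them apart with .x and .y suffixes, which are easy to mix up in the aes() call:

```r
# Toy scores and their group centroids (invented values)
scrs_toy <- data.frame(NMDS1 = c(0, 2, 10, 14),
                       NMDS2 = c(1, 3, -2, -4),
                       Management = factor(c('A', 'A', 'B', 'B')))
cent_toy <- aggregate(cbind(NMDS1, NMDS2) ~ Management,
                      data = scrs_toy, FUN = mean)

# Without renaming: clashing columns get the default .x / .y suffixes
clash <- merge(scrs_toy, cent_toy, by = 'Management', sort = FALSE)
names(clash)

# With renaming: the centroid columns carry explicit names
segs_toy <- merge(scrs_toy,
                  setNames(cent_toy, c('Management', 'oNMDS1', 'oNMDS2')),
                  by = 'Management', sort = FALSE)
names(segs_toy)

nrow(segs_toy) == nrow(scrs_toy)  # one row per sample, centroid repeated
```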

    Now we can plot

    ggplot(scrs, aes(x = NMDS1, y = NMDS2, colour = Management)) +
      geom_segment(data = segs,
                   mapping = aes(xend = oNMDS1, yend = oNMDS2)) + # spiders
      geom_point(data = cent, size = 5) +                         # centroids
      geom_point() +                                              # sample scores
      coord_fixed()                                               # same axis scaling
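
    To reuse these steps on your own ordination, the centroid and segment computations could be wrapped in a small helper. This is only a sketch: spider_data() and its column names (x, y, cx, cy, group) are hypothetical and not part of vegan or ggvegan, and the toy matrix stands in for the output of scores(ord, display = 'sites').

```r
# Hypothetical helper (not part of vegan or ggvegan): bundles the
# centroid and segment steps for any two-column matrix of scores.
spider_data <- function(scores, group) {
  stopifnot(nrow(scores) == length(group))
  group <- as.factor(group)                  # ensure a factor
  df <- data.frame(x = scores[, 1], y = scores[, 2], group = group)
  # Group centroids: mean coordinate on each axis
  cent <- aggregate(cbind(x, y) ~ group, data = df, FUN = mean)
  names(cent) <- c('group', 'cx', 'cy')      # avoid name clashes in merge()
  # Repeat each group's centroid for every sample in that group
  segs <- merge(df, cent, by = 'group', sort = FALSE)
  list(points = df, centroids = cent, segments = segs)
}

# Toy scores standing in for scores(ord, display = 'sites')
toy <- cbind(NMDS1 = c(0, 2, 10, 14), NMDS2 = c(1, 3, -2, -4))
sp  <- spider_data(toy, c('A', 'A', 'B', 'B'))
sp$centroids
```

    The segments element can then be passed to geom_segment() with x, y, xend = cx and yend = cy, in the same way segs is used in the plot above.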
    

    This produces a plot in which each sample point is joined by a segment to its group centroid, coloured by Management.