Search code examples
rpcavegan

Projecting subsequent years of species data onto PCA ordination


I am running a PCA on Hellinger-transformed species data for multiple sites in a single year (ex: 1995). I want to calculate site scores from those same sites in the next year (1996) by projecting data from 1996 onto the PCA ordination constructed from the first year's data (1995). This is in order to get a measure of distance moved in ordination space from one year to another. Is there an R package that allows me to do this?


Solution

  • As @Richard Telford mentions, you can use the predict method for class "rda" if you use rda() from the vegan package to do this PCA.

    Here's an example using the Dutch dune data set, assuming that the first half come from year 1 and the second half of the samples from year 2 (they don't).

    It's a little involved to do all the steps of generating the final plot, but the predictions are easy to do; the main thing is that you need to have the same columns in both year's data sets (the same species).

    library("vegan")
    data(dune)
    take <- seq_len(floor(nrow(dune) / 2))
    pca <- rda(dune[take, ], scale = TRUE)
    

    Set a default for scaling of scores and extract the observed site scores

    scl <- "sites"
    scrs <- scores(pca, display = "sites", choices = 1:2, scaling = scl)
    

    Predict from pca for the left out dune samples. Note you have to specify type = "wa" to get site scores.

    pred <- predict(pca, newdata = dune[-take, ], type = "sites",
                    scaling = scl)
    

    When we create the plot we need axis limits than span scores of both plots, in case:

    lims <- apply(rbind(scrs, pred[, 1:2]), 2L, range)
    

    (Note pred contains predictions for all axes, not just the first two; you can adjust this using the rank argument.)

    Now we can plot both sets of scores, year 1 as points and year 2 as arrows drawn from the year 1 to the year 2 prediction:

    plot(pca, display = "sites", type = "n", scaling = scl,
         xlim = lims[,1], ylim = lims[,2])
    arrows(scrs[,1], scrs[,2], pred[,1], pred[,2], col = "blue",
           length = 0.1)
    points(pca, display = "sites", scaling = scl, pch = 21, bg = "white")
    

    This produces:

    enter image description here