Search code examples
rregressionvegan

Using envfit (vegan) to calculate species scores


I am running an NMDS and have a few questions regarding the envfit() function in the package. I have read the documentation for this function and numerous posts on SO and others about vegan, envfit(), and species scores in general.

  • I have seen both envfit() and wascore() used to calculate species scores for ordination techniques. By default, metaMDS() uses wascore(). This uses weighted averaging, which I understand. I am having a harder time understanding envfit(). Do envfit() and wascore( yield the same results? Is wascore() preferable given that it is the default? I realize that in some situations, wascore() might not be an option (ie. negative values), as mentioned in this post. How to get 'species score' for ordination with metaMDS()?

  • Given that envfit() and wascore() both seem to be used for species scores, they should yield similar results, right? I am hoping that we could do a proof of this here... The following shows species scores determined using metaMDS() using the default wascore():

data(varespec)
ord <- metaMDS(varespec)
species.scores <- as.data.frame(scores(ord, "species"))
species.scores

wascore() makes sense to me, it uses weighted averaging. There is a good explanation of weighted averaging for species scores in Analysis of Ecological Data by McCune and Grace (2002) p. 150.

Could somebody help me breakdown envfit?

species.envfit <- envfit(ord, varespec, choices = c(1,2), permutations = 999)
species.scores.envfit <- as.data.frame(scores(species.envfit, display = "vectors"))
species.scores.envfit

"The values that you see in the table are the standardised coefficients from the linear regression used to project the vectors into the ordination. These are directions for arrows of unit length." - comment from Plotted envfit vectors not matching NMDS scores

^Could somebody please show me what linear model is being run here and what standardized value is being extracted?

species.scores
species.scores.envfit

These values are very different from each other. What am I missing here?

This is my first SO post, please have mercy. I would have asked a question on some of the other relevant threads, but I am the dregs of SO and don't even have the reputation to comment.

Thanks!


Solution

  • Q: Do wascores() and envfit() give the same result?

    No they do not give the same result as these are doing two quite different things. In this answer I have explained how envfit() works. wascores() takes the coordinates of the points in the nmds space and computes the mean on each dimension, weighting observations by the abundance of the species at each point. Hence the species score returned by wascores() is a weighted centroid in the NMDS space for each species, where the weights are the abundances of the species. envfit() fits vectors that point in the direction of increasing abundance. This implies a plane over the NMDS ordination where abundance increase linearly from any point on the plane as you move parallel to the arrow, whereas wascores() are best thought of as optima, where the abundance declines as you move away from the weighted centroid, although I think this analogy is looser than say with a CA ordination.

    The issue about being optimal or not, is an issue if you passed in standardised data; as the answer you linked to shows, this would imply negative weights which doesn't work. Typically one doesn't standardise species abundances — there are transformations that we apply like converting to proportions, square root or log transformations, normalizing the data to the interval 0-1 — but these wouldn't give you negative abundances so you;re less likely to run into that issue.

    envfit() in an NMDS is not necessarily a good thing as we wouldn't expect abundances to vary linearly over the ordination space. The wascores() are better as they imply non-linear abundances, but they are a little hackish in NMDS. ordisurf() is a better option in general as it adds a GAM (smooth) surface instead of the plane implied by the vectors, but you can't show more than one or a few surfaces on the ordination, whereas you can add as many species WA scores or arrows as you want.

    The basic issue here is the assumption that envfit() and wascores() should give the same results. There is no reason to assume that as these are fundamentally different approaches to computing "species scores" for NMDS and each comes with it's own assumptions and advantages and disadvantages.