Search code examples
rpartitioningvegan

Nullmodel with presence absence data in vegan - R


I am trying to run a nullmodel analysis on a beta diversity partitioning task involving a hirarchy in the plots with presence absence data. I read that it is recomended to use the "tswap" method, however it runs only when I use other methods such as "r00". Could you help me finding out what the problem is here?

library(parallel)
library(vegan)

# Setting up data ----
# community
com <- matrix(rbinom(120, 1, 0.5),nrow=12)
# plot hirachy 
plt_hir <- data.frame(plot=c(1:nrow(com)),
                      treatment = c(rep("reg1_t1",nrow(com)/4),rep("reg1_t2",nrow(com)/4),rep("reg2_t1",nrow(com)/4),rep("reg2_t2",nrow(com)/4)),
                      region=c(rep("reg1",nrow(com)/2),rep("reg2",nrow(com)/2)),
                      country=rep("country",nrow(com)))
# to numeric
plt_hir <- data.frame(apply(plt_hir, 2, function(x) as.numeric(factor(x))))


# Multipart ----

# matrix
y = com
# level matrix
x = plt_hir
# number of null model computations
nsimul = 100
# null model to be computed
method = "tswap"
# get and set number of cores for parallel computation (do not use all of them otherwise computer will be slowed)
cors <- detectCores()-1


# NULL MODEL ----

# define null model
nm <- nullmodel(com, method)

## socket type cluster, works on all platforms
cl <- makeCluster(cors)
clusterEvalQ(cl, library(vegan))
clusterExport(cl, c("simulate", "x", "nm"))
set.seed(3)
smlist <- parLapply(cl, 1:nsimul, function(i)  simulate(nm, nsim = 1, thin = 1))
stopCluster(cl)
smlist <- smbind(smlist, MARGIN=3)

I understand that the error is due to failed sanity checks but I do not get why? The method should be adapted to presenence absence matrices isn't it? Thanks for your help!


Solution

  • If you call smbind with argument strict = TRUE (default), function checks that data are strictly correct which means that every combined element of the list has same start index, end index and equal thin. Obviously you fail, as each sequence is of length 1 and starts at different values. Setting strict = FALSE in smbind() will perform the same checks, and only warn loudly but combine the elements.

    You should not call trial swap (method = "tswap") like that: it is a sequential method producing chains where each null model matrix is derived from a previous one. Therefore you should not split your analysis by the number of simulations, but by the number of cores, and run a sequence of models within each core. After that the output from each core will have the same start, end and thin indices and smbind(..., strict = TRUE) will also work.

    Caveat 1: check how to set random number seed for each sequence (core). If you are not careful, the sequences may all start from the same seed and you get just copies of the chains from each core.

    Caveat 2: You should absolutely set thin > 1 and also use burnin in sequential methods, in particular in trial swap. Each step will change the input data only a bit – and with trial swap that bit may be that nothing changes (read its documentation) – and therefore you need to thin (only save, say, one of thousand steps), and discard the first steps, because they all start from the same input matrix and sequences are almost identical for a long time. You should remove so many steps that the initial states in chains are independent. I suggest you apply the MCMC analysis tools of the coda package if you use a sequential method.