r regression lapply vegan rda

How to use lapply to loop through a regression?


In the following dataset, I want to compute the function `rda` every 72 rows, and do this 10 times. Following [this solution][1] that uses lapply, I tried the same with `rda`:

library(truncnorm) # provides rtruncnorm()
set.seed(111)
df <- data.frame(sp1 = rep(rtruncnorm(10, a=0, b=1, mean = 0.50, sd = 0.2), times = 10),
                  sp2 = rep(rtruncnorm(10, a=0, b=1, mean = 0.70, sd = 0.1), times = 10),
                  env1 = rep(rtruncnorm(10, a=0, b=1, mean = 0.45, sd = 0.6), times = 10),
                  env2 = rep(rtruncnorm(10, a=0, b=1, mean = 0.65, sd = 0.6), times = 10)
                  )


library(vegan)
spe.rda  <- rda(df[,c(1:2)] ~ ., data = df[,c(3:4)]) #works fine without lapply

#Trying to loop it over the 10 seeds
spe.rda  <- lapply(split(df[1:4], df$seed), rda(df[,c(1:2)] ~ ., data = df[,c(3:4)]))

Error in match.fun(FUN) : 'rda(df[, c(1:2)] ~ ., data = df[, c(3:4)])' is not a function, character or symbol

Similarly, I tried applying the same approach to the `ordiR2step` function:

output <- lapply(split(df[1:4], df$seed), 
ordiR2step(rda(df[,c(1:2)]~1, data=df[,c(3:4)]), scope= formula(spe.rda), direction= "forward", R2scope=TRUE, pstep=1000)) 
#Obviously it does not work as it's dependent on spe.rda

Solution

  • Your example does not have df$seed: let's generate a bogus seed first:

    ## continuing with your example before first call to lapply
    df$seed <- gl(10,10) # 10 groups, each 10 observations
    

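    Then call lapply with an unnamed inline function. lapply expects a function as its second argument, whereas your call passed the already evaluated result of rda(), which is what triggers the match.fun error: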

    lapply(split(df[1:4], df$seed), function(x) rda(x[,1:2] ~ ., data= x[,3:4]))
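
If you want to keep the fitted models and pull a number out of each, something along these lines may help. This is only a minimal sketch: the data frame `dat` is a hypothetical stand-in built with `runif` (not your `rtruncnorm` data), and the names `fits` and `adj_r2` are made up for illustration. It assumes `RsquareAdj` from vegan is the summary you are after.

```r
library(vegan)

set.seed(111)
# Hypothetical stand-in data: 100 rows, 10 groups of 10 rows each
dat <- data.frame(sp1  = runif(100),
                  sp2  = runif(100),
                  env1 = runif(100),
                  env2 = runif(100),
                  seed = gl(10, 10))

# Fit one RDA per seed; 'x' is the sub-data-frame of a single group
fits <- lapply(split(dat[1:4], dat$seed),
               function(x) rda(x[, 1:2] ~ ., data = x[, 3:4]))

# The list is named by seed level, so a single fit can be pulled out by name
first_fit <- fits[["1"]]

# Collect one value per fit, here the adjusted R2 of each model
adj_r2 <- sapply(fits, function(m) RsquareAdj(m)$adj.r.squared)
```

Because `split` names the pieces by the levels of `seed`, both `fits` and `adj_r2` carry those names, which makes it easy to match a result back to its group.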