
Calling update within a lapply within a function, why isn't it working?


This is a follow-up to the question "Error in calling `lm` in a `lapply` with `weights` argument", although it may not be the same problem (it is at least related).

Here is a reproducible example:

dd <- data.frame(y = rnorm(100),
                 x1 = rnorm(100),
                 x2 = rnorm(100),
                 x3 = rnorm(100),
                 x4 = rnorm(100),
                 wg = runif(100,1,100))

ls.form <- list(
  formula(y~x1+x2),
  formula(y~x3+x4),
  formula(y~x1|x2|x3),
  formula(y~x1+x2+x3+x4)
)

I have a function that takes four arguments: a subsample (`samp`), the data.frame to use (`dat`), a list of formulas to try (`forms`), and the column name to use for the weights argument (`wgt`).

f1 <- function(samp, dat, forms, wgt){
  baselm <- lm(y~x1, data = dat[samp,], weights = dat[samp,wgt])
  lapply(forms, update, object = baselm)
}

If I call the function, I get an error:

f1(1:66, dat = dd, forms = ls.form, wgt = "wg")
 Error in is.data.frame(data) : object 'dat' not found 

I don't really understand why it can't find the `dat` object; it should be part of the function environment. The problem is in the `update` part of the code: if you remove that line from the function, the code works.

In the end, this function will be called with `lapply`:

lapply(list(1:66, 33:99), f1, dat=dd, forms = ls.form, wgt="wg")
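
One way to see what `update` is trying to re-run is to ask it for the updated call without evaluating it, via its `evaluate = FALSE` argument. The following is only a small diagnostic sketch: the helper `make_fit`, the names `fit` and `cl`, the reduced data frame and the seed are illustrative assumptions, not part of the code above.

```r
# Diagnostic sketch (hypothetical names): inspect the call stored in the model
set.seed(42)
dd <- data.frame(y  = rnorm(100),
                 x1 = rnorm(100),
                 x3 = rnorm(100),
                 wg = runif(100, 1, 100))

# Fit the base model inside a function, as f1 does
make_fit <- function(samp, dat, wgt) {
  lm(y ~ x1, data = dat[samp, ], weights = dat[samp, wgt])
}
fit <- make_fit(1:66, dat = dd, wgt = "wg")

# evaluate = FALSE returns the updated call instead of running it
cl <- update(fit, y ~ x3, evaluate = FALSE)
print(cl)

# The data and weights arguments are stored as unevaluated expressions
# that refer to dat, samp and wgt, which only exist inside make_fit
all.vars(cl$data)
all.vars(cl$weights)
```

Because the stored call refers to `dat`, `samp` and `wgt` by name, re-evaluating it somewhere those names are not visible would be expected to fail with an "object not found" error like the one shown above.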

Solution

  • I think your problems are due to the scoping rules used by lm, which are, quite frankly, a pain in the r-squared.

    One option is to use do.call to get it to work. Because do.call evaluates the arguments before calling lm, the stored call contains the data itself, so you get some ugly output when it deparses those inputs to give the call used by the standard print method.

    f1 <- function(samp, dat, forms, wgt){
      baselm <- do.call(lm,list(formula=y~x1, data = dat[samp,], weights = dat[samp,wgt]))
      lapply(forms, update, object = baselm)
    }
    

    A better way is to use an eval(substitute(...)) construct, which gives the output you originally expected:

    f2 <- function(samp, dat, forms, wgt){
      baselm <- eval(substitute(lm(y~x1, data = dat[samp,], weights = dat[samp,wgt])))
      lapply(forms, update, object = baselm)
    }
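
As a further option, offered only as a hedged sketch and not as part of the answer above: `update` re-evaluates the stored call in the frame it was called from, so wrapping it in an anonymous function defined inside the main function lets `dat`, `samp` and `wgt` be found through lexical scoping. The name `f3`, the shortened formula list and the seed are assumptions made for illustration.

```r
# Sketch of an alternative: call update() from a closure created inside f3,
# so the stored call lm(..., data = dat[samp, ], ...) can still see dat/samp/wgt
set.seed(1)
dd <- data.frame(y  = rnorm(100),
                 x1 = rnorm(100),
                 x2 = rnorm(100),
                 x3 = rnorm(100),
                 x4 = rnorm(100),
                 wg = runif(100, 1, 100))

# Shortened, illustrative formula list
forms <- list(y ~ x1 + x2,
              y ~ x3 + x4)

f3 <- function(samp, dat, forms, wgt) {
  baselm <- lm(y ~ x1, data = dat[samp, ], weights = dat[samp, wgt])
  # The anonymous function's enclosing environment is f3's frame
  lapply(forms, function(fm) update(baselm, fm))
}

# Single subsample
fits <- f3(1:66, dat = dd, forms = forms, wgt = "wg")

# Several subsamples, as in the question
res <- lapply(list(1:66, 33:99), f3, dat = dd, forms = forms, wgt = "wg")
```

The printed call of each fit keeps the symbolic form (`dat[samp, ]`) rather than the deparsed data, at the cost of the fitted objects only being updatable from a scope where those names are visible.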