
Function works (boot.stepAIC) but throws an error inside another function - environment issue?


I noticed some strange behavior in my R code today. I tried the package {bootStepAIC}, which provides a function, boot.stepAIC(), that bootstraps the results of stepwise regression by AIC. I do not think the statistical background is the problem here (I hope not).
I can use the function at the top level of R. This is my example code.

require(MASS)
require(bootStepAIC) # the package is named bootStepAIC; it provides boot.stepAIC()

n<-100
x<-rnorm(n); y<-rnorm(n,sd=2); z<-rnorm(n,sd=3); res<-x+y+z+rnorm(n,sd=0.1)
dat.test<-as.data.frame(cbind(x,y,z,res))
form.1<-as.formula(res~x+y+z)
boot.stepAIC(lm(form.1, dat.test),dat.test) # should be OK - works for me

However, I wanted to wrap this in my own function, to which I pass the data and the formula. When I do, I get an error from within boot.stepAIC():

the model fit failed in 100 bootstrap samples
Error in strsplit(nam.vars, ":") : non-character argument

# custom function
fun.boot.lm.stepAIC<-function(dat,form) {
  if(!inherits(form, "formula")) stop("No formula given")
  fit.lm<-lm(formula=form,data=dat)
  return(boot.stepAIC(object=fit.lm,data=dat))
}
fun.boot.lm.stepAIC(dat=dat.test,form=form.1)
# results in an error 

So where is the mistake? I suppose it has something to do with the local and global environments, doesn't it?


Solution

  • The approach from "anova test fails on lme fits created with pasted formula", building the model with do.call, provides the answer.

    boot.stepAIC doesn't have access to form when it runs inside a function. The problem can be reproduced in the global environment: the call stored in the lm fit refers to the formula by its name, form.1, so removing form.1 makes boot.stepAIC fail.

    > form.1<-as.formula(res~x+y+z)
    > mm <- lm(form.1, dat.test)
    > mm$call
    lm(formula = form.1, data = dat.test)
    > rm(form.1)
    > boot.stepAIC(mm,dat.test)
    # same error as OP
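
    The same mechanism can be seen without bootStepAIC. The following is only a base-R sketch: fit_in_function and dat.demo are hypothetical names, and it assumes (as the error suggests, but without checking the package source) that boot.stepAIC refits the model from the call stored in the lm object.

```r
# Base-R sketch of the scoping issue; fit_in_function and dat.demo are
# hypothetical names, and bootStepAIC is not needed to run it.
set.seed(1)
dat.demo <- data.frame(x = rnorm(20), y = rnorm(20))
dat.demo$res <- dat.demo$x + dat.demo$y + rnorm(20, sd = 0.1)

fit_in_function <- function(dat, form) lm(formula = form, data = dat)
fit <- fit_in_function(dat.demo, res ~ x + y)

# The stored call holds the local argument names, not their values.
print(fit$call)
formula_is_symbol <- is.name(fit$call$formula)

# Re-evaluating that call outside the function cannot find 'form'.
refit <- try(eval(fit$call, envir = globalenv()), silent = TRUE)
refit_failed <- inherits(refit, "try-error")
```

    Any code that later re-evaluates fit$call from outside the wrapper runs into the same lookup failure, because form and dat only exist while the wrapper is executing.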
    

    Using do.call does work, because the formula itself, rather than its name, ends up in the stored call. Here I use as.name as well; otherwise the mm object carries around the entire dataset instead of just its name.

    > form.1<-as.formula(res~x+y+z)
    > mm <- do.call("lm", list(form.1, data=as.name("dat.test")))
    > mm$call
    lm(formula = res ~ x + y + z, data = dat.test)
    > rm(form.1)
    > boot.stepAIC(mm,dat.test)
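
    To make the difference between passing the name and passing the data itself concrete, here is a small base-R sketch. The object names (dat.demo, f, m_sym, m_val) are hypothetical, and update() stands in for any function that refits a model from its stored call.

```r
# Compare the calls stored by do.call with and without as.name().
set.seed(1)
dat.demo <- data.frame(x = rnorm(20), y = rnorm(20))
dat.demo$res <- dat.demo$x + dat.demo$y + rnorm(20, sd = 0.1)
f <- res ~ x + y

m_sym <- do.call("lm", list(f, data = as.name("dat.demo")))
m_val <- do.call("lm", list(f, data = dat.demo))

# The formula is inlined in both calls; only m_sym keeps the data as a name.
print(m_sym$call)
data_is_symbol   <- is.name(m_sym$call$data)
data_is_embedded <- is.data.frame(m_val$call$data)

# The fit no longer depends on the variable that held the formula.
rm(f)
refit <- update(m_sym, . ~ . - y)
```

    Printing m_val$call would dump the whole data frame, which is why the as.name variant gives tidier output.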
    

    To apply this to the original problem, I'd do this (note that dat is now passed as the name of the data frame, as a string):

    fun.boot.lm.stepAIC<-function(dat,form) {
      if(!inherits(form, "formula")) stop("No formula given")
      mm <- do.call("lm", list(form, data=as.name(dat)))
      do.call("boot.stepAIC", list(mm,data=as.name(dat)))
    }    
    form.1<-as.formula(res~x+y+z)
    fun.boot.lm.stepAIC(dat="dat.test",form=form.1)
    

    This works too, but the entire data set is embedded in the returned object and is therefore also printed to the console.

    fun.boot.lm.stepAIC<-function(dat,form) {
      if(!inherits(form, "formula")) stop("No formula given")
      mm <- do.call("lm", list(form, data=dat))
      boot.stepAIC(mm,data=dat)
    }    
    form.1<-as.formula(res~x+y+z)
    fun.boot.lm.stepAIC(dat=dat.test,form=form.1)