I need to call the mlogit() R function from inside another function.
Here is a function for demonstration purposes:
#-------------------------
# DEMO FUNCTION
#-------------------------
# f = formula (string)
# fData = data.frame
# cVar = choice variable (string)
# optVar = alternative variable (string)
##########################
mlogitSum <- function(f, fData, cVar="choice", optVar="option"){
  library(mlogit)
  r2 <- mlogit(as.formula(f), shape = "long", data = fData, alt.var=optVar, choice = cVar)
  return(summary(r2))
}
Apparently there is an environment problem: variables passed as arguments are not found by mlogit() unless they are also defined globally.
This example doesn't work:
mydata <- read.csv(url("http://www.ats.ucla.edu/stat/r/dae/mlogit.csv"))
attach(mydata)
library(mlogit)
mydata$brand<-as.factor(mydata$brand)
mlData<-mlogit.data(mydata, varying=NULL, choice="brand", shape="wide")
myFormula <-"brand~1|female+age"
var1 <- "brand"
var2 <- "alt"
mlogitSum(myFormula, fData = mlData, var1, var2)
If the variables are instead assigned in the global environment under the same names as the function's arguments, it works:
mydata <- read.csv(url("http://www.ats.ucla.edu/stat/r/dae/mlogit.csv"))
attach(mydata)
library(mlogit)
mydata$brand<-as.factor(mydata$brand)
fData<-mlogit.data(mydata, varying=NULL, choice="brand", shape="wide")
myFormula <-"brand~1|female+age"
cVar <- "brand"
optVar <- "alt"
mlogitSum(myFormula, fData, cVar, optVar)
Alternatively, it works if I assign the variables globally from inside the function:
#-------------------------
# DEMO FUNCTION
#-------------------------
# f = formula (string)
# fData = data.frame
# cVar = choice variable (string)
# optVar = alternative variable (string)
##########################
mlogitSum_rev <- function(f, fData, cVar="choice", optVar="option"){
  fData<<-fData
  cVar<<-cVar
  optVar<<-optVar
  #return(head(lcmData))
  library(mlogit)
  # I need this to extract model.matrix(r2) later; otherwise it would be redundant
  r2 <- mlogit(as.formula(f), shape = "long", data = fData, alt.var=optVar, choice = cVar)
  return(summary(r2))
}
mydata <- read.csv(url("http://www.ats.ucla.edu/stat/r/dae/mlogit.csv"))
attach(mydata)
library(mlogit)
mydata$brand<-as.factor(mydata$brand)
mlData<-mlogit.data(mydata, varying=NULL, choice="brand", shape="wide")
myFormula <-"brand~1|female+age"
var1 <- "brand"
var2 <- "alt"
mlogitSum_rev(myFormula, mlData, var1, var2)
Any idea how to avoid assigning the variables globally?
tl;dr this appears to be a bug in mlogit, which you can fix yourself (see below) or ask the maintainer to fix.
Deep inside mlogit, the function tries to evaluate the data as follows:
nframe <- length(sys.calls()) ## line 11
...
data <- eval(mldata, sys.frame(which = nframe)) ## line 44
This is moderately sophisticated messing about with R's scoping structures: it picks the frame in which to evaluate mldata by counting the active calls rather than by asking for the caller's frame, so objects that exist only inside a calling function are not found. It will therefore fail if someone does something tricky (but perfectly reasonable!) like call mlogit from within a function.
I solved the problem (sort of!) by running fix(mlogit), which will dump you into an editor and allow you to modify the function. I changed line 44 to
data <- eval(mldata, parent.frame())
after which the code seemed to work.
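To see the difference between the two lines in isolation, here is a minimal base-R sketch. It does not use mlogit at all: eval_by_frame_count() and eval_in_parent() are hypothetical stand-ins that only copy the frame-selection pattern quoted above and the patched version, and the wrapper functions play the role of mlogitSum().

```r
# Hypothetical stand-ins for mlogit(): base R only, no packages needed.

# Mimics the quoted pattern: count the active calls, then evaluate in that frame.
eval_by_frame_count <- function(arg) {
  nframe <- length(sys.calls())
  expr <- substitute(arg)   # the unevaluated argument, e.g. the symbol local_value
  eval(expr, sys.frame(which = nframe))
}

# Same idea, but evaluating in the caller's frame, as in the patched line.
eval_in_parent <- function(arg) {
  expr <- substitute(arg)
  eval(expr, parent.frame())
}

# Wrappers whose variable exists only locally, like fData inside mlogitSum().
call_with_frame_count <- function() {
  local_value <- 42
  tryCatch(eval_by_frame_count(local_value),
           error = function(e) "not found")
}

call_with_parent <- function() {
  local_value <- 42
  eval_in_parent(local_value)
}

print(call_with_frame_count())  # "not found"
print(call_with_parent())       # 42
```

In this sketch the frame-counting version only succeeds when the name can also be resolved globally, which matches the behaviour described in the question.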
If this works for you, you can either (1) fix() mlogit every time you need to use it; (2) download a copy of the source (.tar.gz) package, modify it, and install it; or (3) [preferably!] contact the package maintainer, let them know about the issue, and ask them to release a patched version ...
PS depending on your general data analysis protocol, you may want to get out of the habit of using attach: see "Why is it not advisable to use attach() in R, and what should I use instead?"
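As a rough illustration of the alternatives to attach, here is a short sketch. It assumes the built-in mtcars data set as a stand-in for the CSV in the question, and the variable names are made up for the example. One reason to prefer these forms is that attach() puts a copy of the data frame on the search path, so later changes such as converting a column to a factor are not reflected in the attached copy.

```r
# Assumption: mtcars (shipped with R) stands in for the CSV used in the question.
cars <- mtcars
cars$cyl <- as.factor(cars$cyl)

# Refer to columns through the data frame itself ...
mean_mpg <- mean(cars$mpg)

# ... or evaluate an expression inside the data frame with with() ...
mean_mpg_with <- with(cars, mean(mpg))

# ... or hand the data frame to a modelling function via its data argument.
fit <- lm(mpg ~ wt, data = cars)
```

None of these touch the search path, so there is no second copy of the data to fall out of sync.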