I am following code from this R Bloggers link in order to run models on groups within my data using tidyr and purrr. However, I would like to use glmnet rather than just lm on my nested data. Unlike lm, glmnet/cv.glmnet takes a model.matrix as the x argument, and I need to abstract the formula fed to that model.matrix. That is what is holding me up.
So this works:
library(purrr)
library(tidyr)
library(dplyr)
library(glmnet)

mod_test <- mtcars %>%
  nest(-vs) %>%
  mutate(cv_mod = map(data, ~ cv.glmnet(
    x = model.matrix(data = ., .$mpg ~ .$cyl * .$hp)[,-1],
    y = .$mpg
  )))
mod_test
> mod_test
# A tibble: 2 x 3
     vs data               cv_mod
  <dbl> <list>             <list>
1     0 <tibble [18 x 10]> <S3: cv.glmnet>
2     1 <tibble [14 x 10]> <S3: cv.glmnet>
But when I try to create the formula for the model.matrix separately, it does not:
mod_form <- as.formula(".$mpg ~ .$cyl * .$hp")

mod_test2 <- mtcars %>%
  nest(-vs) %>%
  mutate(cv_mod = map(data, ~ cv.glmnet(
    x = model.matrix(data = ., mod_form)[,-1],
    y = .$mpg
  )))
Error in mutate_impl(.data, dots) : object '.' not found
First part: why do you get Error in mutate_impl(.data, dots) : object '.' not found? The following is my reasoning.
See the manual of as.formula:
Formulas created with as.formula will use the env argument for their environment.
When you create mod_form, the default in as.formula(object, env = parent.frame()) means its environment will be <environment: R_GlobalEnv>.
Next,
A formula object has an associated environment, and this environment (rather than the parent environment) is used by model.frame to evaluate variables that are not found in the supplied data argument.
So model.matrix will first look for variables like .$mpg in data. Apparently, the columns there are named mpg, not .$mpg, and data contains nothing called . either. It will then look for . in the environment associated with the formula, R_GlobalEnv. There is no object called . in the global environment, so the error is reported.
(Correct me if any part of this is wrong.)
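The lookup described above can be reproduced without purrr or glmnet. This is only a minimal sketch, assuming it is run at top level in a session where no object named . exists; res is a name I made up to hold the caught error message.

```r
# Formula built at top level: as.formula() attaches the calling environment,
# which is the global environment when this is run from the console or a script.
mod_form <- as.formula(".$mpg ~ .$cyl * .$hp")
print(environment(mod_form))

# model.matrix() evaluates `.$mpg` first in `data`, then in environment(mod_form).
# Neither place defines `.`, so the evaluation fails; tryCatch() keeps the message.
res <- tryCatch(
  model.matrix(mod_form, data = mtcars),
  error = function(e) conditionMessage(e)
)
print(res)
```

Inside the map() call in the question, the inline formula works only because it is created in the lambda's environment, where . is bound to the nested data frame.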
Second part, the solution. Try:
mod_form <- mpg ~ cyl * hp
# or
mod_form <- as.formula('mpg ~ cyl * hp')
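For completeness, here is how that formula might be plugged back into the pipeline from the question. Treat it as a sketch rather than the definitive way to do it: I am assuming tidyr 1.0 or later for the nest(data = -vs) spelling (the question's nest(-vs) is the older form), I use .x instead of . purely for readability, and the set.seed() call is my addition because cv.glmnet assigns folds at random.

```r
library(purrr)
library(tidyr)
library(dplyr)
library(glmnet)

# The formula uses bare column names, so model.matrix() finds them in `data`
# and never needs to consult the formula's environment.
mod_form <- mpg ~ cyl * hp

set.seed(1)  # assumption: fixed seed only to make the fold assignment repeatable

mod_test2 <- mtcars %>%
  nest(data = -vs) %>%                          # one row per value of vs
  mutate(cv_mod = map(data, ~ cv.glmnet(
    x = model.matrix(mod_form, data = .x)[, -1],  # drop the intercept column
    y = .x$mpg
  )))

mod_test2
```

With groups this small, cv.glmnet may warn about having few observations per fold. That is a property of the data, not of the formula fix.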