I'm trying to learn more about the functionality of R's purrr using these exercises.
The task (exercise 8) is the following: split the mtcars dataset by cyl, fit the model qsec ~ hp for each group of observations by cyl, and predict on the full mtcars dataset with each model.
My current code is:
library(tidyverse)
mtcars %>%
split(mtcars$cyl) %>%
map(~ lm(.x$qsec ~ .x$hp)) %>%
map(~ predict(.x, newdata = list(mtcars)))
However, this only applies each model to its own group of mtcars, so the output is:
$`4`
1 2 3 4 5 6 7 8
18.98872 19.43308 18.96005 19.37574 19.57642 19.39008 18.93138 19.37574
9 10 11
19.01739 18.70203 18.75937
$`6`
1 2 3 4 5 6 7
18.51998 18.51998 18.51998 18.74090 17.94558 17.94558 15.64799
$`8`
1 2 3 4 5 6 7 8
17.37861 16.13783 17.28998 17.28998 17.28998 16.84684 16.66959 16.40371
9 10 11 12 13 14
17.82174 17.82174 16.13783 17.37861 15.80104 14.54254
The desired output, as I understand it, would be a list of predicted values with three list elements of 32 values each. How would I need to revise the code? Thank you.
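The issue is that you passed the vectors directly into the formula inside lm(), so predict() re-evaluates .x$hp from the group each model was fitted on instead of reading hp from newdata. Instead, refer to the columns by name, pass the dataset to the data= argument, and give predict() the mtcars data frame itself rather than list(mtcars):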
library(tidyverse)
pred <- mtcars %>%
split(mtcars$cyl) %>%
map(~ lm(qsec ~ hp, data = .x)) %>%
map(~ predict(.x, newdata = mtcars))
lapply(pred, length)
#> $`4`
#> [1] 32
#>
#> $`6`
#> [1] 32
#>
#> $`8`
#> [1] 32
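To see the difference between the two formula styles side by side, here is a minimal sketch. It assumes only base R: lapply() stands in for purrr::map() so that it runs without extra packages, and the names groups, models_ok, models_bad, pred_ok and pred_bad are made up for illustration.

```r
# Split mtcars into one data frame per cylinder count (4, 6, 8).
groups <- split(mtcars, mtcars$cyl)

# Columns referenced by name, with the data supplied separately:
# predict() can look up hp in whatever newdata it is given.
models_ok <- lapply(groups, function(d) lm(qsec ~ hp, data = d))
pred_ok <- lapply(models_ok, predict, newdata = mtcars)

# Vectors embedded in the formula, as in the question:
# predict() re-evaluates d$hp from the fitting group and ignores the rows
# of newdata (R warns about the row mismatch, suppressed here).
models_bad <- lapply(groups, function(d) lm(d$qsec ~ d$hp))
pred_bad <- suppressWarnings(lapply(models_bad, predict, newdata = mtcars))

lengths(pred_ok)   # 32 predictions per model
lengths(pred_bad)  # 11, 7 and 14: one per row of each group only
```

The second pair of calls reproduces the behaviour from the question: each model returns only as many values as its own group has rows.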