Below is a pmap() operation that requires my data to be in wide format. I perform a few simulations each day and capture the max value per simulation as post_max.
library(tidyverse)

POST_SIMS <- 2
CONDITIONS <- 3
DURATION <- 2

df <-
  tibble(
    day = rep(1:DURATION, each = CONDITIONS),
    condition = rep(LETTERS[1:CONDITIONS], times = DURATION)
  ) |>
  rowwise() |>
  mutate(post = list(rnorm(POST_SIMS, 0, 1))) |>
  ungroup()

df_wide <- df |>
  pivot_wider(
    id_cols = c(day),
    names_from = "condition",
    values_from = 'post'
  )

df_wide |>
  mutate(
    post_max =
      pmap(
        .l = list(A,B,C), # This works, but needs manual updating
        .f = pmax)
  ) |>
  unnest()
The problem is that I have to manually list the unique conditions when I reach pmap(list(A,B,C), pmax), and this is undesirable because my goal is to write a simulation function that can accommodate any number of conditions.
Is there a way to capture the unique conditions generated in df and supply them as an argument to pmap(), as I try and fail to do below?
my_conditions <- noquote(unique(df$condition))

df_wide |>
  mutate(
    post_max =
      pmap(
        .l = list(my_conditions), # How do I do this part?
        .f = pmax)
  ) |>
  unnest()
The list() supplied to the .l argument is baffling me a bit. This is obviously not a string. I write it as .l = list(A,B,C), which is usually convenient but obscures what pmap() is ingesting. I assume I am dealing with some kind of tidy evaluation, but the flexible length of this argument is different from my typical tidy eval applications, where I simply name my columns as quosures.
list(A,B,C) in this context just selects columns A, B & C from mutate()'s .data argument (df_wide); adding those to a list basically generates a tibble-like structure. Try replacing list(A,B,C) with pick(-day):
glimpse(df_wide)
#> Rows: 2
#> Columns: 4
#> $ day <int> 1, 2
#> $ A   <list> <-1.4857029, -0.2090127>, <-1.6142362, 0.2935161>
#> $ B   <list> <2.610475, -1.604595>, <-1.455556395, 0.003465559>
#> $ C   <list> <-0.06067370, 0.09182582>, <-0.5745877, -1.0695619>
df_wide |>
  mutate(
    post_max =
      pmap(
        .l = pick(-day),
        .f = pmax)
  ) |>
  unnest()
#> Warning: `cols` is now required when using `unnest()`.
#> ℹ Please use `cols = c(A, B, C, post_max)`.
#> # A tibble: 4 × 5
#>     day      A        B       C post_max
#>   <int>  <dbl>    <dbl>   <dbl>    <dbl>
#> 1     1 -1.49   2.61    -0.0607   2.61
#> 2     1 -0.209 -1.60     0.0918   0.0918
#> 3     2 -1.61  -1.46    -0.575   -0.575
#> 4     2  0.294  0.00347 -1.07     0.294
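If the condition names are held in a character vector, as attempted in the question, one option is to hand that vector to pick() through all_of() rather than excluding day. The following is a minimal sketch, assuming dplyr >= 1.1.0 (needed for pick()); the set.seed() call and the res name are additions for illustration only. Passing the same selection to unnest() via cols should also avoid the warning shown above.

```r
library(tidyverse)

set.seed(1)  # assumption: fixed seed, added only to make the sketch reproducible

df <- tibble(
  day = rep(1:2, each = 3),
  condition = rep(LETTERS[1:3], times = 2)
) |>
  rowwise() |>
  mutate(post = list(rnorm(2, 0, 1))) |>
  ungroup()

df_wide <- df |>
  pivot_wider(id_cols = day, names_from = condition, values_from = post)

# A plain character vector of column names; noquote() is not needed
my_conditions <- unique(df$condition)

res <- df_wide |>
  # all_of() turns the character vector into a column selection for pick()
  mutate(post_max = pmap(pick(all_of(my_conditions)), pmax)) |>
  # the same selection tells unnest() which list-columns to expand
  unnest(cols = c(all_of(my_conditions), post_max))

res
```

Because the selection is driven by my_conditions, any extra non-condition columns in df_wide would be left out of the pmax() call, which pick(-day) would not guarantee.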
rowwise() + max(c_across()) should deliver the same result, though I would guess it's a bit easier to follow:
df_wide |>
  unnest_longer(-day) |>
  rowwise() |>
  mutate(post_max = max(c_across(-day))) |>
  ungroup()
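Since the stated goal is a simulation function that handles any number of conditions, the pieces above could be wrapped roughly as follows. This is only a sketch: simulate_post_max() and its argument names are hypothetical, expand_grid() plus map() stand in for the rep() plus rowwise() setup from the question, and dplyr >= 1.1.0 is assumed for pick().

```r
library(tidyverse)

# Hypothetical helper (name and arguments are illustrative, not from the post):
# simulates `post_sims` draws per day and condition, then takes the
# element-wise maximum across all conditions.
simulate_post_max <- function(conditions, duration, post_sims) {
  sims <- expand_grid(day = seq_len(duration), condition = conditions) |>
    mutate(post = map(condition, \(x) rnorm(post_sims, 0, 1)))

  sims |>
    pivot_wider(id_cols = day, names_from = condition, values_from = post) |>
    # select the condition columns by name, however many there are
    mutate(post_max = pmap(pick(all_of(conditions)), pmax)) |>
    unnest(cols = c(all_of(conditions), post_max))
}

out <- simulate_post_max(LETTERS[1:5], duration = 3, post_sims = 4)
out
```

The result has one row per day and simulation, with one column per condition plus post_max, so nothing in the body needs updating when the number of conditions changes.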