How to flexibly supply a varying length argument to .l of pmap()


Below is a pmap() operation that requires my data to be in wide format. I perform a few simulations each day and capture the max value per simulation as post_max.

library(tidyverse)

POST_SIMS <- 2
CONDITIONS <- 3
DURATION <- 2

df <-
    tibble(
        day = rep(1:DURATION, each = CONDITIONS),
        condition = rep(LETTERS[1:CONDITIONS], times = DURATION)
    ) |>
    rowwise() |>
    mutate(post = list(rnorm(POST_SIMS, 0, 1))) |>
    ungroup()

df_wide <- df |> 
    pivot_wider(
        id_cols = c(day), 
        names_from = "condition",
        values_from = 'post'
    ) 

df_wide |> 
    mutate(
        post_max = 
            pmap(
                .l = list(A,B,C), # This works, but needs manual updating
                .f = pmax)
    ) |> 
    unnest(cols = -day) # name the columns to unnest; a bare unnest() warns

The problem is that I have to manually list the unique conditions when I reach pmap(list(A,B,C), pmax), which is undesirable because my goal is to write a simulation function that can accommodate any number of conditions.

Is there a way to capture the unique conditions generated in df and supply that as an argument to pmap() as I try and fail to do below?

my_conditions <- noquote(unique(df$condition)) 

df_wide |> 
    mutate(
        post_max = 
            pmap(
                .l = list(my_conditions), # How do I do this part? 
                .f = pmax)
    ) |> 
    unnest()

The list() supplied to the .l argument is baffling me a bit. Its contents are obviously not strings: I write .l = list(A,B,C), which is usually convenient but obscures what pmap() is ingesting. I assume I am dealing with some kind of tidy evaluation, but the variable length of this argument is different from my typical tidy eval applications, where I simply pass my columns as quosures.


Solution

  • In this context, list(A,B,C) just selects columns A, B and C from mutate()'s .data argument (df_wide); putting them in a list basically produces a tibble-like structure. Try replacing list(A,B,C) with pick(-day), which selects every column except day:

    glimpse(df_wide)
    #> Rows: 2
    #> Columns: 4
    #> $ day <int> 1, 2
    #> $ A   <list> <-1.4857029, -0.2090127>, <-1.6142362, 0.2935161>
    #> $ B   <list> <2.610475, -1.604595>, <-1.455556395, 0.003465559>
    #> $ C   <list> <-0.06067370, 0.09182582>, <-0.5745877, -1.0695619>
    
    df_wide |> 
      mutate(
        post_max = 
          pmap(
            .l = pick(-day),
            .f = pmax)
      ) |> 
      unnest(cols = -day)
    #> # A tibble: 4 × 5
    #>     day      A        B       C post_max
    #>   <int>  <dbl>    <dbl>   <dbl>    <dbl>
    #> 1     1 -1.49   2.61    -0.0607   2.61  
    #> 2     1 -0.209 -1.60     0.0918   0.0918
    #> 3     2 -1.61  -1.46    -0.575   -0.575 
    #> 4     2  0.294  0.00347 -1.07     0.294
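
If you would rather drive the selection from the condition names captured in df, as attempted in the question, one option is to pass them to pick() through all_of(). The following is a minimal sketch, not part of the original reprex: the set.seed() call and the map()-based construction of df are assumptions added only to keep it short and reproducible, and the column names manual and by_name are purely illustrative.

```r
library(tidyverse)

set.seed(1)  # assumption: seed added only so the sketch is reproducible

df <- tibble(
  day = rep(1:2, each = 3),
  condition = rep(LETTERS[1:3], times = 2)
) |>
  # one list element of 2 simulated values per row
  mutate(post = map(seq_len(n()), \(i) rnorm(2)))

df_wide <- df |>
  pivot_wider(id_cols = day, names_from = condition, values_from = post)

# Conditions captured as a plain character vector (no noquote() needed)
my_conditions <- unique(df$condition)

res <- df_wide |>
  mutate(
    manual  = pmap(list(A, B, C), pmax),                  # hard-coded columns
    by_name = pmap(pick(all_of(my_conditions)), pmax)     # columns chosen by name
  )

# Both columns should hold the same element-wise maxima
all.equal(res$manual, res$by_name)
```

all_of() tells tidyselect that my_conditions is an external character vector of column names, so the same call keeps working when the number of conditions changes.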
    

    rowwise() + max(c_across()) should deliver the same result, and I would guess it's a bit easier to follow:

    df_wide |> 
      unnest_longer(-day) |>
      rowwise() |>
      mutate(post_max = max(c_across(-day))) |>
      ungroup()
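
Since the stated goal is a simulation function that handles any number of conditions, the pick() approach could be wrapped up roughly as follows. This is only a sketch under stated assumptions: add_post_max() and its id_cols argument are hypothetical names, and wide4 is made-up toy data with four conditions rather than output from the simulation above.

```r
library(tidyverse)

# Hypothetical helper (not from the original answer): appends post_max, the
# element-wise maximum across every list-column except the id columns,
# then unnests all of the list-columns.
add_post_max <- function(data, id_cols = "day") {
  data |>
    mutate(post_max = pmap(pick(-all_of(id_cols)), pmax)) |>
    unnest(cols = -all_of(id_cols))
}

# Toy wide data with four conditions, to show nothing is hard-coded
wide4 <- tibble(
  day = 1:2,
  A = list(c(1, 5), c(0, 2)),
  B = list(c(3, 4), c(9, 1)),
  C = list(c(2, 6), c(4, 4)),
  D = list(c(7, 0), c(3, 8))
)

out <- add_post_max(wide4)
out$post_max  # 7 6 9 8
```

Selecting by exclusion (-all_of(id_cols)) means new condition columns are picked up automatically, at the cost of assuming every non-id column is a condition.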