r tidyverse purrr

Why doesn't map2() find my column name?


I'm using the map2() function from purrr (tidyverse) to select elements from a list column of vectors. It works on a mock example with generic column names:

library(tidyverse)

df <- tibble(
  x = list(1:3, 4:6),
  y = c(2, 3)
)

# This works:
df %>% mutate(z = unlist(map2(x, y, ~ .x[.y])))

But when I copy the same column under a different name, purrr doesn't find it:

df <- df %>% 
  mutate(vol = x)

# This doesn't work:
df %>% mutate(z = unlist(map2(vol, y, ~ .vol[.y])))

Error in mutate():
! Problem while computing z = unlist(map2(vol, y, ~.vol[.y])).
Caused by error in .f():
! object '.vol' not found


Solution

  • With the formula notation in map2(), the two arguments are always called .x and .y, regardless of the names of the objects you pass in. purrr never creates a .vol, which is why the lookup fails. So in this case replace .vol with .x:

    df  |> 
        mutate(z = unlist(map2(vol, y, ~ .x[.y]))) 
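
To see why only .x and .y exist, it can help to look at what purrr does with the formula. The following is a minimal sketch, assuming purrr is installed; the name f is just an illustrative variable. as_mapper() is the purrr function that converts a formula into an ordinary function, and the resulting function takes its first two arguments as .x and .y:

```r
library(purrr)

# Convert the formula into a regular function, as map2() does internally.
f <- as_mapper(~ .x[.y])

# The first argument is bound to .x and the second to .y,
# so this picks the third element of 4:6.
f(4:6, 3)

# Inspect the argument names of the generated function.
names(formals(f))
```

Nothing in that generated function is named after the column, so a symbol such as .vol has nothing to resolve to.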
    

    Note that the map2() documentation states that .f must be:

    A function, specified in one of the following ways:
    • A named function.
    • An anonymous function, e.g. \(x, y) x + y or function(x, y) x + y.
    • A formula, e.g. ~ .x + .y. You must use .x to refer to the current element of x and .y to refer to the current element of y.

    (Emphasis mine.)
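
As a rough illustration of the three forms listed in the documentation, here is a small sketch that applies each of them to the same data as plain vectors (outside a data frame). The helper name pick and the argument names v and i are made up for this example; the \(v, i) shorthand assumes R 4.1 or later.

```r
library(purrr)

vol <- list(1:3, 4:6)
y <- c(2, 3)

# 1. A named function (pick is a hypothetical helper).
pick <- function(v, i) v[i]
by_name <- map2(vol, y, pick)

# 2. An anonymous function with argument names of your choosing.
by_lambda <- map2(vol, y, \(v, i) v[i])

# 3. A formula, where the arguments must be called .x and .y.
by_formula <- map2(vol, y, ~ .x[.y])

# All three give the same list.
identical(by_name, by_lambda) && identical(by_lambda, by_formula)
```

Only the formula form fixes the argument names; the other two let you call them whatever you like.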

    The docs also state that the formula notation is "[o]nly recommended if you require backward compatibility with older versions of R". To refer to the element by a name of your choosing, e.g. vol, use an anonymous function instead (the \(vol, y) shorthand requires R 4.1 or later):

    df |>
        mutate(z = unlist(map2(
            vol, y, \(vol, y) vol[y]
        )))
    #   x             y vol           z
    #   <list>    <dbl> <list>    <int>
    # 1 <int [3]>     2 <int [3]>     2
    # 2 <int [3]>     3 <int [3]>     6
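
As an optional variation, not part of the fix itself, the unlist(map2(...)) pattern could be swapped for one of purrr's typed variants. The sketch below assumes that each selection yields exactly one integer, which holds for this data; map2_int() then returns an integer vector directly and raises an error if a result does not fit that shape. The name res is illustrative.

```r
library(dplyr)
library(purrr)

df <- tibble(
  vol = list(1:3, 4:6),
  y   = c(2, 3)
)

# map2_int() returns an integer vector, so no unlist() is needed.
res <- df |>
  mutate(z = map2_int(vol, y, \(vol, y) vol[y]))

res$z
```

The trade-off is strictness: if y ever selected zero or several elements, map2_int() would fail, whereas unlist() would silently produce a column of the wrong length.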