Tags: r, dplyr, purrr, nse

Variable column names in rowwise dplyr


There are a number of similar questions on SO, but those are typically about applying a simple sum or mean, or the situation can be largely simplified for some other reason. Here I want to copy the values of certain columns under keys within a list-of-lists column. Here is a minimal example:

Setup

library(dplyr)
library(purrr)
library(rlang)
library(tibble)
library(magrittr)

keys <- c('a', 'b')

tbl <-
    tibble(
        x = list(
            list(f = 'foo', g = 'bar'),
            list(f = 'bar', g = 'foo')
        ),
        a = list(
            list(i = 1, j = 2),
            list(i = 4, j = 7)
        ),
        b = list(
            list(i = 5, j = 3),
            list(i = 2, j = 9)
        )
    )

This works and does what I want, i.e. in the first element of x, the keys a and b hold the first values of columns a and b, respectively:

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(x %>% inset2('a', a) %>% inset2('b', b))
    )

However, with 12 keys this would mean 12 inset2 calls. It would be nice to handle them together or loop through them. I attempted this with purrr::reduce, but I couldn't find a way to access the source columns within reduce:

Iterating the keys and trying to use them as a character index on .data:

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(
            reduce(
                keys,
                function(lst, key) {
                    inset2(lst, key, .data[[key]])
                },
                .init = x
            )
        )
    )

Error: object 'key' not found
7: quos(..., .ignore_empty = "all")
6: dplyr_quosures(...)
5: force(dots)
4: mutate_cols(.data, dplyr_quosures(...), by)
3: mutate.data.frame(., x = list(reduce(keys, function(lst, key) {
       inset2(lst, key, .data[[key]])
   }, .init = x)))
2: mutate(., x = list(reduce(keys, function(lst, key) {
       inset2(lst, key, .data[[key]])
   }, .init = x)))
1: tbl %>% rowwise %>% mutate(x = list(reduce(keys, function(lst,
       key) {
       inset2(lst, key, .data[[key]])
   }, .init = x)))

The above error arises only from the .data[[key]] lookup; key itself, as expected, exists as a string within the function. I also tried to convert key to a symbol:

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(
            reduce(
                keys,
                function(lst, key) {
                    inset2(lst, key, !!sym(key))
                },
                .init = x
            )
        )
    )

Error: object 'key' not found
9: is_symbol(x)
8: sym(key)
7: quos(..., .ignore_empty = "all")
6: dplyr_quosures(...)
5: force(dots)
4: mutate_cols(.data, dplyr_quosures(...), by)
3: mutate.data.frame(., x = list(reduce(keys, function(lst, key) {
       inset2(lst, key, !!sym(key))
   }, .init = x)))
2: mutate(., x = list(reduce(keys, function(lst, key) {
       inset2(lst, key, !!sym(key))
   }, .init = x)))
1: tbl %>% rowwise %>% mutate(x = list(reduce(keys, function(lst,
       key) {
       inset2(lst, key, !!sym(key))
   }, .init = x)))
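
One reading of the traceback above, offered as a sketch rather than a full diagnosis: `!!` is processed when the expression is captured (the `quos()` frame), before the anonymous function is ever called, so `sym(key)` is evaluated in the calling environment, where no `key` exists yet. The snippet below reproduces this with plain rlang; the name `hypothetical_key` is made up for illustration.

```r
library(rlang)

# `!!` is evaluated at capture time, so the function argument is not visible yet.
msg <- tryCatch(
  {
    quo(function(hypothetical_key) !!sym(hypothetical_key))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Once the variable exists in the calling environment, unquoting succeeds
# and the captured expression is the symbol `a`.
hypothetical_key <- "a"
q <- quo(!!sym(hypothetical_key))
print(quo_get_expr(q))
```

This is why unquoting a loop variable inside a function body does not work here: the unquote cannot wait for the function to run.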

This version runs without error, but assigns the actual symbols (a, b, ...) to the list instead of their values from the row:

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(
            reduce(
                keys,
                function(lst, key) {
                    inset2(lst, key, sym(key))
                },
                .init = x
            )
        )
    )

tbl2$x[[1]]$a
a  # this `a` is a symbol
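
For reference, a small sketch of the difference between a symbol and its value, using only rlang: `sym()` builds a symbol and nothing more; the value appears only when the symbol is evaluated against some data. The list passed as `data` below is made up for illustration.

```r
library(rlang)

s <- sym("a")
print(is_symbol(s))  # the object is a symbol, not a value

# Evaluating the symbol against a data mask yields the value stored under `a`.
val <- eval_tidy(s, data = list(a = 10))
print(val)
```

In the attempt above nothing evaluates the symbol, so the symbol itself is what ends up stored in the list.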

Then I tried to resolve the keys first and pass the resulting values to the function, though I'm not sure what val actually contains below. It runs without error, but every value in x ends up NULL. My guess is that !!!syms(keys) returns NULL, so reduce2 performs zero iterations and returns NULL.

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(
            reduce2(
                keys,
                !!!syms(keys),
                function(lst, key, val) {
                    inset2(lst, key, val)
                },
                .init = x
            )
        )
    )

Finally, I returned to the idea of using keys as a character vector and relying on .data. It is also probably more efficient to do the whole operation in one go instead of moving the elements key by key, so I tried to extract all elements and move them with utils::modifyList:

tbl2 <-
    tbl %>%
    rowwise %>%
    mutate(
        x = list(modifyList(x, .data[keys]))
    )

Error in `mutate()`:
ℹ In argument: `x = list(modifyList(x, .data[keys]))`.
ℹ In row 1.
Caused by error in `.data[keys]`:
! `[` is not supported by the `.data` pronoun, use `[[` or $ instead.
Run `rlang::last_trace()` to see where the error occurred.
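
A possible variant of the same modifyList idea, as a sketch and not necessarily the solution the author refers to: since `.data` does not support `[`, the columns can be selected with `pick(all_of(keys))` instead. This assumes dplyr >= 1.1.0 (where `pick()` was introduced). Within a rowwise mutate, `pick()` returns a one-row tibble of list-columns, so each column is unwrapped with `[[1]]` before merging.

```r
library(dplyr)

keys <- c("a", "b")
tbl <- tibble(
  x = list(list(f = "foo", g = "bar"), list(f = "bar", g = "foo")),
  a = list(list(i = 1, j = 2), list(i = 4, j = 7)),
  b = list(list(i = 5, j = 3), list(i = 2, j = 9))
)

tbl2 <- tbl %>%
  rowwise() %>%
  mutate(
    # Turn the one-row selection into a named list of plain values,
    # then merge that list into x in a single step.
    x = list(modifyList(x, lapply(pick(all_of(keys)), function(col) col[[1]])))
  ) %>%
  ungroup()

str(tbl2$x[[1]])
```

Note that modifyList merges recursively: if x already had a list stored under one of the keys, the elements would be combined rather than replaced.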

At this point, I found an actual solution that I'll post as an answer. But I think this is an interesting example, and I'm wondering whether someone can come up with a trivial solution that I missed (everything above seems way too complex and ugly to me).


Solution

  • Here's another option relying on purrr::transpose():

    library(purrr)
    library(dplyr)
    
    tbl |>
      mutate(x = map2(x, transpose(pick(all_of(keys))), ~ c(.x, .y)))
    

    Check identical:

    identical(
      tbl %>%
        rowwise %>%
        mutate(
          x = list(x %>% inset2('a', a) %>% inset2('b', b))
        ) |>
        ungroup(),
      tbl |>
        mutate(x = map2(x, transpose(pick(all_of(keys))), ~ c(.x, .y)))
    )
    [1] TRUE
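
A further sketch that avoids both rowwise and tidy evaluation, not taken from the answer above: `purrr::pmap()` iterates over the rows of the selected columns and passes each column as a named argument, so the key columns arrive in `...` already named. Assigning with `$<-` rather than `mutate()` is a choice made here for simplicity.

```r
library(purrr)
library(tibble)

keys <- c("a", "b")
tbl <- tibble(
  x = list(list(f = "foo", g = "bar"), list(f = "bar", g = "foo")),
  a = list(list(i = 1, j = 2), list(i = 4, j = 7)),
  b = list(list(i = 5, j = 3), list(i = 2, j = 9))
)

res <- tbl
# For each row: x is the existing list, `...` holds the key columns by name.
res$x <- pmap(tbl[c("x", keys)], function(x, ...) c(x, list(...)))

str(res$x[[2]])
```

Because the column names are supplied as a character vector, adding more keys only requires extending `keys`.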