Iterating over listed data frames within a piped purrr anonymous function call


Using purrr::map and the magrittr pipe, I am trying to generate a new column whose values are a substring of an existing column's name.

I can illustrate what I'm trying to do with the following toy dataset:

library(tidyverse)
library(purrr)

test <- list(tibble(geoid_1970 = c(123, 456), 
                    name_1970 = c("here", "there"), 
                    pop_1970 = c(1, 2)),
             tibble(geoid_1980 = c(234, 567), 
                    name_1980 = c("here", "there"), 
                    pop_1970 = c(3, 4))
)

Within each listed data frame, I want a column equal to the relevant year. Without iterating, the code I have is:

data <- map(test, ~ .x %>% mutate(year = as.integer(str_sub(names(test[[1]][1]), -4))))

Of course, this returns a year of 1970 in both listed data frames, which I don't want. (I want 1970 in the first and 1980 in the second.)

In addition, it's not piped, and my attempt to pipe it throws an error:

data <- test %>% map(~ .x %>% mutate(year = as.integer(str_sub(names(.x[[1]][1]), -4))))
# > Error: Problem with `mutate()` input `year`.
# > x Input `year` can't be recycled to size 2.
# > ℹ Input `year` is `as.integer(str_sub(names(.x[[1]][1]), -4))`.
# > ℹ Input `year` must be size 2 or 1, not 0.

How can I iterate over each listed data frame using the pipe?


Solution

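  • Inside the piped lambda, .x is already a single data frame (the equivalent of test[[1]]), so .x[[1]] extracts its first column as a bare vector. That vector has no names, so names() returns NULL and year ends up with length 0, which is what the recycling error reports. Use single brackets, .x[1], to keep a one-column tibble whose name can be read: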

    test %>% map(~.x %>% mutate(year = as.integer(str_sub(names(.x[1]), -4))))
    
    [[1]]
    # A tibble: 2 x 4
      geoid_1970 name_1970 pop_1970  year
           <dbl> <chr>        <dbl> <int>
    1        123 here             1  1970
    2        456 there            2  1970
    
    [[2]]
    # A tibble: 2 x 4
      geoid_1980 name_1980 pop_1970  year
           <dbl> <chr>        <dbl> <int>
    1        234 here             3  1980
    2        567 there            4  1980
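
To see why the double-bracket version fails, the two subsetting forms can be compared directly on one of the listed data frames. The sketch below is illustrative rather than part of the original answer: it rebuilds the toy data from the question, shows what names() returns in each case, and then uses a variant (an assumption on my part, not the answerer's code) that reads the year with str_extract() and a four-digit pattern instead of str_sub().

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# Toy data copied from the question
test <- list(
  tibble(geoid_1970 = c(123, 456),
         name_1970 = c("here", "there"),
         pop_1970 = c(1, 2)),
  tibble(geoid_1980 = c(234, 567),
         name_1980 = c("here", "there"),
         pop_1970 = c(3, 4))
)

df <- test[[1]]

# `[[` drops to a bare numeric vector, which carries no names
double_bracket <- names(df[[1]][1])   # NULL

# `[` keeps a one-column tibble, so the column name survives
single_bracket <- names(df[1])        # "geoid_1970"

# Variant: take the trailing four digits of the first column name
data <- test %>%
  map(~ .x %>%
        mutate(year = as.integer(str_extract(names(.x)[1], "\\d{4}$"))))

# One year per listed data frame
years <- map_int(data, ~ .x$year[1])
```

Here names(.x)[1] and names(.x[1]) are interchangeable; the pattern-based extraction is only a little more defensive if a column name ever has a suffix longer or shorter than four characters.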