r, dplyr, purrr, tidyeval

How to update a data frame in a purrr loop?


Consider this simple example:

library(dplyr)
library(purrr)

mydata <- dplyr::tibble(value = c(1, 2, 3))  # data_frame() is deprecated in favour of tibble()
> mydata
# A tibble: 3 x 1
  value
  <dbl>
1    1.
2    2.
3    3.

I have a function that takes the data frame and a number as arguments, and I would like to update the data frame at each iteration, so that every call adds a column to the result of the previous one.

I have written the following, but it does not update the data frame:

myfunc <- function(df, numba){
  name_var <- paste('year_', quo_name(numba), sep ='')
  df <- df %>% mutate(!!name_var := 1)
  return(df)
}

seq(2006, 2007, by = 1) %>% 
    purrr::walk(function(x) {mydata <- myfunc(mydata, x)})

Unfortunately, mydata is not modified. Replacing walk with map shows that each iteration starts again from the original mydata:

seq(2006, 2007, by = 1) %>% 
    map(function(x) {mydata <- myfunc(mydata, x)})

gives:

[[1]]
# A tibble: 3 x 2
  value year_2006
  <dbl>     <dbl>
1    1.        1.
2    2.        1.
3    3.        1.

[[2]]
# A tibble: 3 x 2
  value year_2007
  <dbl>     <dbl>
1    1.        1.
2    2.        1.
3    3.        1.

while the expected output is:

# A tibble: 3 x 3
  value year_2006 year_2007
  <dbl>     <dbl>     <dbl>
1    1.        1.        1.
2    2.        1.        1.
3    3.        1.        1.

What am I missing here? Thanks!


Solution

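  • map returns its results as a list, one data frame per year. The assignment inside the anonymous function only creates a local variable, so the global mydata is never changed. You can use map_dfc to bind the results for each year by column, then drop the repeated value columns: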

        seq(2006, 2007, by = 1) %>% 
          map_dfc(function(x) {mydata <- myfunc(mydata, x)}) %>% 
          select(value, matches("year_"))
    
        # or even shorter
        seq(2006, 2007, by = 1) %>% 
          map_dfc(~ myfunc(mydata, .)) %>% 
          select(value, matches("year_"))
    
        # A tibble: 3 x 3
          value year_2006 year_2007
          <dbl>     <dbl>     <dbl>
        1    1.        1.        1.
        2    2.        1.        1.
        3    3.        1.        1.
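
Since the goal is to feed the result of one iteration into the next, the accumulating pattern can also be sketched with purrr::reduce, which avoids binding copies of value in the first place (depending on the dplyr version, those repeated columns may be renamed during binding, which can break the select step). This is only a sketch, not the approach from the answer above: add_year is a hypothetical stand-in for myfunc that builds the column name with paste0, and the result is assigned back to mydata explicitly because R functions do not modify their arguments in place.

```r
library(dplyr)
library(purrr)

mydata <- tibble(value = c(1, 2, 3))

# Hypothetical stand-in for myfunc: adds a constant column named year_<year>.
add_year <- function(df, year) {
  name_var <- paste0("year_", year)
  df %>% mutate(!!name_var := 1)
}

years <- seq(2006, 2007, by = 1)

# reduce() calls add_year(accumulated_df, year) for each year in turn,
# starting from mydata, and returns the final accumulated data frame.
mydata <- reduce(years, add_year, .init = mydata)

mydata
```

After this runs, mydata should hold the columns value, year_2006 and year_2007. A plain for loop with mydata <- add_year(mydata, year) in its body would do the same job, because a for loop assigns in the calling environment rather than inside a function.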