
How to use purrr::map with dplyr::mutate and across in R


I've looked at a couple of previous examples with mutate, across, and map but struggled to fully understand them. Apologies if this question is a duplicate. Two other posts that may be relevant: "Using mutate(across(...)) with purrr::map" and "purrr::pmap with dplyr::mutate".

Background:

I have a list of ten dataframes. All of them have a similar number of columns with similar names. (Some may have one or two more.) My goal is to combine all of them into one dataframe, and I plan to use bind_rows() or list_rbind().

Problem:

Because of the poor quality of the raw CSV data files, the same column in different files may be of a different class. As such, running bind_rows() returns this error.

Error in `bind_rows()`:
! Can't combine `..1$cfv` <character> and `..2$cfv` <double>.
Backtrace:
 1. data_list %>% bind_rows()
 2. dplyr::bind_rows(.)

Attempted solution:

Because I don't know for sure the class of each column, and some dataframes may be missing a column, my plan for overcoming this problem is to first convert all columns to the character class, bind the dataframes together, and then convert the relevant columns back to numeric.

To convert all columns of all dataframes in the list to the character class, I thought to use mutate, across, and map.

This is the code.

data = data_list %>% 
  map(mutate(across(everything(), ~ as.character(.))))

However, it does not work and returns this error message.

Error in `across()`:
! Must only be used inside data-masking verbs like `mutate()`, `filter()`, and `group_by()`.
Backtrace:
 1. data_list %>% ...
 6. dplyr::across(everything(), ~as.character(.))

Question:

How do I use mutate(), across(), and map() together? Alternatively, better ways to combine the different dataframes in the list would be welcome, too.

Thanks in advance.


Solution

  • This is what you want. Remember that the .f argument of map() is a function that is applied to each element. That function should accept an argument, i.e. the . in this line of code, which represents the data frame.

    data_list %>%
      map(~mutate(., across(everything(), as.character)))
    

    In your attempt (which was close!), mutate(...) is called directly instead of being wrapped in a function, so there is no argument to receive each data frame.

    Here's a reprex.

    library(tidyverse)
    
    dat <- as_tibble(mtcars)
    
    # what you want to do on one data frame
    dat %>%
      mutate(across(everything(), as.character))
    #> # A tibble: 32 × 11
    #>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
    #>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
    #>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
    #>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
    #>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
    #>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
    #>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
    #>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
    #>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
    #>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
    #>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
    #> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
    #> # ℹ 22 more rows
    
    data_list <- list(dat, dat, dat)
    
    # applied to a list
    data_list %>%
      map(~mutate(., across(everything(), as.character)))
    #> [[1]]
    #> # A tibble: 32 × 11
    #>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
    #>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
    #>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
    #>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
    #>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
    #>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
    #>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
    #>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
    #>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
    #>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
    #>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
    #> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
    #> # ℹ 22 more rows
    #> 
    #> [[2]]
    #> # A tibble: 32 × 11
    #>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
    #>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
    #>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
    #>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
    #>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
    #>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
    #>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
    #>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
    #>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
    #>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
    #>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
    #> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
    #> # ℹ 22 more rows
    #> 
    #> [[3]]
    #> # A tibble: 32 × 11
    #>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
    #>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
    #>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
    #>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
    #>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
    #>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
    #>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
    #>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
    #>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
    #>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
    #>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
    #> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
    #> # ℹ 22 more rows
    
    # depending on your taste, you might like to use the ... arguments instead
    data_list %>%
      map(mutate, across(everything(), as.character))
    #> [[1]]
    #> # A tibble: 32 × 11
    #>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
    #>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
    #>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
    #>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
    #>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
    #>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
    #>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
    #>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
    #>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
    #>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
    #>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
    #> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
    #> # ℹ 22 more rows
    #> 
    #> [[2]]
    #> # A tibble: 32 × 11
    #>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
    #>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
    #>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
    #>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
    #>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
    #>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
    #>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
    #>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
    #>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
    #>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
    #>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
    #> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
    #> # ℹ 22 more rows
    #> 
    #> [[3]]
    #> # A tibble: 32 × 11
    #>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
    #>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
    #>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
    #>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
    #>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
    #>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
    #>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
    #>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
    #>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
    #>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
    #>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
    #> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
    #> # ℹ 22 more rows
    

    Created on 2023-05-05 with reprex v2.0.2
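
For the full plan described in the question (convert everything to character, bind, then convert back), a minimal sketch might look like the following. The two small tibbles are hypothetical stand-ins for the real CSV data, and using readr::type_convert() to re-guess the column types is an assumption on my part rather than something from the question; you may prefer to convert specific columns with as.numeric() instead.

```r
library(dplyr)
library(purrr)
library(readr)

# Hypothetical stand-ins for the CSV-derived data frames: `cfv` is character
# in the first and double in the second, and `extra` exists only in the second.
data_list <- list(
  tibble(id = 1:2, cfv = c("1.5", "2.5")),
  tibble(id = 3:4, cfv = c(3.5, 4.5), extra = c("a", "b"))
)

combined <- data_list %>%
  # 1. make every column character so the classes agree across data frames
  map(~ mutate(., across(everything(), as.character))) %>%
  # 2. stack the data frames; a column missing from one is filled with NA
  bind_rows() %>%
  # 3. re-guess the type of each character column from its values
  type_convert()

combined
```

Because type_convert() guesses types from the values, it is worth checking the resulting column classes before relying on them, especially for columns that mix numbers with stray text.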