I've looked at a couple of previous examples with mutate, across, and map but struggled to fully understand them. Apologies if this question is a duplicate. Here are the other two posts that may be relevant: "Using mutate(across(...)) with purrr::map" and "purrr::pmap with dplyr::mutate".
Background:
I have a list of ten dataframes. All of them have a similar number of columns and similar column names. (Some may have one or two more columns.) My goal is to combine them all into one dataframe, and I plan to use bind_rows() or list_rbind().
Problem:
Because of the poor quality of the raw CSV data files, the same column may have a different class in different files. As such, running bind_rows() returns this error.
Error in `bind_rows()`:
! Can't combine `..1$cfv` <character> and `..2$cfv` <double>.
Backtrace:
1. data_list %>% bind_rows()
2. dplyr::bind_rows(.)
Attempted solution:
Because I don't know for sure the class of each column, and some dataframes may be missing a column, my plan for overcoming this problem is to first convert all columns to the character class, bind the dataframes together, and then convert the relevant columns back to numeric.
To convert all columns of all dataframes in the list to the character class, I thought to use mutate, across, and map.
This is the code.
data = data_list %>%
  map(mutate(across(everything(), ~ as.character(.))))
However, it does not work and returns this error message.
Error in `across()`:
! Must only be used inside data-masking verbs like `mutate()`, `filter()`, and `group_by()`.
Backtrace:
1. data_list %>% ...
6. dplyr::across(everything(), ~as.character(.))
Question:
How do I use mutate(), across(), and map() together? Alternatively, better ways to combine the different dataframes in the list would be welcome, too.
Thanks in advance.
This is what you want. Remember that the .f argument of map() is a function that will be applied to each element. That function should accept an argument, i.e. the . in the code below, which represents the data frame.
data_list %>%
  map(~mutate(., across(everything(), as.character)))
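In your attempt (which was close!), you passed map() a call to mutate() rather than a function, so nothing received each data frame as an argument, and across() was evaluated outside of mutate()'s data mask.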
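If the formula shorthand feels opaque, the same idea can be spelled out with an ordinary function. The sketch below is only illustrative: the two small data frames and the helper name to_chr are made up for the example and are not from the question. All three calls hand map() a function that takes one data frame as its argument.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data, standing in for the real list of data frames
data_list <- list(
  data.frame(x = 1:2, y = c(1.5, 2)),
  data.frame(x = c("a", "b"), y = c(3, 4))
)

# A named helper (hypothetical name) that takes one data frame
to_chr <- function(df) mutate(df, across(everything(), as.character))

# 1. purrr formula shorthand: .x is the current list element
out_formula <- map(data_list, ~ mutate(.x, across(everything(), as.character)))

# 2. explicit anonymous function
out_lambda <- map(data_list, function(df) mutate(df, across(everything(), as.character)))

# 3. named function
out_named <- map(data_list, to_chr)
```

Which form to use is a matter of taste; the named helper can be easier to debug because it can be run on a single data frame first.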
Here's a reprex.
library(tidyverse)
dat <- as_tibble(mtcars)
# what you want to do on one data frame
dat %>%
mutate(across(everything(), as.character))
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
#> 2 21 6 160 110 3.9 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.02 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.22 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.84 0 0 3 4
#> 8 24.4 4 146.7 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4
#> # ℹ 22 more rows
data_list <- list(dat, dat, dat)
# applied to a list
data_list %>%
map(~mutate(., across(everything(), as.character)))
#> [[1]]
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
#> 2 21 6 160 110 3.9 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.02 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.22 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.84 0 0 3 4
#> 8 24.4 4 146.7 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4
#> # ℹ 22 more rows
#>
#> [[2]]
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
#> 2 21 6 160 110 3.9 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.02 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.22 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.84 0 0 3 4
#> 8 24.4 4 146.7 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4
#> # ℹ 22 more rows
#>
#> [[3]]
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
#> 2 21 6 160 110 3.9 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.02 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.22 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.84 0 0 3 4
#> 8 24.4 4 146.7 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4
#> # ℹ 22 more rows
# depending on your taste, you might like to use the ... arguments instead
data_list %>%
map(mutate, across(everything(), as.character))
#> [[1]]
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
#> 2 21 6 160 110 3.9 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.02 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.22 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.84 0 0 3 4
#> 8 24.4 4 146.7 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4
#> # ℹ 22 more rows
#>
#> [[2]]
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
#> 2 21 6 160 110 3.9 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.02 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.22 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.84 0 0 3 4
#> 8 24.4 4 146.7 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4
#> # ℹ 22 more rows
#>
#> [[3]]
#> # A tibble: 32 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 21 6 160 110 3.9 2.62 16.46 0 1 4 4
#> 2 21 6 160 110 3.9 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.02 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.22 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.84 0 0 3 4
#> 8 24.4 4 146.7 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 140.8 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 167.6 123 3.92 3.44 18.3 1 0 4 4
#> # ℹ 22 more rows
Created on 2023-05-05 with reprex v2.0.2
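To finish the plan described in the question (convert to character, bind, then convert back), something like the following might work. It is a rough sketch under assumptions: the two toy data frames only mimic the cfv type clash from the error message, the id and extra columns are invented for illustration, and base R's type.convert() is one possible way to re-guess column types after binding, not necessarily the best fit for messy real data.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the CSV-derived data frames:
# cfv is character in the first and double in the second,
# and the extra column exists only in the second
data_list <- list(
  data.frame(id = 1:2, cfv = c("1.5", "2")),
  data.frame(id = 3:4, cfv = c(3, 4.5), extra = c("a", "b"))
)

combined <- data_list %>%
  # make every column character so the classes can no longer clash
  map(~ mutate(.x, across(everything(), as.character))) %>%
  # stack the rows; columns missing from a data frame are filled with NA
  bind_rows() %>%
  # re-guess each column's type; as.is = TRUE keeps text as character, not factor
  type.convert(as.is = TRUE)

str(combined)
```

If only a few known columns need to be numeric, mutate(across(c(...), as.numeric)) on the bound result gives more control than guessing every column, and it will warn where a value cannot be parsed.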