Search code examples
rpiperenamepurrrreadxl

purrr::set_names using pipe operator in R


How do I edit the below R pipe so that I can change the column names before reading all the sheets into a single dataframe? Currently, the set_names function is applied to sheet names in the Excel, rather than the data columns in each sheet. There are 3 sheets, and each sheet has 4 columns of data.

library(purrr)
library(readxl)

file_path <- "Test.xlsx"
dfraw <- file_path %>% 
         excel_sheets() %>%
         set_names(., nm = c('A','B','C','D')) %>% #this line shows the error.
         map_dfr(.f = ~read_excel(path = file_path, sheet = .x), .id = "Currency")

Error message:

Error in `set_names()`:
! The size of `nm` (4) must be compatible with the size of `x` (3).

Solution

  • Use set_names within the map_dfr function.

    library(dplyr)
    library(purrr)
    library(readxl)
    
    file_path <- "Test.xlsx"
    
    dfraw <- file_path %>% 
      excel_sheets() %>%
      map_dfr(.f = ~read_excel(path = file_path, sheet = .x) %>%
                set_names(nm = c('A','B','C','D')), .id = "Currency")
    

    Note that map_dfr has been superseded and it is recommended to use map + list_rbind.

    dfraw <- file_path %>% 
      excel_sheets() %>%
      map(.f = ~read_excel(path = file_path, sheet = .x) %>%
                set_names(nm = c('A','B','C','D'))) %>% 
      list_rbind(names_to = "Currency")