Search code examples
rdplyrdata-manipulationreadxl

Read in Specific Tabs from Excel into R


I am trying to read a bunch of tabs from an excel file into R. The issue is that I only want some of the tabs, not all of them. When I try to select certain columns, I get an error because the tabs I do not want to read in don't contain the x1:x4 columns I want. My current code is below.

file.list <- "C:/Users/xxxx/Documents/file.xlsx"

df.list <- lapply(file.list,function(x) {
  sheets <- excel_sheets(x)
  dfs <- lapply(sheets, function(y) {
    read_excel(x, sheet = y, skip = 5) %>% 
      clean_names() %>% 
      select(x1:x4)
  })
  names(dfs) <- sheets
  dfs 
})[[1]]

list2env(df.list ,.GlobalEnv)

Solution

  • select can take helper functions. Instead of specifying directly, wrap with any_of. According to ?select

    any_of(): Same as all_of(), except that no error is thrown for names that don't exist.

    df.list <- lapply(file.list,function(x) {
      sheets <- excel_sheets(x)
      dfs <- lapply(sheets, function(y) {
        read_excel(x, sheet = y, skip = 5) %>% 
          clean_names() %>% 
          select(any_of(stringr::str_c('x', 1:4)))
      })