Search code examples
rr-xlsx

read.xlsx2 | Skipping if sheetName does not exist


I am trying to read multiple excel files in a folder using the read.xlsx2 function. I only need to read a particular sheet titled 'Returns' or 'Prices'.

Is there a way I can give an 'OR' argument in the function and also skip a file if it contains neither of the sheets?

P.s.: Each file will have either a 'Returns' or a 'Prices' sheet or neither but not both so there cannot be a clash.

Thanks


Solution

  • You could read all the sheet names of the file and using intersect select one of 'Returns' or 'Prices' whichever is present in the sheet and read the excel file with that sheet.

    Using readxl you can do this as :

    library(readxl)
    
    all_files <- list.files(pattern = '\\.xlsx$')
    
    result <- lapply(all_files, function(x) {
      all_sheets <- excel_sheets(x)  
      correct_sheet <- intersect(all_sheets, c('Returns', 'Prices'))
      if(length(correct_sheet)) read_xlsx(x, correct_sheet)
    })
    

    result will have a list of dataframes. If you want to combine the data into one dataframe and if they have same column names you can use do.call(rbind, result)