Tags: r, vectorization, xls, gdata, sapply

read.xls - read in variable-length list of sheets, with their names


Given several .xls files, each with a varying number of sheets, I am reading them into R using read.xls from the gdata package. I have two related issues (solving the second issue should solve the first):

  1. It is unknown ahead of time how many sheets each .xls file will have, and in fact this value will vary from one file to the next.
  2. I need to capture the name of each sheet, because the name itself is relevant data.

Right now, to resolve (1), I am using try() and iterating over sheet numbers until I hit an error.

How can I get a list of the sheet names so that I can iterate over them?


Solution

  • See the sheetCount and sheetNames functions in gdata (both are documented on the same help page). If xls <- "a.xls", say, then reading all sheets of a spreadsheet into a list, one sheet per component, is just this:

    sapply(sheetNames(xls), read.xls, xls = xls, simplify = FALSE)
    

    Note that the components of the list are named after the sheets, so the sheet names are captured along with the data. Depending on the content, it might make sense to remove simplify = FALSE.
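
    The one-liner above handles a single workbook. As a rough sketch of how it might extend to the several files mentioned in the question, the code below wraps it in a helper and also stores each sheet name as a column. The helper name read_all_sheets, the directory "data", and the column name sheet are hypothetical choices made for illustration; only read.xls and sheetNames come from gdata.

```r
library(gdata)  # provides read.xls and sheetNames (requires a working Perl)

# Hypothetical helper: read every sheet of one workbook into a list
# whose components are named after the sheets.
read_all_sheets <- function(xls) {
  sheets <- sheetNames(xls)
  sapply(sheets, read.xls, xls = xls, simplify = FALSE)
}

# Assumption: the workbooks live in a directory called "data".
files <- list.files("data", pattern = "\\.xls$", full.names = TRUE)

# Outer list is named by file path, inner lists by sheet name.
all_data <- sapply(files, read_all_sheets, simplify = FALSE)

# Store the sheet name as a column so it survives later reshaping.
# rep() keeps the assignment valid even for sheets with zero rows.
tagged <- lapply(all_data, function(sheets) {
  Map(function(df, name) {
    df$sheet <- rep(name, nrow(df))
    df
  }, sheets, names(sheets))
})
```

    Because the number of sheets is taken from sheetNames for each file, no try() loop over sheet numbers is needed, and files with different sheet counts are handled the same way.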