Search code examples
rexcelopenxlsx

R function to get many worksheets from excel files and use worksheet names as object names


I have received some excel .xlsx files, each file with many worksheets. I am trying to get all worksheets, each as one R data frame named by the excel worksheet name. The following example with readText.xlsx file from openxlsx works fine for this purpose:

x <- system.file("extdata", "readTest.xlsx", package = "openxlsx")
sheet_names <- openxlsx::getSheetNames(x)
my_fun <- function(i) {
  openxlsx::read.xlsx(x, sheet = i)
}
for (i in seq_along(sheet_names)) {
  assign(sheet_names[i], lapply(sheet_names, my_fun)[[i]])
}

I would like to convert this into a function, but just cannot get the results. Any suggestions would be appreciated.

get_sheets <- function(x) {
  sheet_names <- openxlsx::getSheetNames(x)
  my_fun <- function(i) {
    openxlsx::read.xlsx(x, sheet = i)
  }
  for (i in seq_along(sheet_names)) {
    assign(sheet_names[i], lapply(sheet_names, my_fun)[[i]])
  }
}

x <- system.file("extdata", "readTest.xlsx", package = "openxlsx")
get_sheets(x)

Solution

  • from ?assign

    assign(x, value, pos = -1, envir = as.environment(pos), inherits = FALSE, immediate = TRUE)

    pos where to do the assignment. By default, assigns into the current environment. See ‘Details’ for other possibilities.

    envir the environment to use.

    So when you are using assign inside the function get_sheets, it is assigning the names into the current environment, the environment of get_sheets, and this environment is created when the function starts executing and gets destroyed when the function is done executing along with all the objects created within it. And that's why you are not seeing the created data.frames in your global environment.

    So, to solve it, you can use envir = .GlobalEnv to make those assignments explicitly into the Global Environment.

    get_sheets <- function(x) {
      sheet_names <- openxlsx::getSheetNames(x)
      my_fun <- function(i) {
        openxlsx::read.xlsx(x, sheet = i)
      }
      for (i in seq_along(sheet_names)) {
        assign(sheet_names[i], lapply(sheet_names, my_fun)[[i]], envir = .GlobalEnv)
      }
    }
    
    x <- system.file("extdata", "readTest.xlsx", package = "openxlsx")
    get_sheets(x)