Search code examples
rlapplyxlsxreadxl

Using the lapply function to import data from Excel


I am trying to import data from 150 different files due to a primitive STATA command which doesn't have an append functionality or is compatible with the outreg2 command. I decide to take it to R since I am more familiar with importing and exporting there.

Each .xlsx file holds the exact same format only changing in values and label description. There are exactly 150 of them.

So far I managed:

#library(readxl)
file.list <- list.files(pattern='*.xlsx', recursive = TRUE)
df.list <- lapply(file.list, read_excel)

#library(dplyr)
df1 <- bind_rows(df.list, .id= "id")

This returns a dataframe like seen below: Event study results sheet 1. Which is exactly as I want it to turn out.

However, the files the lapply command reads, also hold sheet number 2,3, and 4 which I would like to import in a similar fashion (create appended tables as seen in the picture). Simply adjusting the function to:

df.list2 <- lapply(file.list, read_excel(sheet = 2))

Sadly doesn't work and requires me to add a "path" argument. Adding it:

df.list2 <- lapply(file.list, read_excel(path = "D:/sietse/Documents/RUG/Master/Thesis/Data/STATA", sheet = 2))

Gives me the error:

Error in file(con, "rb") : cannot open the connection In addition: Warning message: In file(con, "rb") : cannot open file 'D:/sietse/Documents/RUG/Master/Thesis/Data/STATA': Permission denied

After some research I know that this has to do with specification of file names, however; specifying any filename would defeat the purpose of the function to begin with and leaves me with a lot of manual work.

Does anyone know how to solve this or a better way to approach it? Much appreciated.


Solution

  • You can add the sheet as a param like this:

    df.list <- lapply(file.list, read_excel, sheet=2)
    

    Or, like this:

    df.list <- lapply(file.list, function(f) read_excel(f, sheet=2))