Search code examples
rexcelextract

Extracting data from multiple excel files using R


I have many excel files in a folder that has similar structure: enter image description here

I would like to output it to something like that: enter image description here

This is what I have tried. While I am able to extract the range of cells in B15:G17, how can I append the cell B8 in a column? Also, possible to have row 10 as the column header?

library("readxl")

file.list <- list.files(path=".", pattern=".xlsx")

df.list <- lapply(file.list, read_excel, sheet=1, range=("B15:G17"), col_names=FALSE, col_types = NULL)
df <- data.frame(df.list)

Solution

  • The read.xlsx from openxlsx is a bit more flexible as we can specify the rows/cols argument to read a particular cell as well as use startRow to specify the starting row

    library(openxlsx)
    lst1 <- lapply(file.list, function(x) read.xlsx(x, sheet = 1, startRow = 10, cols = 3:6, rows = 15:17))
    

    Also, read the specific cell separately, and then bind it as a new column 'ID' in each datasets of the list with Map and cbind

    lst2 <- lapply(file.list, function(x) setNames(read.xlsx(x, sheet = 1,
           rows = 8, cols = 2), 'ID'))
    
    Map(cbind, ID = lst2, lst1)