I have a work book that has 10 sheets: Half has the word "Survey" in its sheet name and the other half has the word "Census" in its sheet name.
I want to read in all the sheets that has the word "Survey" in its sheet name and append them into a separate workbook.
Sheet names are: WA Survey, Survey MD, DC Survey, VA Survey, Survey GA.
What I have:
library(XLConnect)
wb <- loadWorkbook("Data.xls")
dataMD <- readWorksheet(wb, sheet = "Survey MD", startCol = 1, endCol = 10)
dataWA <- readWorksheet(wb, sheet = "Survey WA", startCol = 1, endCol = 10)
dataDC <- readWorksheet(wb, sheet = "Survey DC", startCol = 1, endCol = 10)
dataVA <- readWorksheet(wb, sheet = "Survey VA", startCol = 1, endCol = 10)
dataGA <- readWorksheet(wb, sheet = "Survey GA", startCol = 1, endCol = 10)
total <- merge(dataMD,dataWA,dataDC,dataVA,dataGA,by="ID")
How can I make this simpler? I was hoping I can complete this task in 3-4 lines?
You could use the readxl package like this and filter the Survey data sheets before reading everything in one go.
library(readxl)
# file including path if needed
file <- "Data.xls"
# read the sheets and only keep the Survey sheets
sheets <- excel_sheets(file)
sheets <- sheets[grep("Survey", sheets)]
# read the data, only first 10 columns (A:J)
excel_data <- lapply(sheets, read_excel, path = file, range = cell_cols("A:J") )
You end up with a list of data. If all columns are identical you can use do.call("rbind", excel_data)
or dplyr::bind_rows