I've got a bunch of Excel spreadsheets from which I need to read certain cells. I've built a loop to go through all the workbooks and worksheets, but I've foregone the looping in my question below to simplify.
Not all worksheets or cells will have data populated, but if a cell is blank, XLConnect returns an empty data frame, leading to errors.
To reproduce, create an xlsx
file in your working directory called testFile.xlsx
. On Sheet1
, type anything you want in cells A1
and B2
, then save and close the file. Then run the below code.
# Load packages
library(XLConnect)
# Define positions to read
positionsToRead = c('A1', 'B2')
# This will work as A1 and B2 have data
data.frame(sapply(positionsToRead, function(x) readWorksheetFromFile("testFile.xlsx", sheet='Sheet1', region=x, header=FALSE)))
# Change positions to read
positionsToRead = c('A1', 'B2', 'C3')
# This will fail, as C3 has no data
data.frame(sapply(positionsToRead, function(x) readWorksheetFromFile("testFile.xlsx", sheet='Sheet1', region=x, header=FALSE)))
The first will return:
A1.Col1 B2.Col1
1 data1 data2
The second will return:
Error in data.frame(A1 = list(Col1 = "data1"), B2 = list(Col1 = "data2"), :
arguments imply differing number of rows: 1, 0
Is there any way to make this work?
I've searched, and found that XLConnect
has a SetMissing
function, but it appears to be for defining strings like blank
to NA
.
Try this:
sapply(positionsToRead,
function(x) readWorksheetFromFile("testFile.xlsx",
sheet='Sheet1',
region=x,
header=FALSE,
autofitRow = FALSE,
autofitCol = FALSE))