Search code examples
rxlconnect

XLConnect causing failures on empty cells


I've got a bunch of Excel spreadsheets from which I need to read certain cells. I've built a loop to go through all the workbooks and worksheets, but I've foregone the looping in my question below to simplify.

Not all worksheets or cells will have data populated, but if a cell is blank, XLConnect returns an empty data frame, leading to errors.

To reproduce, create an xlsx file in your working directory called testFile.xlsx. On Sheet1, type anything you want in cells A1 and B2, then save and close the file. Then run the below code.

# Load packages
library(XLConnect)

# Define positions to read
positionsToRead = c('A1', 'B2')

# This will work as A1 and B2 have data
data.frame(sapply(positionsToRead, function(x) readWorksheetFromFile("testFile.xlsx",  sheet='Sheet1', region=x, header=FALSE)))

# Change positions to read
positionsToRead = c('A1', 'B2', 'C3')

# This will fail, as C3 has no data
data.frame(sapply(positionsToRead, function(x) readWorksheetFromFile("testFile.xlsx",  sheet='Sheet1', region=x, header=FALSE)))

The first will return:

  A1.Col1 B2.Col1
1   data1   data2

The second will return:

Error in data.frame(A1 = list(Col1 = "data1"), B2 = list(Col1 = "data2"),  : 
  arguments imply differing number of rows: 1, 0

Is there any way to make this work? I've searched, and found that XLConnect has a SetMissing function, but it appears to be for defining strings like blank to NA.


Solution

  • Try this:

    sapply(positionsToRead, 
           function(x) readWorksheetFromFile("testFile.xlsx",  
                                             sheet='Sheet1', 
                                             region=x, 
                                             header=FALSE,
                                             autofitRow = FALSE,
                                             autofitCol = FALSE))