Problem: In the following example I am not able to get the column names from DL_Import
into the many worksheets of the workbook wb
and further into the many worksheets of the output Excel file output.xlsx
...
Context: I have a data frame called DL_Import
with close to 8,000 rows of data and 19 columns. I need to break this up into multiple worksheets for the saved xlsx file in order to process them separately downstream. I have completed this by doing the following:
DL_Split <- split(DL_Import,rep(1:500,each=1000))
chunk <- 1000
numRows <- nrow(DL_Import)
DL_row <- rep(1:ceiling(numRows/chunk),each=chunk)[1:numRows]
DL_Split <- split(DL_Import,DL_row)
numSheets <- length(DL_Split)
wb <- createWorkbook()
for (i in 1:numSheets)
{
addWorksheet(wb, i)
writeData(wb, i, DL_Split[i], startRow = 1, startCol = 1, colNames = TRUE, rowNames = TRUE)
}
saveWorkbook(wb, file = "output.xlsx", overwrite = TRUE)
The workbook saves correctly with, in this case, 8 worksheets with the names "1", "2", ... "8" but is missing the column titles in row 1 in each worksheets.
Question: How can I ensure the column names from DL_Import
gets propagated into each of the wb
worksheets to ensure it's written to output.xlsx
file?
the error is that you were using [
instead of [[
, the difference between these two is fairly simple yet critical:
[
returns an object of the same class and can select one or more elements.[[
returns only one element without coercing it to the type of the parent structure.if you try this code, you'll find out that they're different:
class(DL_Split[1])
#> [1] "list"
class(DL_Split[[1]])
#> [1] "data.frame"
this is why openxlsx::writeData
failed to get the column names and row names.
the corrected code should look like this
for (i in 1:numSheets)
{
addWorksheet(wb, i)
writeData(wb, i, DL_Split[[i]], startRow = 1, startCol = 1, colNames = TRUE, rowNames = TRUE)
}