Search code examples
rdataframesplitexport-to-excelopenxlsx

Split large dataframe in R and output into separate sheets in a single Excel workbook


Suppose I have the following dataframe in R and I am looking to split the dataframe into separate Excel sheets categorised by the Fruit column


+--------+-------+
| Fruit  | Price |
+--------+-------+
| Apple  |    12 |
| Apple  |    14 |
| Apple  |    15 |
| Orange |     2 |
| Orange |     4 |
| Orange |     6 |
| Pear   |     3 |
| Pear   |     6 |
| Pear   |     9 |
+--------+-------+

After splitting the dataframe into 3 separate dataframes (Apple, Orange and Pear), I intend to export each dataframe into a separate Excel sheets (named Apple, Orange and Pear) but stored within the same Excel workbook Out.xlsx. However, the below R code does not work. The output is an Excel workbook Out.xlsx with only a single sheet, Pear, containing the Pear dataframe.

library(openxlsx)
df <- read_excel("Export excel test.xlsx")
output <- split(df, df$Fruit)
for (i in 1:length(output)){write.xlsx(x = output[i],
file = "Out.xlsx", sheetName = names(output[i]),append = TRUE)}

Would anyone be able to help on this? My actual dataframe has over 4 million rows, hence I need to split the dataframe into separate sheets to circumvent Excel's row limitations


Solution

  • It looks like you are using commands from the xlsx package.

    The xlsx package also provides the write.xlsx function, which allows you to append to an existing workbook.

    library(xlsx)
    
    write.xlsx(subset(iris, subset=Species=="setosa"), 
        file="iris.xlsx", sheetName = "setosa")
    
    write.xlsx(subset(iris, subset=Species=="versicolor"), 
        file="iris.xlsx", sheetName = "versicolor", append=TRUE)
    
    write.xlsx(subset(iris, subset=Species=="virginica"), 
        file="iris.xlsx", sheetName = "virginica", append=TRUE)
    

    The openxlsx package does this slightly differently. Here I'll use a loop instead.

    library(openxlsx)
    
    output <- split(iris, iris$Species)
    
    wb <- createWorkbook()
    
    for (i in 1:length(output)) {
      addWorksheet(wb, sheetName=names(output[i]))
      writeData(wb, sheet=names(output[i]), x=output[[i]]) # Note [[]]
    }
    
    saveWorkbook(wb, "iris.xlsx", overwrite = TRUE)