I have a dataframe with many rows (>300,000) which I would like to write to multiple Excel files using xlsx package.
Sample data -
myData <- data.frame(Letter=c("A","B","C","D","E","F","G","H","I","J"),
Fruit = c("Apples", "Pears", "Oranges",
"Carrots", "Mangoes", "Potatoes",
"Bananas", "Plums","Grapes",
"Broccoli"),
Country = c("Scotland", "Scotland", "Scotland",
"England", "England", "England",
"Wales", "Wales", "Wales",
"Ireland"))
So, from the sample data I would like to split my Excel workbooks by the 'Country' column. There would be 4 Excel files with the country name as the filename and I would like them to look like below -
Letter Fruit Country
------------------------
A Apples Scotland
B Pears Scotland
C Oranges Scotland
Taking the 'Country' column out of the dataset would be a bonus.
I've figured out how to split these into multiple dataframes using a For loop, and then I could write a 'write.xlsx' line for each country. But my list of countries is dynamic. See below example -
for (i in unique(myData$Country)) {
test <- paste0("excel", i)
assign(test, myData[myData$Country == i,])
}
I've struggled to fit the write.xlsx function into that loop, but perhaps there's a better way.
Could anybody help please?
I would just split them into a list, then use an lapply
for the write.xlsx. I'd suggest lapply
over a for-loop
.
myDatalist <- split(myData, myData$Country)
lapply(1:length(myDatalist), function(x) write.xlsx(myDatalist[[x]],
file = paste0(names(myDatalist[x]), ".xlsx"), row.names = FALSE))
All this does is apply a function to a list.