I am splitting a dataframe by city and the resulting list of dataframes export to their own Excel sheets, but the header of each dataframe is undesirably coming out as Atlanta.type
and Atlanta.kpi
instead of type
and kpi
. I'd like to understand why this is happening and how I can fix it to meet my need. My desired output should look like this (but it doesn't):
Here's a basic dataframe similar to the one in real life, and I'll split it by location into a list:
library(tidyverse)
df_Main <- expand_grid(location = c("New York", "Boston", "Atlanta"),
type = c("A"),
kpi = c(1.5,3))
> df_Main
# A tibble: 6 x 3
location type kpi
<chr> <chr> <dbl>
1 New York A 1.5
2 New York A 3
3 Boston A 1.5
4 Boston A 3
5 Atlanta A 1.5
6 Atlanta A 3
Now split it into a list of dataframes, one for each sheet I'll create:
lst_Main <- split(df_Main, f = df_Main$location)
> lst_Main
$Atlanta
# A tibble: 2 x 3
location type kpi
<chr> <chr> <dbl>
1 Atlanta A 1.5
2 Atlanta A 3
$Boston
# A tibble: 2 x 3
location type kpi
<chr> <chr> <dbl>
1 Boston A 1.5
2 Boston A 3
$`New York`
# A tibble: 2 x 3
location type kpi
<chr> <chr> <dbl>
1 New York A 1.5
2 New York A 3
I put the list into a workbook using library(xlsx)
, and while this does work, the column names in the resulting Excel file show the Atlanta.type
and Atlanta.kpi
column names:
library(xlsx)
wb = createWorkbook()
for (df in names(lst_Main)) {
sheet = createSheet(wb, df)
addDataFrame(lst_Main[df], sheet = sheet, col.names = TRUE, row.names = FALSE)
}
saveWorkbook(wb, file = "./Excel Output/test.xlsx")
I thought that perhaps it was because I didn't double-bracket the list, so I alternatively tried:
for (df in names(lst_Main)) {
sheet = createSheet(wb, df)
addDataFrame(lst_Main[[df]], sheet = sheet, col.names = TRUE, row.names = FALSE)
}
This fixed the headers in the way I wanted them, the data isn't useful:
As a last resort, I tried just adding a single straight dataframe to a workbook, and this turned out to export the good headers with the bad data, so perhaps my use of lists isn't the problem (?):
wb = createWorkbook()
sheet = createSheet(wb, "df_Main")
addDataFrame(df_Main, sheet = sheet, col.names = TRUE, row.names = FALSE)
Based on Trial 3, I don't believe my use of lists is the problem (but maybe?). Primarily, I just want to know what is causing me to get the wrong output, is it my syntax or is my use of library(xlsx)
just problematic (I know it uses java somehow). If the library(xlsx)
package is the problem, what is a more reliable way to achieve these ends?
Consider wrapping with unname
for (df in names(lst_Main)) {
sheet = createSheet(wb, df)
addDataFrame(unname(lst_Main[df]), sheet = sheet, col.names = TRUE, row.names = FALSE)
}
-output
This also works with single file, if it is wrapped in a list
wb = createWorkbook()
sheet = createSheet(wb, "df_Main")
addDataFrame(list(df_Main), sheet = sheet, col.names = TRUE, row.names = FALSE)
-output
The main reason OP's single data/multiple datasets are not working is because they are not data.frame
, but tibble
. Consider changing it to data.frame
and it will work
df_Main <- as.data.frame(df_Main)
wb = createWorkbook()
sheet = createSheet(wb, "df_Main")
addDataFrame(df_Main, sheet = sheet, col.names = TRUE, row.names = FALSE)
-output
As part of negative check, reconvert back to tibble
and do this again
df_Main <- as_tibble(df_Main)
wb = createWorkbook()
sheet = createSheet(wb, "df_Main")
addDataFrame(df_Main, sheet = sheet, col.names = TRUE, row.names = FALSE)
The function itself says addDataFrame
and the expected input 'x' according to ?addDataFrame
is
x - a data.frame.