library(dplyr)
library(tidyr)
library(forcats)
library(readxl)
Using the gss_cat dataset from the forcats package I created a grouped and summarized dataframe then split the data by the marital and race variables (If there's a better tidyverse method than using lapply here that would be a great bonus).
Survey<-gss_cat%>%
select(marital,race,relig,denom)%>%
group_by(marital,race,relig,denom)%>%
summarise(Count=n())%>%
mutate(Perc=paste0(round(100*Count/sum(Count),2),"%"))%>%
drop_na()
Survey%>%split(.$marital)%>%
lapply(function(x) split(x,x$race))
However I'm stuck trying to export the final list to an Excel file with readxl. More specifically, I want to export select tables in the list to separate Excel tabs. For example, divided by race, so that each race category is on a different tab in the spreadsheet.
First, readxl
does not write Excel files. See the thread for issue 231 on the readxl GitHub page. It looks like the writexl
package (not [yet] part of the tidyverse) is recommended instead.
Second, split()
can take a list as an argument.
list_of_dfs <- survey %>% split(list(.$marital, .$race), sep='_')
Putting it together, assuming you've installed writexl
:
require(tidyverse)
require(forcats)
require(writexl)
survey <-
gss_cat %>%
select(marital, race, relig, denom) %>%
group_by(marital, race, relig, denom) %>%
summarise(Count=n()) %>%
mutate(Perc=paste0(round(100*Count/sum(Count), 2), "%")) %>%
drop_na()
list_of_dfs <- survey %>% split(list(.$marital, .$race), sep='_')
write_xlsx(list_of_dfs, 'out.xlsx')
Note that there are no checks on the suitability of the names of the worksheets that write_xlsx
tries to create. If your data contained illegal characters in the marital
or race
column, or if you used an illegal character in the sep
argument to split()
, then the operation would fail. (Try using sep = ':'
if you don't believe me.)