Search code examples
rtidyversereadxl

How to Export Each Grouped Table in a List of Tables to a Different Excel Tab Using ReadXL and Tidyverse


library(dplyr)
library(tidyr)
library(forcats)
library(readxl)    

Using the gss_cat dataset from the forcats package I created a grouped and summarized dataframe then split the data by the marital and race variables (If there's a better tidyverse method than using lapply here that would be a great bonus).

Survey<-gss_cat%>%
select(marital,race,relig,denom)%>%
group_by(marital,race,relig,denom)%>%
summarise(Count=n())%>%
mutate(Perc=paste0(round(100*Count/sum(Count),2),"%"))%>%
drop_na()

Survey%>%split(.$marital)%>%
lapply(function(x) split(x,x$race))

However I'm stuck trying to export the final list to an Excel file with readxl. More specifically, I want to export select tables in the list to separate Excel tabs. For example, divided by race, so that each race category is on a different tab in the spreadsheet.


Solution

  • First, readxl does not write Excel files. See the thread for issue 231 on the readxl GitHub page. It looks like the writexl package (not [yet] part of the tidyverse) is recommended instead.

    Second, split() can take a list as an argument.

    list_of_dfs <- survey %>% split(list(.$marital, .$race), sep='_')
    

    Putting it together, assuming you've installed writexl:

    require(tidyverse) 
    require(forcats)
    require(writexl)
    
    survey <- 
        gss_cat %>%
            select(marital, race, relig, denom) %>%
            group_by(marital, race, relig, denom) %>%
            summarise(Count=n()) %>%
            mutate(Perc=paste0(round(100*Count/sum(Count), 2), "%")) %>%
            drop_na()
    
    list_of_dfs <- survey %>% split(list(.$marital, .$race), sep='_')
    
    write_xlsx(list_of_dfs, 'out.xlsx')
    

    Note that there are no checks on the suitability of the names of the worksheets that write_xlsx tries to create. If your data contained illegal characters in the marital or race column, or if you used an illegal character in the sep argument to split(), then the operation would fail. (Try using sep = ':' if you don't believe me.)