Search code examples
rtidyversedplyrsummarize

R Summary table in percentage with summarise_at or _all using 3 different functions and reduce inner join


I have seen the post in here : https://github.com/tidyverse/dplyr/issues/3101 and tried to work with summarise_at, however it doesn't work with 3 functions, instead I found simpler with summarise_all. Is there any way to reduce the inner join from my code?

table to Result

  df_ask<- structure(list(Member = c("API", "API", "API", "API", "API", 
"KARA", "KARA", "KARA", "KARA", "KARA", "KARA", "KARA"), Year = c(2017, 
2017, 2017, 2018, 2018, 2017, 2017, 2017, 2018, 2018, 2018, 2019
), Name = c("kali", "kata", "kaga", "kami", "kara", "mara", "misi", 
"musu", "mate", "maki", "maku", "maji"), Response = c("declined", 
"yes, change request", "declined", "yes", "no", "no response", 
"yes, change request", "no response", "yes", "no", "yes", "no"
), Private = c("public", "private", "no response", "private", 
"public", "public", "private", "no response", "private", "public", 
"public", "private")), row.names = c(NA, -12L), class = c("tbl_df", 
"tbl", "data.frame"))

Note that, I want also to make a mutate the response column but only for 2017 data (declined -> no; yes, change request -> yes) and have the result in percentage.

The result should be something like this:

df_ans <- structure(list(Member = c("API", "API", "KARA", "KARA", "KARA"
), Year = c(2017, 2018, 2017, 2018, 2019), `Total Name` = c(3, 
2, 3, 3, 1), `Yes Response` = c(33.3333333333333, 33.3333333333333, 
33.3333333333333, 66.6666666666667, 0), `No Response` = c(66.6666666666667, 
33.3333333333333, 66.6666666666667, 33.3333333333333, 33.3333333333333
), Public = c(33.3333333333333, 33.3333333333333, 33.3333333333333, 
66.6666666666667, 0), Private = c(33.3333333333333, 33.3333333333333, 
33.3333333333333, 33.3333333333333, 33.3333333333333)), row.names = c(NA, 
-5L), class = c("tbl_df", "tbl", "data.frame"))

My code by far is:

    respon.table<-df_ask %>%
  group_by(Member, Year) %>%
 # mutate(Response= replace(Response, c(Response == "no response", Response.status == "submitted"), c("NOONO","aaaa")))

summarise_all(funs(Total.Name=sum(!is.na(.)), Yes.Response=sum(Response %in% c("Yes", "Yes, change reques")),
                   private=sum(Private=="public")))

Sure it is possible to do some manual works to get that result. However, I want to put it as table in shiny which I hope to make it run smooth and not given more loading time. Please help.


Solution

  • You ca use across() to apply a summarizing function to a set of columns

    df_ask %>%
      mutate(Yes.Response = Response %in% c("yes", "yes, change request"),
             No.Response = Response %in% c( "no", "declined","no response"),
             Is.Public = Private == "public",
             Is.Private = Private == "private") %>%
      group_by(Member, Year) %>%
      summarise(Total.Name = sum( !is.na( Name )),
                across( where(is.logical), ~paste0( round( 100 * sum(.) / Total.Name), "%")))
    
    # A tibble: 5 x 7
    # Groups:   Member [2]
      Member  Year Total.Name Yes.Response No.Response Is.Public Is.Private
      <chr>  <dbl>      <int> <chr>        <chr>       <chr>     <chr>     
    1 API     2017          3 33%          67%         33%       33%       
    2 API     2018          2 50%          50%         50%       50%       
    3 KARA    2017          3 33%          67%         33%       33%       
    4 KARA    2018          3 67%          33%         67%       33%       
    5 KARA    2019          1 0%           100%        0%        100%