Search code examples
sqlrdata-manipulation

How can I percentage makeup of TRUE from a vector?


This is the code used to derive the first table in my question.

JH %>% group_by(ATT_ID, CAR=="B") %>%
summarize(count = n(), .groups = "drop")
ATT_ID CAR == "B" Count
ONE FALSE 1
TWO TRUE 1
THREE TRUE 3
THREE FALSE 5
FOUR FALSE 2
FIVE TRUE 4
SIX TRUE 8
SIX FALSE 4

How can I get the table above to look like:

ATT_ID Percentage of "B"
ONE 0%
TWO 100%
THREE 37.5%
FOUR 0%
FIVE 100%
SIX 67%
  • Notice how some ID's are seen twice so as to show the presence of both FALSE & TRUE whereas other ID's appear once to showcase the presence of only one or the other.

Thank you


Solution

  • You can do the following:

    dt %>%
      group_by(ATT_ID) %>%
      summarize(perc = sprintf("%3.1f%%", 100*sum(Count*`CAR =="B"`)/sum(Count)))
    

    Output:

    # A tibble: 6 × 2
      ATT_ID perc  
      <chr>  <chr> 
    1 FIVE   100.0%
    2 FOUR   0.0%  
    3 ONE    0.0%  
    4 SIX    66.7% 
    5 THREE  37.5% 
    6 TWO    100.0%
    

    Input:

    structure(list(ATT_ID = c("ONE", "TWO", "THREE", "THREE", "FOUR", 
    "FIVE", "SIX", "SIX"), `CAR =="B"` = c(FALSE, TRUE, TRUE, FALSE, 
    FALSE, TRUE, TRUE, FALSE), Count = c(1, 1, 3, 5, 2, 4, 8, 4)), class = c("tbl_df", 
    "tbl", "data.frame"), row.names = c(NA, -8L))