
How to use a counting function and case_when simultaneously in R?


I have a data frame, and I want to count, for each name, the number of observations that meet certain criteria:

library(dplyr)
test <- tibble(name = c("Justin", "Corey", "Sibley", "Justin", "Corey", "Sibley", "Justin", "Corey", "Sibley"),
               class = c("Bio", "Bio", "Bio", "Psych", "Psych", "Psych", "English", "English", "English"),
               result = c("Fail", "Pass", "Pass", "Fail", "Pass", "Pass", "Fail", "Fail", "Pass"))

In the example above, each course is either STEM (e.g., "Bio" and "Psych") or literature ("English"). I want to create two new columns, one holding each student's STEM score and one holding their literature score, where each pass counts as 1.

The answer should look like:

library(dplyr)
answer <- tibble(name = c("Justin", "Corey", "Sibley"),
                 stem_assessment = c(0, 2, 2),
                 lit_assessment = c(0, 0, 1))

I've tried experimenting with case_when(), group_by(), n(), summarize(), and count(), but I can't seem to crack it.


Solution

  • group_by() and summarise() should do it:

    library(dplyr)
    
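    # Course names that count toward each score. "Pysch" is listed as well,
    # so a misspelled entry in the data is still counted as STEM.
    stem <- c("Bio", "Pysch", "Psych")
    lit <- c("English")
    
    # Summing a logical vector counts its TRUE values, so each sum() is the
    # number of passed courses of that type per student.
    test %>%
      group_by(name) %>%
      summarise(stem_assessment = sum(class %in% stem & result == "Pass"),
                lit_assessment = sum(class %in% lit & result == "Pass"))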
    
    #   name   stem_assessment lit_assessment
    #  <chr>            <int>          <int>
    #1 Corey                2              0
    #2 Justin               0              0
    #3 Sibley               2              1
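
Since the question asks about case_when() specifically, here is a minimal sketch of a variant that labels each course first and then counts the passes. It is one possible way to combine the two, not the only one. The column name `area` is made up for illustration, and the sample data is rebuilt here with "Psych" spelled consistently, which is an assumption about what the question intended.

```r
library(dplyr)

# Sample data as in the question, assuming "Psych" is spelled consistently.
test <- tibble(
  name   = rep(c("Justin", "Corey", "Sibley"), times = 3),
  class  = rep(c("Bio", "Psych", "English"), each = 3),
  result = c("Fail", "Pass", "Pass", "Fail", "Pass", "Pass", "Fail", "Fail", "Pass")
)

# Lookup vectors for this example; adjust them to your own course list.
stem <- c("Bio", "Psych")
lit  <- c("English")

answer <- test %>%
  # Label every row with its subject area; `area` is a hypothetical helper
  # column, and classes matching neither vector get NA.
  mutate(area = case_when(
    class %in% stem ~ "stem",
    class %in% lit  ~ "lit"
  )) %>%
  group_by(name) %>%
  # Count passes per area; na.rm = TRUE ignores rows with an unlabelled class.
  summarise(
    stem_assessment = sum(area == "stem" & result == "Pass", na.rm = TRUE),
    lit_assessment  = sum(area == "lit"  & result == "Pass", na.rm = TRUE)
  )

print(answer)
```

For data this small the extra mutate() step buys little, but the `area` column can be handy if the same labels are needed again later, for example for plotting or filtering.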