r, group-by, dplyr, tidyverse, summarize

tidyverse: count number of a specific level when summarizing


When summarizing after grouping, I would like to count how many rows in each group have a specific level of another factor.

In the example below, I would like to count the number of "male" values of Factor in each group. I've tried count, tally and so on, but cannot find a neat, straightforward way to do it.

df <- data.frame(Group=replicate(20, sample(c("A","B"), 1)),
                 Value=rnorm(20),
                 Factor=replicate(20, sample(c("male","female"), 1)))
df %>% 
  group_by(Group) %>% 
  summarize(Value = mean(Value),
            n_male = ???)

Thanks for your help!


Solution

  • We can use sum on a logical vector, i.e. Factor == "male". Within each group the comparison returns TRUE/FALSE, and sum coerces those to 1/0, so the total is the number of 'male' elements in that group.

    library(dplyr)

    df %>%
      group_by(Group) %>%
      summarise(Value = mean(Value),
                n_male = sum(Factor == "male"))  # TRUE counts as 1, FALSE as 0
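Because the data in the question is random, the result is hard to check by eye. The sketch below applies the same sum-on-a-logical idea to a small hand-made data frame. The name df_small, its values, and the extra n_female column are assumptions added for illustration and are not part of the original question. It also includes one missing value: sum returns NA if any comparison is NA, so na.rm = TRUE is passed to drop those before counting.

```r
library(dplyr)

# Hypothetical, hand-made data so the counts can be verified by eye
df_small <- data.frame(
  Group  = c("A", "A", "A", "B", "B", "B"),
  Value  = c(1, 2, 3, 4, 5, 6),
  Factor = c("male", "female", "male", "female", NA, "male")
)

res <- df_small %>%
  group_by(Group) %>%
  summarise(
    Value    = mean(Value),
    # The comparison yields TRUE/FALSE/NA; na.rm = TRUE drops the NA before summing
    n_male   = sum(Factor == "male", na.rm = TRUE),
    n_female = sum(Factor == "female", na.rm = TRUE)
  )

print(res)
```

Here group A has two "male" rows and group B has one. Without na.rm = TRUE, the counts for group B would come back as NA because of the missing Factor value.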