Tags: r, group-by, dplyr, unique, summarize

Finding percentage using group_by and summarise in R through dplyr


I have some data about people's academic backgrounds. The user information can contain duplicates, so I use n_distinct(userID) to count each user only once and store the result as Unique_Elements.

demographics %>%
  group_by(Academic_Level) %>%
  summarise(Unique_Elements = n_distinct(userID))

The output looks something like:

Academic_Level     Unique_Elements
Freshman           22
Sophomore          76
Junior             87
Senior             56
NA                 10  # Non-responding candidates

The total value of N = 253.

How should I edit the above code to also get the percentage for each level?

I have seen the following two related posts, but they do not help me. Any advice on this would be highly appreciated. Thanks!

Relative frequencies / proportions with dplyr

Finding percentage in a sub-group using group_by and summarise


Solution

  • We can add a mutate() step after summarise() that divides each group's count by the sum of all the counts:

    demographics %>%
      group_by(Academic_Level) %>%
      summarise(Unique_Elements = n_distinct(userID)) %>%
      mutate(perc = 100 * Unique_Elements/sum(Unique_Elements))
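
To try this end to end, here is a minimal self-contained sketch. The toy `demographics` data frame below is made up for illustration (it is not the asker's data); it only assumes the two columns named in the question, `userID` and `Academic_Level`, with some duplicated users and one non-responding (NA) level.

```r
library(dplyr)

# Hypothetical toy data: users 1 and 4 appear twice, user 6 did not respond
demographics <- data.frame(
  userID = c(1, 1, 2, 3, 4, 4, 5, 6),
  Academic_Level = c("Freshman", "Freshman", "Freshman", "Sophomore",
                     "Sophomore", "Sophomore", "Junior", NA)
)

result <- demographics %>%
  group_by(Academic_Level) %>%
  # one row per level, counting each userID only once
  summarise(Unique_Elements = n_distinct(userID)) %>%
  # summarise() has dropped the grouping, so sum() runs over all levels
  mutate(perc = 100 * Unique_Elements / sum(Unique_Elements))

print(result)
```

Note that the denominator is the sum of the per-level counts, so the NA row is included in the total, and a user who appeared under two different levels would be counted once in each. If that is not what you want, you could filter those rows out before `group_by()`.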