Tags: r, sum, aggregate, factors, summarize

How can I sum a categorical variable and aggregate by factor?


To be more specific, I have a dataset with the following two columns:

  SOCCERTEAM   PLAYERS
  BARCA        MESSI
  BARCA        MESSI
  BARCA        MESSI
  BARCA        XAVI
  RM           CR
  RM           CR
  RM           PEPE
  RM           HIQUAIN

(This is just an illustrative example, not the real dataset.)

What I want to know is: how can I find the top 5 teams according to how many different players they used? A team can use the same player more than once, so each player should be counted only once per team, and simply looking at the factor levels is not an option. For example, if BARCA used 15 players and RM used 14, then BARCA ranks first.


Solution

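  • Using dplyr, count the distinct players per team and keep the top 5:

    library(dplyr)

    df %>%
      group_by(SOCCERTEAM) %>%
      # count each player only once per team
      summarize(rank = n_distinct(PLAYERS)) %>%
      # keep the 5 teams with the highest counts (ties are kept)
      top_n(5, wt = rank) %>%
      # top_n() does not sort, so order the result explicitly
      arrange(desc(rank))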
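
To try the approach on something concrete, here is a minimal, self-contained sketch. The data frame `df` is made up from the example rows in the question (it is not the real dataset), and the column name `n_players` is just an illustrative choice. It also uses `slice_max()`, which newer dplyr versions (1.0.0 and later) offer in place of `top_n()`; treat that swap as an assumption about your installed version.

```r
library(dplyr)

# Hypothetical sample data mirroring the example rows in the question
df <- data.frame(
  SOCCERTEAM = c("BARCA", "BARCA", "BARCA", "BARCA", "RM", "RM", "RM", "RM"),
  PLAYERS    = c("MESSI", "MESSI", "MESSI", "XAVI", "CR", "CR", "PEPE", "HIQUAIN")
)

top_teams <- df %>%
  group_by(SOCCERTEAM) %>%
  # n_distinct() counts each player once, however often he appears
  summarize(n_players = n_distinct(PLAYERS)) %>%
  # keep the (up to) 5 teams with the most distinct players; ties are kept
  slice_max(n_players, n = 5)

print(top_teams)
```

With this toy data, RM has 3 distinct players and BARCA has 2, even though both teams have 4 rows. That is the difference between counting rows with `n()` and counting unique players with `n_distinct()`.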