I have a dataframe 'df' where I want to summarize how many times each 'user' has a higher 'total' value for each head-to-head 'game'. My data frame looks like this:
game | user | total |
---|---|---|
1 | L | 55 |
1 | J | 60 |
2 | L | 64 |
2 | J | 77 |
3 | L | 90 |
3 | J | 67 |
4 | L | 98 |
4 | J | 88 |
5 | L | 71 |
5 | J | 92 |
The summary would state that L had a higer total in 2 games and J had a higher total in 3 games.
Thank you!
Assuming df is your dataframe the following should give you the long form summary.
df %>%
arrange(game,desc(total)) %>% #we sort descending to ensure winner row is first.
group_by(game) %>% # we group the rows per game, this allows for winner row to be first in each group
slice_head(n=1)%>% #get first row in each group i.e winner row
ungroup()
Output:
# A tibble: 5 × 3
game user total
<int> <chr> <int>
1 1 J 60
2 2 J 77
3 3 L 90
4 4 L 98
5 5 J 92
If you just want the user wise summary add count to the code as follows:
df %>%
arrange(game,desc(total)) %>% #we sort descending to ensure winner row is first.
group_by(game) %>% # we group the rows per game, this allows for winner row to be first in each group
slice_head(n=1) %>% #get first row in each group i.e winner row
ungroup() %>%
count(user)
Output:
# A tibble: 2 × 2
user n
<chr> <int>
1 J 3
2 L 2