Every time the player changes, I need a subtotal of how many strikeouts he had in his career.
I tried the code below, but it did not produce subtotals.
player <- c('acostma01', 'acostma01', 'acostma01', 'adkinjo01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01')
year <- c(2010,2011,2012,2007,1985,1986,1987,1988,1989)
games <- c(41,44,45,1,21,28,18,11,36)
strikeouts <- c(42,46,46,0,74,104,77,16,80)
bb_data <- data.frame(player, year, games, strikeouts, stringsAsFactors = FALSE)
Here is the code that did not work:
mets <- select(bb_data, player, year, games, strikeouts) %>%
group_by(player, year) %>%
colSums(SO)
Here is the output I would like to get:
player games strikeouts
acostma01 130 134
adkinjo01 1 0
aguilri01 0 351
Grand Total 485
Here is what I was getting (tail of data):
player team year games strikouts
<chr> <chr> <int> <int> <int>
swarzan01 NYN 2018 29 31
syndeno01 NYN 2018 25 155
vargaja01 NYN 2018 20 84
wahlbo01 NYN 2018 7 7
wheelza01 NYN 2018 29 179
zamorda01 NYN 2018 16 16
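Grouping by both player and year leaves one group per season, so nothing gets collapsed into a career total, and colSums() does not respect dplyr groups. Group by player only and summarise instead. You could do: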
library(tidyverse)
bb_data %>%
group_by(player) %>%
summarise_at(vars(games, strikeouts), sum) %>%
add_row(player = 'Grand Total', games = NA, strikeouts = sum(.$strikeouts))
This would give you:
# A tibble: 4 x 3
player games strikeouts
<chr> <dbl> <dbl>
1 acostma01 130 134
2 adkinjo01 1 0
3 aguilri01 114 351
4 Grand Total NA 485
This is consistent with your expected output for all values except games for aguilri01. I presume that is a typo, but let me know if this is incorrect.
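As of dplyr 1.0.0, summarise_at() is superseded by across(). A minimal sketch of the same per-player totals written with across(), assuming dplyr 1.0.0 or later is installed. The data frame is rebuilt from the question so the block runs on its own, and the names career and career_with_total are only illustrative:

```r
library(dplyr)

# Sample data from the question
player <- c('acostma01', 'acostma01', 'acostma01', 'adkinjo01', 'aguilri01',
            'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01')
year <- c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989)
games <- c(41, 44, 45, 1, 21, 28, 18, 11, 36)
strikeouts <- c(42, 46, 46, 0, 74, 104, 77, 16, 80)
bb_data <- data.frame(player, year, games, strikeouts, stringsAsFactors = FALSE)

# One row per player, with games and strikeouts summed over all seasons
career <- bb_data %>%
  group_by(player) %>%
  summarise(across(c(games, strikeouts), sum))

# Append the grand total row; games is left as NA, as in the answer above
career_with_total <- career %>%
  add_row(player = 'Grand Total', games = NA,
          strikeouts = sum(career$strikeouts))

print(career_with_total)
```

Storing the per-player summary in career before adding the total row avoids the `.$strikeouts` placeholder, which some people find harder to read.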
For descending order, you could do:
bb_data %>%
group_by(player) %>%
summarise_at(vars(games, strikeouts), sum) %>%
arrange(-strikeouts) %>%
add_row(player = 'Grand Total', games = NA, strikeouts = sum(.$strikeouts))
Output:
# A tibble: 4 x 3
player games strikeouts
<chr> <dbl> <dbl>
1 aguilri01 114 351
2 acostma01 130 134
3 adkinjo01 1 0
4 Grand Total NA 485
To also include the seasons played, you can try:
bb_data %>%
group_by(player) %>%
mutate(seasons_played = n_distinct(year)) %>%
group_by(player, seasons_played) %>%
summarise_at(vars(games, strikeouts), sum) %>%
arrange(-strikeouts) %>%
ungroup() %>%
add_row(player = 'Grand Total', games = NA, seasons_played = NA, strikeouts = sum(.$strikeouts))
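The season count can also be computed inside a single summarise() call, which avoids the mutate() and second group_by(). This is a sketch of an equivalent approach, not the only way to do it. It rebuilds the question's data so it runs on its own, and the names totals and result are only illustrative:

```r
library(dplyr)

# Sample data from the question
player <- c('acostma01', 'acostma01', 'acostma01', 'adkinjo01', 'aguilri01',
            'aguilri01', 'aguilri01', 'aguilri01', 'aguilri01')
year <- c(2010, 2011, 2012, 2007, 1985, 1986, 1987, 1988, 1989)
games <- c(41, 44, 45, 1, 21, 28, 18, 11, 36)
strikeouts <- c(42, 46, 46, 0, 74, 104, 77, 16, 80)
bb_data <- data.frame(player, year, games, strikeouts, stringsAsFactors = FALSE)

# Count distinct seasons and sum the counting stats in one pass per player
totals <- bb_data %>%
  group_by(player) %>%
  summarise(
    seasons_played = n_distinct(year),
    games = sum(games),
    strikeouts = sum(strikeouts)
  ) %>%
  arrange(desc(strikeouts))

# Append the grand total row; only strikeouts gets a total here
result <- totals %>%
  add_row(player = 'Grand Total', seasons_played = NA, games = NA,
          strikeouts = sum(totals$strikeouts))

print(result)
```

With the sample data, seasons_played comes out as 5 for aguilri01, 3 for acostma01 and 1 for adkinjo01.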