Given the data.frame
below, how can I obtain a genre history (by player) for each game deemed interesting?
player <- c(rep (1,3), rep(2,4))
game <- c(seq(1:3), seq(1:4))
genre <- c("JnR", "Ego", "RPG", "RPG", "Sports", "JnR", "Sim")
interesting <- c("no", rep(c("no","yes"),3))
playerhist <- data.frame (player, game, genre, interesting)
> playerhist
player game genre interesting
1 1 1 JnR no
2 1 2 Ego no
3 1 3 RPG yes
4 2 1 RPG no
5 2 2 Sports yes
6 2 3 JnR no
7 2 4 Sim yes
Desired output:
player game genre_history
1 1 1 JnR
2 1 2 Ego
3 2 1 RPG
4 2 1 RPG
5 2 2 Sports
6 2 3 JnR
So for each interesting game I'd like to include all the previous rows for the corresponding player. Including information from the 'interesting' column is not required as shown in the desired output, but if solutions include it anyway, that would also be fine. This column would read c('no','no','no','no','yes','no')
.
A solution using dplyr
would be preferred.
I finally found the solution. Here it is for those trying to solve a similar problem:
library(dplyr)
#-------------------------------
# #1 number of interesting games for each player
# #2 splitting df into list of dfs for each player
# #3 copy & append df for -> max(.$int_game) times
# #4 numbering of dfs using a new number every time game == 1
# #5 splitting df into list of dfs for each player & df_num
# #6 count occurrences of interesting == 'yes'
# #7 set cutpoint to slice determined by cut1 == df_num for the first time
# #8 slice from row 1:row at which cut2 == 1 and omit the last row
#-------------------------------
gamehist <- playerhist %>%
group_by (player) %>%
mutate (int_game = cumsum(str_detect(interesting, "yes"))) %>% #1
ungroup () %>%
base::split (., .$player, drop = FALSE) %>% #2
lapply (., function (df) {
df %>%
.[rep(1:nrow(.), max(.$int_game)),] %>% #3
mutate (df_num = cumsum(.$game == 1)) %>% #4
return ()
}) %>%
bind_rows () %>%
base::split (., list(.$player, .$df_num), drop = TRUE) %>% #5
lapply (., function (df2) {
df2 %>%
mutate (cut1 = cumsum (str_detect(interesting, "yes")), #6
cut2 = if_else (cut1 == df_num, 1, 0)) %>% #7
slice (1:which(cut2 == 1)-1) %>% #8
return ()
}) %>%
bind_rows () %>%
select (player, game, genre)
gamehist
# A tibble: 6 × 3
player game genre
<dbl> <int> <chr>
1 1 1 JnR
2 1 2 Ego
3 2 1 RPG
4 2 1 RPG
5 2 2 Sports
6 2 3 JnR