I am trying to compute stats for basketball lineups given the play-by-play data. In particular I have a dataframe which more or less looks like this
game_id | action_id | team_id | event | p1 | p2 | p3 | p4 | p5 |
---|---|---|---|---|---|---|---|---|
G1 | 1 | H | 3PA | A | B | C | D | E |
G1 | 2 | V | REB | A | B | C | D | E |
I would like to make a sort of pivot_longer
but instead of having a column with the names (p1:p5) and the values (A:E), I am looking for a function which allows me to find all combinations of k players ( 1 <= k <= 5 ). For instance, if k = 2, it would be something like:
game_id | action_id | team_id | event | p_comb1 | p_comb2 |
---|---|---|---|---|---|
G1 | 1 | H | 3PA | A | B |
G1 | 1 | H | 3PA | A | C |
G1 | 1 | H | 3PA | A | D |
G1 | 1 | H | 3PA | A | E |
G1 | 1 | H | 3PA | B | C |
G1 | 1 | H | 3PA | B | D |
G1 | 1 | H | 3PA | B | E |
G1 | 1 | H | 3PA | C | D |
G1 | 1 | H | 3PA | C | E |
G1 | 1 | H | 3PA | D | E |
G1 | 2 | V | REB | A | B |
G1 | 2 | V | REB | A | C |
G1 | 2 | V | REB | A | D |
G1 | 2 | V | REB | A | E |
G1 | 2 | V | REB | B | C |
G1 | 2 | V | REB | B | D |
G1 | 2 | V | REB | B | E |
G1 | 2 | V | REB | C | D |
G1 | 2 | V | REB | C | E |
G1 | 2 | V | REB | D | E |
In particular my requirements would be to:
lapply
and for cycles where possibleDoes a function like even this exists?
I tried with self joining, which is in fact a possible solution, even if it doesn't look very elegant.
You could do:
df %>%
mutate(comb = apply(across(num_range('p', 1:5)), 1, function(x) as.data.frame(t(combn(x, 2))))) %>%
unnest_longer(comb) %>%
unpack(comb)