I am looking for a way to change the structure of a data frame so that I can run a beta regression on it afterwards. The data frame currently looks like this:
rating playerID
0.6 a1
NA b2
0.9 a4
NA b5
0 a3
NA b2
I need to make it look like this:
rating a1 a2 a3 a4 a5 b1 b2 b3 b4 b5
0.6 1 0 0 0 0 0 -1 0 0 0
0.9 0 0 0 1 0 0 0 0 0 -1
0 0 0 1 0 0 0 -1 0 0 0
It is not necessary to have -1 for the "bX" variables (1 works as well). The idea is to take pairs (player "aX" and player "bX") and encode them as dummy variables, with the rating of player "aX" on the same line.
Thank you for any ideas and input.
Here's a mostly base R solution using table (dplyr::lag is only used to shift the ratings down one row), assuming the factor levels a1 to b5 are already present in playerID:
table(subset(DF, grepl("a", playerID))) -
table(subset(within(DF, rating <- dplyr::lag(rating)), grepl("b", playerID)))
#> playerID
#> rating a1 a2 a3 a4 a5 b1 b2 b3 b4 b5
#> 0 0 0 1 0 0 0 -1 0 0 0
#> 0.6 1 0 0 0 0 0 -1 0 0 0
#> 0.9 0 0 0 1 0 0 0 0 0 -1
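
Note that table sorts the rows by rating, and if two pairs share the same rating they are merged into a single row of counts. As an alternative, here is a minimal sketch that keeps one row per pair in the original order. It assumes the rows strictly alternate (an "aX" row followed by its "bX" opponent), and it rebuilds the example data with all ten factor levels set up front. The names lv, a_rows, b_rows, X and wide are my own, not from the question.

```r
# Example data from the question. Assumption: playerID is a factor that
# already carries all ten levels, even though only some of them occur.
lv <- c(paste0("a", 1:5), paste0("b", 1:5))
DF <- data.frame(
  rating   = c(0.6, NA, 0.9, NA, 0, NA),
  playerID = factor(c("a1", "b2", "a4", "b5", "a3", "b2"), levels = lv)
)

# Assumption: odd rows hold the "aX" player, even rows the "bX" opponent.
a_rows <- seq(1, nrow(DF), by = 2)
b_rows <- a_rows + 1

# One row per pair, one column per player, filled via (row, column) indexing.
X <- matrix(0, nrow = length(a_rows), ncol = length(lv),
            dimnames = list(NULL, lv))
X[cbind(seq_along(a_rows), as.integer(DF$playerID[a_rows]))] <- 1
X[cbind(seq_along(b_rows), as.integer(DF$playerID[b_rows]))] <- -1

# Attach the rating of the "aX" player to each pair.
wide <- data.frame(rating = DF$rating[a_rows], X)
wide
```

Replace the -1 with 1 in the second assignment if you'd rather have 1 for the "bX" columns as well.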