I'm relatively new to R and need some help.
I want to generate a 2x2 contingency table and always compare two consecutive days as follow: you start with the first row and take the median of the values, in my example in the row 2019-02-11 it would be the value 15 from x4. Now all values higher than the median are "winners" and and values below are "losers".
In the second step you do the same for the row 2019-02-12 and if x1 for example is in both periods a "winner" it should count in a field "winner/winner" in the contingency table. Same for "loser/loser", "winner/loser", "loser/winner", and so on for all consecutive dates.
I have a data frame with around 200 columns and therefore looking for a effective way to do this.
My code looks like this:
set.seed(123)
d <- data.frame(Time = rep(seq.Date( Sys.Date(), length=30, by="day" )),
x1 = rep(sample(10:30, 10), 3),
x2 = rep(sample(10:30, 10), 3),
x3 = rep(sample(10:30, 10), 3),
x4 = rep(sample(10:30, 10), 3),
x5 = rep(sample(10:30, 10), 3))
Thanks a lot.
With a bit of arithmetic I think we can do this quite efficiently.
First we find the winners and losers and assign them a 0
or a 1
accordingly. Next we can do columnwise differencing to find out wether two consecutive days were lose/win (1) or win lose (-1). As both win/win and lose/lose will result in a difference of zero, we'l also have to check what the first value was. The rest is just recoding and assembling.
d <- structure(list(Time=structure(17942:17947, class="Date"),
x1=c(NA, NA, 17L, 29L, 27L, 10L), x2=c(30L, 19L, 22L, 20L, 11L,
24L), x3=c(NA, 23L, 22L, 27L, 21L, 26L), x4=c(30L, 28L, 23L,
24L, 10L, 17L), x5=c(12L, 18L, 17L, 16L, 30L, 26L)),
row.names=c(NA, 6L), class="data.frame")
x <- t(apply(d[,-1], 1, function(x) x > median(x, na.rm=TRUE)))
nr <- nrow(x)
dx <- diff(x)
lw <- (dx == 1)*1
wl <- (dx == -1)*2
dd <- (dx == 0)
ww <- (dd & x[-nr,] == 1)*3
ll <- (dd & x[-nr,] == 0)*4
tab <- c("lose/win", "win/lose", "win/win", "lose/lose")[lw + wl + ww + ll]
d0 <- d
d0[-1,-1] <- tab
d0
# Time x1 x2 x3 x4 x5
# 1 2019-02-15 <NA> 30 <NA> 30 12
# 2 2019-02-16 <NA> lose/lose <NA> lose/win lose/lose
# 3 2019-02-17 <NA> lose/lose win/lose win/win lose/lose
# 4 2019-02-18 lose/win lose/lose lose/win win/lose lose/lose
# 5 2019-02-19 win/win lose/lose win/lose lose/lose lose/win
# 6 2019-02-20 win/lose lose/lose lose/win lose/lose win/win