I have time series data stored as a data.table, and for each column (observation point) I want to count how often each value occurs within a sliding window (width 30). I tried to use rle(sort(x)) inside rollapply to count the values, but it's not working.
For example, if I have a table like the one below:
dt <- data.frame(v1=c(1,0,1,4,4,4,4,4),v2=c(1,1,1,4,3,3,3,3),
v3=c(0,1,1,3,3,3,3,2),v4=c(1,1,0,3,3,3,3,3),
v5=c(1,1,1,5,5,5,5,5))
I tried this:
rollapply(dt, 3, function(x) {rle(sort(x))$values; rle(sort(x))$length})
but the result just doesn't make sense. Please give me some direction.
Solution 1
Assuming the objective is to get rolling counts over windows of 3 values, try the following:
library(zoo)  # provides rollapply
m <- as.matrix(dt)
levs <- sort(unique(c(m)))
f <- function(x) table(factor(x, levs))
r <- rollapply(m, 3, f)
Here levs is 0, 1, ..., 5, so each application of the function returns a vector of length 6 holding the counts of the 0's, 1's, ..., 5's. There are 5 input columns, so applying such a function to each column gives 5 * 6 = 30 columns of output.
Note that rollapply works with matrices or zoo objects, not data frames, so we converted dt to a matrix. Also, to ensure that each function application outputs a vector of the same length, we convert each input to a factor with the same levels.
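To see why the common set of levels matters, here is a minimal base R sketch. The windows w1 and w2 are simply the first and fourth windows of v1, typed in by hand for illustration, and levs is hard-coded to 0:5 rather than computed from the data. It compares the result lengths of the rle approach from the question with those of table(factor(x, levs)):

```r
# Two windows of width 3 taken from column v1 of the example data
v1 <- c(1, 0, 1, 4, 4, 4, 4, 4)
w1 <- v1[1:3]   # 1 0 1
w2 <- v1[4:6]   # 4 4 4

# rle(sort(x))$lengths has one entry per distinct value present in the window,
# so the length of the result changes from window to window
len_rle <- c(length(rle(sort(w1))$lengths), length(rle(sort(w2))$lengths))

# With a shared set of levels, every window yields a vector of the same length
levs <- 0:5   # assumption: hard-coded here instead of sort(unique(c(m)))
f <- function(x) table(factor(x, levs))
len_tab <- c(length(f(w1)), length(f(w2)))

print(len_rle)  # the two lengths differ
print(len_tab)  # both equal length(levs)
```

Results of varying length cannot be stacked into the rows of one matrix, which is likely why the original attempt gave confusing output.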
Note that:
ra <- array(r, c(6, 6, 5))
gives a 3d array in which ra[,,i] is the matrix formed by rollapply(dt[, i], 3, f). That is, in the matrix ra[,,i] there is a row for each application of f on column i, and the columns in that row count the number of 0's, 1's, ..., 5's.
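The reshape works because array() fills its result in column-major order. The following is a rough base R sketch of that mapping, not the zoo implementation: count_windows is a hypothetical helper standing in for rollapply, and r_like is assembled under the assumption described above, namely that the 30 columns come as 5 consecutive blocks of 6, one block per input column.

```r
dt <- data.frame(v1 = c(1,0,1,4,4,4,4,4), v2 = c(1,1,1,4,3,3,3,3),
                 v3 = c(0,1,1,3,3,3,3,2), v4 = c(1,1,0,3,3,3,3,3),
                 v5 = c(1,1,1,5,5,5,5,5))
levs <- sort(unique(unlist(dt)))   # 0, 1, ..., 5
width <- 3

# Hypothetical stand-in for rollapply(x, width, f):
# returns one row per window and one column per level
count_windows <- function(x, width, levs) {
  starts <- seq_len(length(x) - width + 1)
  out <- t(vapply(starts,
                  function(i) tabulate(match(x[i:(i + width - 1)], levs),
                                       nbins = length(levs)),
                  integer(length(levs))))
  colnames(out) <- levs
  out
}

blocks <- lapply(dt, count_windows, width = width, levs = levs)
r_like <- do.call(cbind, blocks)   # 6 windows x 30 columns
ra <- array(r_like, c(6, 6, 5))    # windows x levels x input columns

print(ra[, , 1])                   # counts for v1, one row per window
```

For other data the dimensions would be c(number of windows, length(levs), number of input columns), where the number of windows is the number of rows minus the width plus 1.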
Another possibility is the following, which gives the same 5 matrices (one per input column) as components of the resulting list:
lapply(dt, rollapply, 3, f)
For example, consider the following. Row 1 of the output says that the first application of f on dt[,1] has one 0, two 1s and no other values. This can also be obtained from ra[,,1] or from lapply(dt, rollapply, 3, f)[[1]]:
> rollapply(dt[, 1], 3, f)
     0 1 2 3 4 5
[1,] 1 2 0 0 0 0 <- dt[1:3,1] has 1 zero and 2 ones
[2,] 1 1 0 0 1 0 <- dt[2:4,1] has 1 zero and 1 one and 1 four, etc.
[3,] 0 1 0 0 2 0
[4,] 0 0 0 0 3 0
[5,] 0 0 0 0 3 0
[6,] 0 0 0 0 3 0
Solution 2
This alternative returns the counts for each window as a single character string. Looking at cell 1,1 of the output below, there is one 0 and two 1s in dt[1:3,1]. Looking at cell 2,1, there is one 0, one 1 and one 4 in dt[2:4,1], etc.
> g <- function(x) { tab <- table(x); toString(paste(names(tab), tab, sep = ":")) }
> sapply(dt, rollapply, 3, g) # or rollapply(m, 3, g) where m was defined in solution 1
     v1              v2              v3         v4              v5
[1,] "0:1, 1:2"      "1:3"           "0:1, 1:2" "0:1, 1:2"      "1:3"
[2,] "0:1, 1:1, 4:1" "1:2, 4:1"      "1:2, 3:1" "0:1, 1:1, 3:1" "1:2, 5:1"
[3,] "1:1, 4:2"      "1:1, 3:1, 4:1" "1:1, 3:2" "0:1, 3:2"      "1:1, 5:2"
[4,] "4:3"           "3:2, 4:1"      "3:3"      "3:3"           "5:3"
[5,] "4:3"           "3:3"           "3:3"      "3:3"           "5:3"
[6,] "4:3"           "3:3"           "2:1, 3:2" "3:3"           "5:3"
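As a quick check of how g builds each cell, this small base R sketch applies it by hand to the first two windows of v1. The windows are indexed directly rather than produced by rollapply, and cell_1 and cell_2 are names chosen only for this illustration:

```r
g <- function(x) { tab <- table(x); toString(paste(names(tab), tab, sep = ":")) }

v1 <- c(1, 0, 1, 4, 4, 4, 4, 4)
cell_1 <- g(v1[1:3])   # window 1 0 1
cell_2 <- g(v1[2:4])   # window 0 1 4

print(cell_1)   # "0:1, 1:2", matching cell 1,1 of the output
print(cell_2)   # "0:1, 1:1, 4:1", matching cell 2,1 of the output
```

Only the values present in a window appear in its string. That is acceptable here because each application of g returns exactly one character value, so the results still combine into a matrix.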
ADDED: Additional discussion and solution 2.