I would like to compute a variant of rolling medians on my dataset that builds the subsets not by going k
observations forward and back, but by taking into account all observations that fall within a given time window.
A straightforward implementation could look like this:
windowwidth <- 30
# median of all values of mydata whose time lies within windowwidth / 2 of x
median.window <- function(x) median(mydata[time <= x + windowwidth / 2 & time >= x - windowwidth / 2])
vapply(time, median.window, numeric(1))
However, as you can imagine, this is not very efficient for large datasets. Do you see a possible improvement, or a package providing an optimized implementation? You cannot assume that the observations are evenly distributed over time.
zoo provides rollmedian, but this function only lets you choose the window by observation count, not by time.
Ok, try this:
Rgames: timeseq<-1:5
Rgames: winmat <- outer(timeseq,timeseq,FUN=function(x,y) y>=x &y<=x+2)
Rgames: winmat
[,1] [,2] [,3] [,4] [,5]
[1,] TRUE TRUE TRUE FALSE FALSE
[2,] FALSE TRUE TRUE TRUE FALSE
[3,] FALSE FALSE TRUE TRUE TRUE
[4,] FALSE FALSE FALSE TRUE TRUE
[5,] FALSE FALSE FALSE FALSE TRUE
Rgames: winmat %*% timeseq
[,1]
[1,] 6
[2,] 9
[3,] 12
[4,] 9
[5,] 5
Replace the comparison inside FUN with your own window width and I think you'll be all set.
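For the centered window from the question, the same outer() call might look roughly like the sketch below. The names obs_time and half_width and the sample times are made up for illustration; only the outer() pattern comes from the transcript above.

```r
# Hypothetical, irregularly spaced observation times (illustrative data only)
obs_time <- c(1, 2, 10, 11, 40)
half_width <- 15  # half of the 30-unit window width used in the question

# winmat[i, j] is TRUE when observation j lies within half_width of observation i
winmat <- outer(obs_time, obs_time, FUN = function(x, y) abs(y - x) <= half_width)

# Number of observations that fall into each window
window_counts <- rowSums(winmat)
window_counts
```

Note that the matrix has one row and one column per observation, so memory grows quadratically with the number of observations.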
Edit: In response to Thilo's query, it looks like in the general case you should use apply. Given the stuff above, call your observation values "timval", as in
Rgames: timval<-c(3,4,2,6,1)
Rgames: valmat<-timval*t(winmat)
Rgames: valmat
[,1] [,2] [,3] [,4] [,5]
[1,] 3 0 0 0 0
[2,] 4 4 0 0 0
[3,] 2 2 2 0 0
[4,] 0 6 6 6 0
[5,] 0 0 1 1 1
Rgames: apply(valmat,2,median)
[1] 2 2 1 0 0
Edit again: clearly I was asleep there: nobody wants a median based on all those zeroes. I should think more before posting. Add this:
valmat[valmat == 0] <- NA
apply(valmat, 2, median, na.rm = TRUE)
[1] 3.0 4.0 2.0 3.5 1.0
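One caveat with turning zeros into NA is that an observation whose value really is 0 would be dropped too. A possible way around that, sketched here with the same toy data, is to use each row of winmat as a logical index into timval instead of multiplying. The name win_median is just an illustrative choice.

```r
timeseq <- 1:5
timval <- c(3, 4, 2, 6, 1)
winmat <- outer(timeseq, timeseq, FUN = function(x, y) y >= x & y <= x + 2)

# Row i of winmat marks the observations inside window i. Subsetting timval
# with that logical row keeps genuine zeros and needs no NA bookkeeping.
win_median <- apply(winmat, 1, function(in_window) median(timval[in_window]))
win_median
```

On this data the result should agree with the NA-based version above.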
And I'm sure there's a cleaner way of 'building' valmat than this, but the final result is the "filter matrix" you want to apply any function to.
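For large datasets the full n-by-n matrix may not fit in memory. One possible alternative, offered as a sketch rather than a tuned implementation, is to sort by time and look up each window's first and last index with findInterval(). The function name time_window_median and its arguments are hypothetical, and the left.open argument assumes R 3.3.0 or later.

```r
# Median of `value` over all observations whose time lies within
# width / 2 of each observation's own time (window bounds inclusive).
time_window_median <- function(obs_time, value, width) {
  ord <- order(obs_time)
  t_sorted <- obs_time[ord]
  v_sorted <- value[ord]
  half <- width / 2

  # lo[i]: first sorted index with time >= t_i - half
  lo <- findInterval(t_sorted - half, t_sorted, left.open = TRUE) + 1L
  # hi[i]: last sorted index with time <= t_i + half
  hi <- findInterval(t_sorted + half, t_sorted)

  med <- vapply(seq_along(t_sorted),
                function(i) median(v_sorted[lo[i]:hi[i]]),
                numeric(1))

  # Return the medians in the original (unsorted) order of the input
  med[order(ord)]
}

time_window_median(1:5, c(3, 4, 2, 6, 1), width = 2)
```

This still computes one median per observation, so the work grows with the number of points per window, but it never materializes the filter matrix.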