I have 3 columns of data, generated as below; the first 10 rows are shown:
set.seed(101)
inputx <- rnorm(1000,mean = 3,sd=2)
inputy <- rnorm(1000,mean = 2,sd=1)
inputz <- rnorm(1000,mean = 1,sd=3)
example <- cbind(inputx,inputy,inputz)
> head(example,10)
        inputx      inputy     inputz
 [1,] 2.347927  2.50319581  4.4931430
 [2,] 4.104924 -0.09747067 -0.2836938
 [3,] 1.650112  1.90551542  0.9146087
 [4,] 3.428719  3.01454241  4.9332929
 [5,] 3.621538  1.92059955  2.4220865
 [6,] 5.347933  1.74487106  2.9122764
 [7,] 4.237580  2.78095054  7.8622898
 [8,] 2.774531  3.20741266 -1.5977934
 [9,] 4.834057  1.09214734 -0.5482315
[10,] 2.553481  0.59679215  0.5285020
My actual data has 10 variables in total, but for simplicity this example uses just 3.
For each ordered pair of columns listed below, I would like to calculate the ccf over a rolling window of size 4:
inputx,inputx
inputx,inputy
inputx,inputz
inputy,inputx
inputy,inputy
inputy,inputz
inputz,inputx
inputz,inputy
inputz,inputz
e.g.
ccf(example[1:4,1],example[1:4,2])
ccf(example[2:5,1],example[2:5,2])
.
.
.
ccf(example[7:10,1],example[7:10,2])
As you can see, the sample above only covers columns 1 and 2, but I hope to do this for every pair of columns using the rolling-window approach.
The rolling window itself is easily handled by the rollapply function: you specify the window size and set by.column to FALSE so that the function receives the whole window rather than being applied to each column separately.
If I wanted to roll-apply a function such as mean to each column, I could do that, as in the line below. It is the looping across columns for all combinations that I can't figure out.
test <- rollapply(example[,c(1,2)],4,mean, by.column=TRUE)
The ccf results should be stored by row, and the output should have dimensions 10x9, since there are 10 rows in the original data and 9 column pairings for the ccf function. The first 3 rows of the output will be NA because the rolling window has size 4. The values below are purely illustrative and are not the real output values.
output ->
      xx xy xz yx yy yz zx zy zz
 [1,] NA NA NA NA NA NA NA NA NA
 [2,] NA NA NA NA NA NA NA NA NA
 [3,] NA NA NA NA NA NA NA NA NA
 [4,] .1 .2 .3 .2 .8 .5 .3 .5 .9
 [5,] .1 .2 .3 .2 .8 .5 .3 .5 .9
 [6,] .1 .2 .3 .2 .8 .5 .3 .5 .9
 [7,] .1 .2 .3 .2 .8 .5 .3 .5 .9
 [8,] .1 .2 .3 .2 .8 .5 .3 .5 .9
 [9,] .1 .2 .3 .2 .8 .5 .3 .5 .9
[10,] .1 .2 .3 .2 .8 .5 .3 .5 .9
I would appreciate a little help with applying the loop across all pairings. I think that once I have that, I can wrap it in rollapply to implement the sliding window.
1) ccf(x, y) of two 4-vectors x and y gives a 7-vector (one value for each lag from -3 to 3), so the output would have 3 * 3 * 7 = 63 columns, not 9 as stated in the question.
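As a quick check of that count, here is a minimal sketch using two made-up 4-element vectors (x, y, r, n_lags and lags are illustrative names, not part of the question's data). With the default lag.max, n_lags should come out as 7, matching the figure above.

```r
# Hypothetical 4-element inputs, used only to inspect the size of ccf's result.
set.seed(101)
x <- rnorm(4)
y <- rnorm(4)

# plot = FALSE returns the result object without drawing anything.
r <- ccf(x, y, plot = FALSE)

n_lags <- length(r$acf)   # number of cross-correlation values, one per lag
lags <- as.vector(r$lag)  # the lags that were evaluated
```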
In a comment the poster stated that another function could be substituted for ccf, so below we assume cov(x, y), since that outputs a scalar rather than a 7-vector and so would give a 10 x 9 output. In this particular case cov(cbind(x, y, z)) produces a 3x3 matrix which, when flattened, gives a 9-vector.
library(zoo)  # provides rollapply and rollapplyr
rollapplyr(head(example, 10), 4, function(x) c(cov(x)), fill = NA, by.column = FALSE)
giving the following 10x9 matrix:
           [,1]        [,2]       [,3]        [,4]      [,5]       [,6]       [,7]       [,8]      [,9]
 [1,]        NA          NA         NA          NA        NA         NA         NA         NA        NA
 [2,]        NA          NA         NA          NA        NA         NA         NA         NA        NA
 [3,]        NA          NA         NA          NA        NA         NA         NA         NA        NA
 [4,] 1.1990739 -0.72070179 -0.3951435 -0.72070179 1.8590569  3.1565993 -0.3951435  3.1565993  6.718376
 [5,] 1.1503463 -0.51712419  0.1548365 -0.51712419 1.6830055  2.6102211  0.1548365  2.6102211  5.058550
 [6,] 2.2854029 -0.12857123  1.1658204 -0.12857123 0.3413027  0.7821381  1.1658204  0.7821381  2.753662
 [7,] 0.7473036 -0.31336885 -0.2743693 -0.31336885 0.3923239  1.1959920 -0.2743693  1.1959920  6.109035
 [8,] 1.1727627 -0.53344663  2.2960862 -0.53344663 0.4851109 -0.5067012  2.2960862 -0.5067012 15.027672
 [9,] 1.2381071 -0.88053417  1.5728089 -0.88053417 0.9289009  0.7283704  1.5728089  0.7283704 18.179175
[10,] 1.2353345 -0.05021654  1.7008923 -0.05021654 1.6116281  1.4902571  1.7008923  1.4902571 18.399713
2) Alternatively, this loops over every column pair explicitly and gives the same result:
k <- ncol(example)
g <- expand.grid(1:k, 1:k)
Cov <- function(x) apply(g, 1, function(ix) cov(x[, ix[1]], x[, ix[2]]))
rollapplyr(head(example, 10), 4, Cov, by.column = FALSE, fill = NA)
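If the original ccf is wanted after all, the same pair grid can probably be reused. The following is only a sketch under that assumption: Ccf and res63 are names made up here, and the data is regenerated so the block runs on its own. Each window contributes 7 values per pair, so the result should be 10 x 63 rather than 10 x 9.

```r
library(zoo)  # provides rollapplyr

# Recreate the question's data so this block is self-contained.
set.seed(101)
inputx <- rnorm(1000, mean = 3, sd = 2)
inputy <- rnorm(1000, mean = 2, sd = 1)
inputz <- rnorm(1000, mean = 1, sd = 3)
example <- cbind(inputx, inputy, inputz)

k <- ncol(example)
g <- expand.grid(1:k, 1:k)  # all ordered column pairs, first index varying fastest

# Hypothetical helper: for one window, compute ccf for every pair and
# flatten the resulting 7 x 9 matrix (lags by pairs) into a 63-vector.
Ccf <- function(x) {
  per_pair <- apply(g, 1, function(ix) {
    as.vector(ccf(x[, ix[1]], x[, ix[2]], plot = FALSE)$acf)
  })
  c(per_pair)
}

res63 <- rollapplyr(head(example, 10), 4, Ccf, by.column = FALSE, fill = NA)
```

The first 7 columns of res63 hold lags -3 to 3 for the first pair in g, the next 7 the second pair, and so on. Keeping only the lag-0 column of each block would bring the output back to 9 columns, if that is what is needed.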
Note that cov produces a symmetric matrix (the 3 lower triangular values equal the 3 upper triangular values), so we might wish to output only the diagonal and upper triangular part. If so, we could use upper.tri to subset it.
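A minimal sketch of that upper.tri subsetting, assuming the diagonal should be kept (CovUpper, res6, nm and idx are illustrative names introduced here):

```r
library(zoo)  # provides rollapplyr

# Recreate the question's data so this block is self-contained.
set.seed(101)
inputx <- rnorm(1000, mean = 3, sd = 2)
inputy <- rnorm(1000, mean = 2, sd = 1)
inputz <- rnorm(1000, mean = 1, sd = 3)
example <- cbind(inputx, inputy, inputz)

# Hypothetical helper: covariance matrix of one window, keeping only the
# diagonal and the upper triangle (6 of the 9 entries for 3 columns).
CovUpper <- function(x) {
  v <- cov(x)
  v[upper.tri(v, diag = TRUE)]
}

res6 <- rollapplyr(head(example, 10), 4, CovUpper, by.column = FALSE, fill = NA)

# Label the columns in the same column-major order that upper.tri extracts.
nm <- c("x", "y", "z")
idx <- which(upper.tri(diag(3), diag = TRUE), arr.ind = TRUE)
colnames(res6) <- paste0(nm[idx[, 1]], nm[idx[, 2]])
```

This gives a 10 x 6 matrix with columns ordered xx, xy, yy, xz, yz, zz, which is the order in which R walks the upper triangle column by column.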