
Correlation coefficients between two time series calculated over windows moved forward in time by n time units


Is there a package or simple code that plots (1) the correlation coefficients between two time series, calculated over windows moved forward in time by n time units, and (2) the p-value for each window?

library(zoo)

x = ts(rnorm(1:121), start = 1900, end = 2021)
y = ts(rnorm(1:121), start = 1900, end = 2021)
data = data.frame(x, y)

# 40-year moving window, shifted forward by 15 years at each step

rollapply(data, width=40, by = 15, 
          function(x) cor(x[,1],x[,2], method =  "pearson"),
          by.column=FALSE)

[1]  0.92514750  0.5545223 -0.207100231 -0.119647462 -0.125114237  0.041334073

It would be better to use Hmisc::rcorr, which also calculates p-values, but I didn't manage to integrate it into rollapply.

In the result here, the first coefficient (0.9251...) covers the 40 years 1900:1939, the second covers 1915:1954, and so on.

So the question is: is there a quick way to turn this result into a staircase graph of r and the p-value against time?

The output would look like:

Time   r      P
1900   0.92   0.000001
1901   0.92   0.000001
...    ...    ...
1915   0.55   0.00045
1916   0.55   0.00045

Solution

  • A few points:

    • there are 2021-1900+1 = 122 years from 1900 to 2021 inclusive, not 121
    • a width of 40 with a step of 15 does not tile 122 points evenly, so start at 1907 to make the last window end in 2021
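
The arithmetic behind the second point can be checked with a short sketch. The variable names below are illustrative and not part of the original answer:

```r
# Does a window of `width`, stepped by `by`, end exactly on the last year?
n_full <- length(1900:2021)            # 122 yearly observations
width  <- 40
by     <- 15

leftover  <- (n_full - width) %% by    # years left over after the last full step
start     <- 1900 + leftover           # drop that many years from the beginning
n_used    <- length(start:2021)        # observations actually used
n_windows <- (n_used - width) / by + 1 # number of windows that now fit exactly

c(leftover = leftover, start = start, n_used = n_used, n_windows = n_windows)
```

Dropping the leftover years from the start rather than the end keeps the most recent window aligned with 2021.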

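    rcorr returns a list of matrices (correlations r, pair counts n and p-values P), and we want the [1, 2] element of each, i.e. the entry for x versus y. rollapplyr with by = 15 only produces a value at the end of every window and fill = NA pads the remaining years, so na.locf with fromLast = TRUE carries each window's values back over the years before it. The input and output are both mts/ts series.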
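
To see what the extraction step does on a single window, here is a minimal sketch on a 40-row matrix. The seed and the object names are arbitrary. Selecting the components by name is a defensive assumption on my part, so that the sketch does not depend on what else the rcorr object may contain:

```r
library(Hmisc)

set.seed(1)                                  # arbitrary seed, for reproducibility only
m <- cbind(x = rnorm(40), y = rnorm(40))     # stands in for one 40-year window

rc <- rcorr(m)                               # Pearson correlation by default

# Each of r, n and P is a 2 x 2 matrix; element [1, 2] is x versus y
res <- sapply(rc[c("r", "n", "P")], `[`, 1, 2)
res
```

The result is a named numeric vector with the correlation, the number of pairs and the p-value, which is what rollapplyr collects into one row per window.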

    library(zoo)
    library(Hmisc)
    
    set.seed(123)
    tt <- ts(cbind(x = rnorm(115), y = rnorm(115)), start = 1907)
    
    na.locf(rollapplyr(tt, width=40, by = 15, 
              function(x) sapply(rcorr(x), `[`, 1, 2),
              by.column = FALSE, fill = NA), fromLast = TRUE)
    

    The above returns a series with the same number of rows as the input tt, with rcorr computed over the following ranges of years:

    rollapplyr(1907:2021, 40, by = 15, range)
    ##      [,1] [,2]
    ## [1,] 1907 1946
    ## [2,] 1922 1961
    ## [3,] 1937 1976
    ## [4,] 1952 1991
    ## [5,] 1967 2006
    ## [6,] 1982 2021
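
The question also asked for a Time / r / P table and a staircase graph. One possible sketch, reusing the data from the answer, is shown below. The helper `stat`, the data frame `out` and the two-panel layout are my own choices rather than part of the original answer, and columns are picked by position on the assumption that the helper returns r, n and P in that order:

```r
library(zoo)
library(Hmisc)

set.seed(123)
tt <- ts(cbind(x = rnorm(115), y = rnorm(115)), start = 1907)

# r, n and P for x versus y within one window (components selected by name)
stat <- function(w) sapply(rcorr(w)[c("r", "n", "P")], `[`, 1, 2)

# One row per year; each window's values are carried back over its years
res <- na.locf(rollapplyr(tt, width = 40, by = 15, FUN = stat,
                          by.column = FALSE, fill = NA), fromLast = TRUE)

# Table in the layout asked for: Time, r, P (columns 1 and 3 of res)
out <- data.frame(Time = as.numeric(time(tt)),
                  r    = as.numeric(res[, 1]),
                  P    = as.numeric(res[, 3]))
head(out)

# Staircase graphs: type = "s" draws step lines
op <- par(mfrow = c(2, 1), mar = c(4, 4, 1, 1))
plot(out$Time, out$r, type = "s", xlab = "Year", ylab = "r")
plot(out$Time, out$P, type = "s", xlab = "Year", ylab = "p-value")
par(op)
```

Because the values are filled backwards from the end of each window, the first step spans 1907 to 1946 and every later step spans 15 years.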