I'm hoping to calculate the coefficients (the intercepts in particular) of rolling regressions. There are many dependent variables; two of them (Y1 and Y2) are shown below. Each is regressed on the independent variables X1 and X2. Y1 and Y2 have NAs in different periods. The data is a monthly time series, and the rolling window is 6 months.
This is my code:
rr <- rollapply(df, width = 6,
FUN = function(z) coef(lm(Y1~ X1+X2,
data = as.data.frame(z))),
by.column = FALSE, align = "right")
However, this code has two problems:
1) it only handles one dependent variable (Y1 in this case) at a time,
2) it gives the same coefficients for every rolling regression. I assume the NAs messed up the rolling regression?
I would greatly appreciate it if someone could shed some light. Thanks.
This is sample data:
Date Y1 Y2 X1 X2
1/1/2009 NA 1.51 0.02 0.75
2/1/2009 NA -0.38 0.01 0.59
3/1/2009 NA 1.54 0.02 0.96
4/1/2009 NA 1.78 0.01 0.92
5/1/2009 NA 0.94 0.02 0.02
6/1/2009 NA 1.37 0.01 0.46
7/1/2009 NA 1.22 0.01 0.61
8/1/2009 NA 1.32 0.01 0.04
9/1/2009 NA 0.83 0.01 0.03
10/1/2009 NA 0.95 0.02 0.61
11/1/2009 NA 0.28 0.03 0.53
12/1/2009 NA 0.17 0.01 0.32
1/1/2010 1.71 NA 0.03 0.53
2/1/2010 0.39 NA 0.03 0.16
3/1/2010 0.11 NA 0.01 0.58
4/1/2010 1.25 NA 0.01 0.41
5/1/2010 0.57 NA 0.01 0.9
6/1/2010 0.48 NA 0.01 0.58
7/1/2010 0.16 NA 0.01 0.03
8/1/2010 0.37 NA 0.01 0.23
9/1/2010 0.31 NA 0.01 0.77
10/1/2010 0.63 NA 0.01 0.75
11/1/2010 0.61 NA 0.01 0.74
12/1/2010 0.91 NA 0.01 0.41
There are two problems:
1) Passing a data.frame to rollapply causes it to be converted to a matrix, and since one of the columns is a character column the result is a character matrix, whereas a numeric one is needed. Use df[-1] or the code shown below.
2) lm does not accept a dependent variable whose values are all NA. Check for that and return NA in such cases.
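Both problems can be reproduced in base R with a few lines. This is only a minimal sketch: the data frame d is made up for illustration, and as.matrix is used here as a stand-in for the matrix conversion that happens when a data.frame is handed to rollapply.

```r
# Tiny made-up data frame with the same layout as the question's df
d <- data.frame(Date = c("1/1/2009", "2/1/2009", "3/1/2009", "4/1/2009"),
                Y1 = NA_real_,
                X1 = c(0.02, 0.01, 0.02, 0.01),
                X2 = c(0.75, 0.59, 0.96, 0.92))

# Problem 1: a single character column makes the whole matrix character
m_all <- as.matrix(d)      # includes Date
m_num <- as.matrix(d[-1])  # Date dropped
is.character(m_all)  # TRUE
is.numeric(m_num)    # TRUE

# Problem 2: lm() stops with an error when every response value is NA
fit <- tryCatch(lm(Y1 ~ X1 + X2, data = d), error = function(e) e)
inherits(fit, "error")  # TRUE
```

Dropping the Date column fixes the first problem; the second needs an explicit all-NA check before calling lm.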
The code below also adds a few improvements:
- getCoef gets the coefficients given the data and the left and right sides of the formula
- roll does the actual rollapply using rollapplyr
- lapply the function roll over c("Y1", "Y2") to produce a list of 2 zoo objects
- optionally, apply fortify.zoo over L to give a list of data frames
Code:
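library(zoo)

# monthly zoo series indexed by year-month
z <- read.zoo(df, FUN = as.yearmon, format = "%m/%d/%Y")

# coefficients for one window; NA if the response is entirely missing
getCoef <- function(z, lhs, rhs) {
  if (all(is.na(z[, lhs]))) NA
  else coef(lm(paste(lhs, "~", rhs), z))
}

# right-aligned rolling regression of width 6 for one response
roll <- function(z, lhs, rhs = "X1 + X2") {
  rollapplyr(z, 6, getCoef, by.column = FALSE, coredata = FALSE, lhs = lhs, rhs = rhs)
}

ynames <- c("Y1", "Y2")
L <- lapply(ynames, roll, z = z)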
Optionally, for a list of data.frames:
Map(fortify.zoo, L)
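Since the question asks for the intercepts in particular, the same right-aligned window logic can also be written as a plain base R loop that keeps only the intercept. This is a sketch of an alternative, not the zoo-based method above: roll_intercept is a hypothetical helper, and the data frame d is simulated for illustration rather than taken from the question.

```r
# Made-up data: 10 rows, Y1 missing in the first 6 rows
set.seed(1)
d <- data.frame(X1 = runif(10), X2 = runif(10))
d$Y1 <- c(rep(NA, 6), 1 + 2 * d$X1[7:10] - d$X2[7:10])
d$Y2 <- 0.5 + d$X1 + d$X2 + rnorm(10, sd = 0.1)

# Hypothetical helper: rolling intercepts for one response variable.
# Each window covers rows (i - width + 1) to i, i.e. it is right-aligned.
roll_intercept <- function(d, lhs, rhs = "X1 + X2", width = 6) {
  ends <- seq(width, nrow(d))
  sapply(ends, function(i) {
    w <- d[(i - width + 1):i, ]
    # lm() errors on an all-NA response, so return NA for such windows
    if (all(is.na(w[[lhs]]))) return(NA_real_)
    unname(coef(lm(as.formula(paste(lhs, "~", rhs)), data = w))[1])
  })
}

# One column of rolling intercepts per dependent variable
intercepts <- sapply(c("Y1", "Y2"), roll_intercept, d = d)
dim(intercepts)  # 5 windows by 2 responses
```

Windows in which the response has only a few non-NA values are still fitted on the rows that remain after lm drops the NAs, so those intercepts should be interpreted with care.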
Note: The input df in reproducible form is:
Lines <- "Date Y1 Y2 X1 X2
1/1/2009 NA 1.51 0.02 0.75
2/1/2009 NA -0.38 0.01 0.59
3/1/2009 NA 1.54 0.02 0.96
4/1/2009 NA 1.78 0.01 0.92
5/1/2009 NA 0.94 0.02 0.02
6/1/2009 NA 1.37 0.01 0.46
7/1/2009 NA 1.22 0.01 0.61
8/1/2009 NA 1.32 0.01 0.04
9/1/2009 NA 0.83 0.01 0.03
10/1/2009 NA 0.95 0.02 0.61
11/1/2009 NA 0.28 0.03 0.53
12/1/2009 NA 0.17 0.01 0.32
1/1/2010 1.71 NA 0.03 0.53
2/1/2010 0.39 NA 0.03 0.16
3/1/2010 0.11 NA 0.01 0.58
4/1/2010 1.25 NA 0.01 0.41
5/1/2010 0.57 NA 0.01 0.9
6/1/2010 0.48 NA 0.01 0.58
7/1/2010 0.16 NA 0.01 0.03
8/1/2010 0.37 NA 0.01 0.23
9/1/2010 0.31 NA 0.01 0.77
10/1/2010 0.63 NA 0.01 0.75
11/1/2010 0.61 NA 0.01 0.74
12/1/2010 0.91 NA 0.01 0.41"
df <- read.table(text = Lines, header = TRUE)