Say that I want to estimate the predictive model Return_t = x + Volume_t-1 + Volatility_t-1 + e. I have a 5-year weekly panel of 28 companies, already prepared in Excel, that looks like this:
ID Date Return Volume Volatility
1 2012-01-10 0.039441572 0.6979594 0.2606079
1 2012-01-17 -0.021107681 0.6447289 0.3741519
1 2012-01-24 0.004798082 1.0072677 0.3097104
1 2012-01-31 0.001559987 1.0066153 0.2761096
1 2012-02-07 -0.009058289 0.7218983 0.2592109
1 2012-02-14 0.046404936 1.2879986 0.4304542
2 2012-01-10 0.02073912 -0.141970906 0.2573633
2 2012-01-17 -0.00369127 0.007792180 0.3360240
2 2012-01-24 -0.05881038 0.001347634 0.2163933
2 2012-01-31 -0.05664598 0.640085029 0.3545598
2 2012-02-07 0.03654193 0.360513703 0.3594383
2 2012-02-14 0.03092432 0.105669775 0.3043643
I want to lag the independent variables by one period (t-1). Which package allows me to do that in R? I am going to run a panel data regression with fixed effects.
After grouping by 'ID', we can use lag from dplyr:
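library(dplyr)
# lag() is positional (it takes the previous row), so sort within firm first.
df1 <- df1 %>%
  arrange(ID, Date) %>%
  group_by(ID) %>%
  mutate(Volume_1 = lag(Volume), Volatility_1 = lag(Volatility)) %>%
  ungroup()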
Another option is shift from data.table:
library(data.table)
nm1 <- c("Volume", "Volatility")
setDT(df1)[, paste0(nm1, "_1") := lapply(.SD, shift), by = ID, .SDcols = nm1]
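If you would rather avoid extra packages, here is a minimal base R sketch of the same idea, followed by one way to fit the fixed-effects regression from the question. It assumes the data frame is called df1 as above; lag_one is a hypothetical helper name, not a built-in. The fixed effects are estimated with firm dummies in lm(), which gives the same slope estimates as the within estimator; a dedicated panel package such as plm is the more usual route for a full analysis.

```r
# Toy data: the twelve sample rows from the question (two firms, six weeks each).
df1 <- data.frame(
  ID = rep(1:2, each = 6),
  Date = as.Date(rep(c("2012-01-10", "2012-01-17", "2012-01-24",
                       "2012-01-31", "2012-02-07", "2012-02-14"), times = 2)),
  Return = c(0.039441572, -0.021107681, 0.004798082, 0.001559987,
             -0.009058289, 0.046404936, 0.02073912, -0.00369127,
             -0.05881038, -0.05664598, 0.03654193, 0.03092432),
  Volume = c(0.6979594, 0.6447289, 1.0072677, 1.0066153, 0.7218983,
             1.2879986, -0.141970906, 0.007792180, 0.001347634,
             0.640085029, 0.360513703, 0.105669775),
  Volatility = c(0.2606079, 0.3741519, 0.3097104, 0.2761096, 0.2592109,
                 0.4304542, 0.2573633, 0.3360240, 0.2163933, 0.3545598,
                 0.3594383, 0.3043643)
)

# Lags are positional, so sort by firm and date first.
df1 <- df1[order(df1$ID, df1$Date), ]

# Hypothetical helper: shift a vector down by one position, padding with NA.
lag_one <- function(x) c(NA, head(x, -1))

# ave() applies the helper within each ID, so a lag never crosses firms.
df1$Volume_1 <- ave(df1$Volume, df1$ID, FUN = lag_one)
df1$Volatility_1 <- ave(df1$Volatility, df1$ID, FUN = lag_one)

# Firm fixed effects via dummy variables; lm() drops the NA rows by default.
fit <- lm(Return ~ Volume_1 + Volatility_1 + factor(ID), data = df1)
coef(fit)
```

The first observation of each firm has no previous week, so its lagged values are NA and that row drops out of the regression. With the full panel you lose one week per company.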