Is there a way to specify a lagged independent variable in a statsmodels OLS regression? Below are a sample dataframe and the OLS model specification. I'd like to include a lagged variable in the model.
df = pd.DataFrame({
"y": [2,3,7,8,1],
"x": [8,6,2,1,9],
"v": [4,3,1,3,8]
})
Current model:
model = sm.ols(formula = 'y ~ x + v', data=df).fit()
Desired model:
model_lag = sm.ols(formula = 'y ~ (x-1) + v', data=df).fit()
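I don't think you can compute the lag on the fly in the formula. One option is to create the lagged column first with the pandas shift method and then refer to that column in the formula. Please clarify if this is not what you need.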
import statsmodels.api as sm
df['xlag'] = df['x'].shift()
df
   y  x  v  xlag
0  2  8  4   NaN
1  3  6  3   8.0
2  7  2  1   6.0
3  8  1  3   2.0
4  1  9  8   1.0
sm.formula.ols(formula = 'y ~ xlag + v', data=df).fit()
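Putting the pieces together, here is a minimal self-contained sketch of the shift approach. It assumes a one-period lag and imports the formula interface as `smf` (an alias chosen here for illustration, not taken from the question). The names `n_obs` and `param_names` are only there to inspect the result.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "y": [2, 3, 7, 8, 1],
    "x": [8, 6, 2, 1, 9],
    "v": [4, 3, 1, 3, 8],
})

# shift(1) moves each value down one row, so row t holds x from row t-1.
# Assumption: a one-period lag is what is wanted.
df["xlag"] = df["x"].shift(1)

# The first row has no previous value, so xlag is NaN there.
# The formula interface drops rows with missing values by default
# (missing="drop"), so the model is fit on the remaining rows.
model_lag = smf.ols(formula="y ~ xlag + v", data=df).fit()

n_obs = int(model_lag.nobs)                  # rows actually used in the fit
param_names = list(model_lag.params.index)   # estimated coefficients
```

Note that the lag costs one observation: the first row is excluded from the fit because its lagged value is missing. For a longer lag, pass a larger number to `shift`, and that many rows are lost at the start.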