I'm trying to make a time series forecast using XGBoost, but I didn't quite understand the meaning of lags:
What do they do, and what do they mean?
What is the best way to choose the most effective lag parameters?
Lags, in the context of forecasting, are values of the series that were observed a fixed amount of time before the point you want to predict.
For example:
You have a dataset covering one month of daily data, let's say from 01/09/2023 to 30/09/2023; lag = 7d would mean that the data points from 25/08/2023 to 23/09/2023 are used to predict the data in September, with each day paired with the value observed 7 days earlier.
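A minimal sketch of that 7-day lag using pandas. The series values and the column names `y` and `y_lag_7d` are made up for illustration, and `shift(7)` assumes exactly one row per day with no gaps:

```python
import pandas as pd

# Hypothetical daily series; it starts 7 days before September so that
# every September row has a lagged value available.
idx = pd.date_range("2023-08-25", "2023-09-30")  # daily by default
df = pd.DataFrame({"y": range(len(idx))}, index=idx)

# lag = 7d: each row gets the value observed 7 rows (7 days) earlier.
df["y_lag_7d"] = df["y"].shift(7)

# Rows for September: y is the target, y_lag_7d is the feature.
september = df.loc["2023-09-01":"2023-09-30"]
print(september.head())
```

The first 7 rows (25/08 to 31/08) have no lagged value and come out as NaN, which is why the raw data has to start before the period you want to predict.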
Answering your 2nd question, on the best way to choose: typical candidates are 15m, 30m, 1h, 6h, 12h, 1d, 7d, 30d, 60d, 90d, and even 180d lags, depending on the frequency of the target variable.
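One possible way to narrow down a list of candidate lags is to check how strongly the series correlates with itself at each lag. The sketch below is an assumption on my side, not the only method: the hourly series is synthetic, the candidate list is a subset picked for illustration, and autocorrelation only captures linear dependence, so the final check should still be the validation error of the model trained with those lags.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series with a daily (24h) cycle plus noise (illustration only).
rng = np.random.default_rng(0)
idx = pd.date_range("2023-01-01", periods=24 * 60, freq=pd.Timedelta(hours=1))
hours = np.arange(len(idx))
y = pd.Series(np.sin(2 * np.pi * hours / 24) + 0.1 * rng.normal(size=len(idx)),
              index=idx)

# Candidate lags expressed as a number of hourly steps.
candidates = {"1h": 1, "6h": 6, "12h": 12, "1d": 24, "7d": 168}

# Autocorrelation of the series at each candidate lag.
scores = {name: y.autocorr(lag=steps) for name, steps in candidates.items()}

# Rank by absolute correlation: a strong negative value is informative too.
ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
for name in ranked:
    print(f"{name:>3}: {scores[name]:+.3f}")
```

For this synthetic series the 6h lag ends up last, since a quarter of the daily cycle is nearly uncorrelated with the current value, while the 12h, 1d and 7d lags line up with the cycle.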