python machine-learning xgboost

What do lags mean exactly in XGBoost?


I'm trying to make a time series forecast using XGBoost, but I don't fully understand what lags mean:

  1. What does a lag do, and what does it mean?

  2. What is the best way to choose the most efficient lag parameters?


Solution

  • In the context of forecasting, lags are data points observed a fixed amount of time before the point being predicted.

    For example:

    You have a dataset covering one month, let's say from 01/09/2023 to 30/09/2023;
    lag = 7d means that the data points from 25/08/2023 to 23/09/2023 are used to predict the data in September.

    As for your 2nd question, the best lags to choose depend on the frequency of the target variable. Common candidates are 15m, 30m, 1h, 6h, 12h, 1d, 7d, 30d, 60d, 90d, and even 180d.
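XGBoost itself has no lag setting, so lags are usually built as extra feature columns before training. Below is a minimal sketch of the 7d example above, assuming a daily series and pandas. The data is synthetic, and the helper `add_lags` and the candidate list are hypothetical names chosen for illustration. The last step ranks the candidate lags by autocorrelation, which is only one rough screening heuristic and not a definitive selection method.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (made-up data): a weekly pattern plus noise.
rng = np.random.default_rng(0)
idx = pd.date_range("2023-06-01", "2023-09-30", freq="D")
weekly = 3 * np.sin(2 * np.pi * idx.dayofweek.to_numpy() / 7)
y = pd.Series(10 + weekly + rng.normal(0, 0.5, len(idx)), index=idx, name="y")


def add_lags(series, lags_in_days):
    """Return a frame with the target and one shifted copy per lag."""
    frame = series.to_frame()
    for lag in lags_in_days:
        # shift(lag) moves values forward in time, so row t holds y[t - lag].
        frame[f"lag_{lag}d"] = series.shift(lag)
    return frame


candidates = [1, 7, 30]  # hypothetical candidate lags, in days
features = add_lags(y, candidates)

# September rows: their lag_7d values come from 25/08/2023 to 23/09/2023.
september = features.loc["2023-09-01":"2023-09-30"]

# Rough screen: absolute autocorrelation of the target at each candidate lag.
scores = {lag: abs(y.autocorr(lag=lag)) for lag in candidates}
best_lag = max(scores, key=scores.get)
print(scores, best_lag)
```

The lag columns can then be passed to XGBoost like any other feature. The first rows have no history and hold NaN in the lag columns; they can be dropped, or left in place since XGBoost accepts missing values. Comparing validation error across different lag sets is another way to choose between them.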