Search code examples
timegoogle-bigquerytime-seriesregressionprojection

Time Series Projection in Google BigQuery


I was looking for a good way to do time series projection in BigQuery, and found this one, which nicely works to calculate correlations and slope: View Post. But it doesn't help to extend the timeline to your choice.

But can anyone please suggest a complete solution, where I can extend the timeline (x) according to my need and get projections of (Y) using a single query?

Any help will be highly appreciated.


Solution

  • The basic idea is to left join the model specs to a generated dates table and use it:

    WITH stats AS (
      SELECT * FROM UNNEST([
        STRUCT( 'a' AS model, 0.3 AS slope, 11 AS intercept ),
        STRUCT( 'b', 0.2, 7)
      ])
    )
    
    SELECT
      date,
      model,
      slope,
      intercept,
      UNIX_DATE(date) AS X,
      slope * UNIX_DATE(date) + intercept AS Y
    FROM
      UNNEST(GENERATE_DATE_ARRAY(DATE('2018-05-01'),DATE('2018-07-01'))) AS date
      LEFT JOIN stats ON TRUE
    ORDER BY date ASC
    

    I did not repeat the statistics part since it is already answered, but I created a dummy table with two models which replaces it, The model can also be a bucket of course, then you'd have to left join on that as a key.

    I'm also assuming you created the model with dates using unix date (days since 1970-01-01), if not you need to modify accordingly.