Search code examples
sqlgoogle-bigqueryaveragemoving-averagerolling-average

SQL/BIGQUERY Running Average with GAPs in Dates


I'm having trouble with a moving average in BigQuery/SQL, I have table 'SCORES' and I need to make a 30d moving average while grouping the data using users, the problem is my dates aren't sequential, e.g there are gaps in it.

Below is my current code:

SELECT user, date,
      AVG(score) OVER (PARTITION BY user ORDER BY date)
FROM SCORES;

I don't know how to add the date restrictions into that line or if this is even possible.

My current table looks like this, but of course with a lot more users:

user    date    score
AA  13/02/2018  2.00
AA  15/02/2018  3.00
AA  17/02/2018  4.00
AA  01/03/2018  5.00
AA  28/03/2018  6.00

Then I need it to become, this:

user    date    score   30D Avg
AA  13/02/2018  2.00    2.00
AA  15/02/2018  3.00    2.50
AA  17/02/2018  4.00    3.00
AA  01/03/2018  5.00    3.50
AA  28/03/2018  6.00    5.50

Where in the last row, it's only measuring backward one because of the date (up to 30D backwards) is there any way to implement this in SQL or am I asking for too much?


Solution

  • You want to use range between. For this, you need an integer, so:

    select s.*,
           avg(score) over (partition by user
                            order by days
                            range between 29 preceding and current row
                           ) as avg_30day
    from (select s.*, date_diff(s.date, date('2000-01-01'), day) as days
          from scores s
         ) s;
    

    An alternative to date_diff() is unix_date():

    select s.*,
           avg(score) over (partition by user
                            order by unix_days
                            range between 29 preceding and current row
                           ) as avg_30day
    from (select s.*, unix_date(s.date) as unix_days
          from scores s
         ) s;