Search code examples
sqlgoogle-bigquery

Google BigQuery SQL: How to fill in gaps in a table with dates?


Data Table:

Dates Customer Subscription
20/02/2020 A free
21/02/2020 A free
22/02/2020 A free
23/02/2020 B free
23/03/2020 A full
01/03/2020 B full
01/03/2020 A full
02/03/2020 A full

Need to fill gaps in dates by the value in the previous date

Output:

Dates Customer Last Subscription
20/02/2020 A free
21/02/2020 A free
22/02/2020 A free
23/03/2020 A full
23/03/2020 B free
24/02/2020 A full
24/02/2020 B free
25/02/2020 A full
25/02/2020 B free
26/02/2020 A full
26/02/2020 B free
27/02/2020 A full
27/02/2020 B free
28/02/2020 A full
28/02/2020 B free
01/03/2020 A full
01/03/2020 B full
02/03/2020 A full
02/03/2020 B full

I found a similar solution Duplicate groups of records to fill multiple date gaps in Google BigQuery, but it is not suitable because in my example each Customer has a different start date.


Solution

  • Consider below

    with temp as (
      select customer, dates from (    
        select customer, min(dates) min_date, max(dates) max_date
        from `project.dataset.table`
        group by customer
      ), unnest(generate_date_array(min_date, max_date)) dates
    )
    select customer, dates, 
      first_value(subscription ignore nulls) over win as subscription
    from temp a
    left join `project.dataset.table` b
    using(customer, dates)
    window win as (partition by customer order by dates desc rows between current row and unbounded following)
    # order by dates, customer          
    

    If to apply to sample data in y our question - output is

    enter image description here