Search code examples
sqlgoogle-bigquerylead

Splitting a value in to current date and next date with SQL in Big Query


I want to split the sales data of one day up into that specific day and the next day with SQL. In the first table underneath you can see the data I have and in the second table an example of the data I want to have.

Table 1: Current situation

|  Date     |   Country   | discount_code    | sales | 
|-----------|-------------|------------------|-------|
|2020-12-03 |   US        | Instagram        | 300   | 
|2020-12-03 |   UK        | Google           | 200   | 
|2020-12-02 |   US        | Instagram        | 500   | 
|2020-12-02 |   UK        | facebook         | 600   | 

Table 2: Required situation

|  Date     |   Country   | discount_code    | sales1 | sales2 | 
|-----------|-------------|------------------|--------|--------|
|2020-12-04 |   US        | Instagram        | 0      | 150    |
|2020-12-04 |   UK        | Google           | 0      | 100    |
|2020-12-03 |   US        | Instagram        | 150    | 250    |
|2020-12-03 |   UK        | Google           | 100    | 0      |
|2020-12-03 |   UK        | facebook         | 0      | 300    |
|2020-12-02 |   US        | Instagram        | 250    | 0      |
|2020-12-02 |   UK        | facebook         | 300    | 0      |

I tried to solve it myself in the following query:

SELECT 
date,
country,
discount_code,
sum(sales/2) AS sales1,
LEAD(sum(sales/2), 1) OVER (ORDER BY date, country, discount_code) AS sales2,
FROM `source`
GROUP BY 1,2,3
ORDER BY date DESC

This code only assigns the sales2 to be assigned to the row above. However, I want sales2 to be specific for date, country and discount_code. This would entail that when such a row does not already exist, it is created.

I hope anyone has a solution for this. It would help me enormously. Thanks in advance!


Solution

  • Below is for BigQuery Standard SQL

    #standardSQL
    select t.date, t.country, t.discount_code,
      sum(sales1) as sales1, sum(sales2) as sales2
    from `project.dataset.source`, 
    unnest([struct
      (date, country, discount_code, sales/2 as sales1, 0 as sales2),
      (date + 1, country, discount_code, 0, sales/2)
    ]) t
    group by date, country, discount_code
      
    

    If to apply to sample data from your question - output is

    enter image description here