Search code examples
sqljoindatabricks-sql

SQL query for particular output


I have an input table like below

Load_date projected_date total demand rolling consumed demand
2020-01-01 2020-01-01 100 60
2020-01-01 2020-01-02 150 75
2020-01-01 2020-01-03 200 100
2020-01-01 2020-01-04 300 120
2020-01-01 2020-01-05 400 160

My output will be like this:

Load_date projected_date total demand rolling consumed demand Finished projected date
2020-01-01 2020-01-01 100 60 2020-01-03
2020-01-01 2020-01-02 150 75 2020-01-05
2020-01-01 2020-01-03 200 100 ---
2020-01-01 2020-01-04 300 120 ---
2020-01-01 2020-01-05 400 160 ---

Can you please help me with the SQL query? I have tried to write query in Databricks SQL, but not getting this output. Actually it is a small part of my entire requirement.

The logic here is to calculate column Finished projected date, whenever the rolling consumed demand will exceed the total demand for a particular projected date, it will return that rolling consumed demand's corresponding projected date.

Like in the above table, the 2020-01-01 projected date has total demand of 100. But the rolling consumed demand reaches to 100 on projected date 2020-01-03. So, for 1st row it will return the finished projected date as 2020-01-03.

Similarly for projected date 2020-01-02, the rolling consumed demand exceeds the total demand on 2020-01-05. As 160 > 150. So the finished projected date would be 2020-01-05.


Solution

  • This problem can be solved with a self-join. Self-joins are inefficient and should be avoided if possible. You should seek a solution that uses a window function instead.

    SELECT a.*, MIN(b.projected_date) as  [Finished projected date]
    FROM InputTable a
    LEFT JOIN InputTable b ON a.[total demand] <= b.[rolling consumed demand]
    GROUP BY a.Load_date, a.Projected_date, a.[total demand], a.[rolling consumed demand] 
    

    fiddle

    Load_date projected_date total demand rolling consumed demand Finished projected date
    2020-01-01 2020-01-01 100 60 2020-01-03
    2020-01-01 2020-01-02 150 75 2020-01-05
    2020-01-01 2020-01-03 200 100 null
    2020-01-01 2020-01-04 300 120 null
    2020-01-01 2020-01-05 400 160 null