Search code examples
sqldatabricksstockdays

Check consecutive days in SQL


I have a dataset structured as follows which contains thousands of materials and millions of rows:

Material Date Stock Demand Supply Projected Stock Target Stock Overstock
123456 24/06/2024 60 0 0 60 24 Yes
123456 26/06/2024 0 4 0 56 24 Yes
123456 27/06/2024 0 4 0 52 24 Yes
123456 02/07/2024 0 0 10 62 24 Yes
123456 04/07/2024 0 0 10 72 24 Yes
123456 04/07/2024 0 0 1 73 24 Yes
123456 04/07/2024 0 1 0 72 24 Yes
123456 04/07/2024 0 1 0 71 24 Yes
123456 04/07/2024 0 4 0 67 24 Yes
123456 04/07/2024 0 4 0 63 24 Yes
123456 05/07/2024 0 4 0 59 24 Yes
123456 11/07/2024 0 0 11 70 24 Yes
123456 13/07/2024 0 1 0 69 24 Yes
123456 13/07/2024 0 1 0 68 24 Yes
123456 13/07/2024 0 1 0 67 24 Yes
123456 14/07/2024 0 4 0 63 24 Yes
123456 16/07/2024 0 4 0 59 24 Yes
123456 18/07/2024 0 0 11 70 24 Yes
123456 18/07/2024 0 4 0 66 24 Yes
123456 19/07/2024 0 1 0 65 24 Yes
123456 19/07/2024 0 1 0 64 24 Yes
123456 21/07/2024 0 4 0 60 24 Yes
123456 21/07/2024 0 1 0 59 24 Yes
123456 21/07/2024 0 1 0 58 24 Yes
123456 21/07/2024 0 1 0 57 24 Yes
123456 25/07/2024 0 0 9 66 24 Yes
123456 25/07/2024 0 4 0 62 24 Yes
123456 30/07/2024 0 0 9 71 24 Yes
123456 30/07/2024 0 0 9 80 24 Yes
123456 30/07/2024 0 0 9 89 24 Yes
123456 30/07/2024 0 0 9 98 24 Yes
123456 30/07/2024 0 0 9 107 24 Yes
…. …. …. …. …. …. …. ….
123457 24/06/2024 30 0 0 30 24 No
123457 26/06/2024 0 4 0 26 24 No
123457 27/06/2024 0 4 0 22 24 No
123457 02/07/2024 0 0 10 32 24 No
123457 04/07/2024 0 0 10 42 24 No
123457 04/07/2024 0 0 1 43 24 No
123457 04/07/2024 0 1 0 42 24 No
123457 04/07/2024 0 1 0 41 24 No
123457 04/07/2024 0 4 0 37 24 No
123457 04/07/2024 0 4 0 33 24 No
123457 05/07/2024 0 4 0 29 24 No
123457 11/07/2024 0 0 11 40 24 No
123457 13/07/2024 0 1 0 39 24 No
123457 13/07/2024 0 1 0 38 24 No
123457 13/07/2024 0 1 0 37 24 No
123457 14/07/2024 0 4 0 33 24 No
123457 16/07/2024 0 4 0 29 24 No
123457 18/07/2024 0 0 11 40 24 No
123457 18/07/2024 0 4 0 36 24 No
123457 19/07/2024 0 1 0 35 24 No
123457 19/07/2024 0 1 0 34 24 No
123457 21/07/2024 0 4 0 30 24 No
123457 21/07/2024 0 1 0 29 24 No
123457 21/07/2024 0 1 0 28 24 No
123457 21/07/2024 0 1 0 27 24 No
123457 25/07/2024 0 0 9 36 24 No
123457 25/07/2024 0 4 0 32 24 No
123457 30/07/2024 0 0 9 41 24 No
123457 30/07/2024 0 0 9 50 24 ?
123457 30/07/2024 0 0 9 59 24 ?
123457 30/07/2024 0 0 9 68 24 ?
123457 30/07/2024 0 0 9 77 24 ?
…. …. …. …. …. …. …. ….

Because the date column skips and repeats some dates, it is difficult for me to do what I want.

I need to calculate whether the projected stock is 200% or more for every 28 consecutive days. So in the example above, the projected stock is 200% above the target stock (24*200%=48), so there needs to be a column which checks the current row, and looks in the next 28 days (so not 28 rows) and if every day of those 28 days the projected stock is above 48, then the relevant material is flagged. For reference, my query currently uses CTE to convert the raw data into below dataset.

 Projected_stock as (
    select 
      Material,
      Date,
      Stock,
      Target_Stock,
      Demand,
      Supply,
      sum(Stock - Demand + Supply) over (partition by material order by date,  rows between unbounded preceding and current row) as Projected_Stock
    from datasource
    group by all
    order by material,date)

Thank you.

Edit: I am using databricks and what I expect to see is that for this material 123456, next to the column Target_Stock, there is a another column Overstock, which has a flag on all rows, because therea are more than 28 consecutive days where the projected stock is above 200% of the target stock (48)


Solution

  • Postgres and MySQL versions of the solution. DateAdd function may vary for you.

    PostGres fiddle

        WITH  Projected_stock as 
        (
            select 
              Material,
              Date,
              Stock,
              Target_Stock,
              Demand,
              Supply,
              sum(Stock - Demand + Supply) over (partition by material order by date,  rows between unbounded preceding and current row) as Projected_Stock
            from datasource
            group by all
            order by material,date),
        CTE  as
            (
                SELECT material, c.date, 
                CASE 
                   WHEN MIN(CASE WHEN projected_stock>48 THEN '1' ELSE '0' END)!=MAX(CASE WHEN projected_stock>48 THEN '1' ELSE '0' END) THEN '?' 
                   ELSE MIN(CASE WHEN projected_stock>48 THEN '1' ELSE '0' END) 
                END as Overstock
                FROM Projected_stock c
                WHERE c.date between  c.date AND  c.date + INTERVAL '28 day' --dateadd(DAY, 28, c.date)
                GROUP BY material, date
            ),
            SELECT p.*, c.overstock
            FROM CTE c
            LEFT JOIN Projected_stock p ON p.date=c.date and c.material=p.material
    

    MySQL fiddle

        WITH  Projected_stock as 
        (
            select 
              Material,
              Date,
              Stock,
              Target_Stock,
              Demand,
              Supply,
              sum(Stock - Demand + Supply) over (partition by material order by date,  rows between unbounded preceding and current row) as Projected_Stock
            from datasource
            group by all
            order by material,date), CTE  as
            (
                SELECT material, c.Date, 
                CASE 
                   WHEN MIN(CASE WHEN Projected>48 THEN '1' ELSE '0' END)!=MAX(CASE WHEN Projected>48 THEN '1' ELSE '0' END) THEN '?' 
                   ELSE MIN(CASE WHEN Projected>48 THEN '1' ELSE '0' END) 
                END as Overstock
                FROM Projected_stock c
                WHERE c.Date between  c.Date AND  DATE_ADD(c.Date, INTERVAL 28 DAY) 
                GROUP BY c.material, c.Date
            ),
            SELECT p.*, c.overstock
            FROM CTE c
            LEFT JOIN Projected_stock p ON p.Date=c.Date and c.material=p.material