Search code examples
oracleplsqldata-analysis

Finding first of consecutive dates using analytical functions


I need to analyze a piece of data stored in one Oracle database table. The example rows in this table is like this:

Department  Date    Status
------------------------------------
D1          2016/6/1    1
D1          2016/5/31   0
D1          2016/5/30   0
D1          2016/5/29   1
D1          2016/5/28   0
D1          2016/5/27   1
D2          2016/6/1    0
D2          2016/5/31   1
D2          2016/5/30   1
D2          2016/5/29   0
D2          2016/5/28   0
D2          2016/5/27   1

There is one row for each department and each date. What I want is, for each department for ANY GIVEN DATE, find the first date when the Status is 0 prior to this date. For example, for department D1, for 2016/6/1, the result is 2015/5/30, and for D1, 2016/5/31 and 5/30 it is also 2015/5/30. For D1, 2016/5/29 and 5/28, it is 2016/5/28.

I need to output the result as a SELECT result and store it in another table. So the result table has exactly the same number of rows as the original table. There is probably three columns in the result table: Department, Date, TargetDate.

I have little experience in SQL / PL SQL. I guess I should use analytical function and windowing, but I really can't come up with the query. Please give me some advice. Thank you!


Solution

  • with test_data (department, date_, status) as (
           select 'D1', date '2016-06-01', 1 from dual union all
           select 'D1', date '2016-05-31', 0 from dual union all
           select 'D1', date '2016-05-30', 0 from dual union all
           select 'D1', date '2016-05-29', 1 from dual union all
           select 'D1', date '2016-05-28', 0 from dual union all
           select 'D1', date '2016-05-27', 1 from dual union all
           select 'D2', date '2016-06-01', 0 from dual union all
           select 'D2', date '2016-05-31', 1 from dual union all
           select 'D2', date '2016-05-30', 1 from dual union all
           select 'D2', date '2016-05-29', 0 from dual union all
           select 'D2', date '2016-05-28', 0 from dual union all
           select 'D2', date '2016-05-27', 1 from dual
         ),
         t (department, date_, status, lagged_status) as (
           select department, date_, status,
                  lag(status) over (partition by department order by date_)
           from   test_data
         ) 
    select   department, date_, 
             max(case when status = 0 and (lagged_status = 1 or lagged_status is null) 
                      then date_ end)
                 over (partition by department order by date_
                       rows between unbounded preceding and current row) as target_date
    from     t
    order by department, date_ desc
    ;
    

    Result:

    DEPARTMENT  DATE_      TARGET_DATE
    ----------- ---------- -----------
    D1          2016-06-01 2016-05-30
    D1          2016-05-31 2016-05-30
    D1          2016-05-30 2016-05-30
    D1          2016-05-29 2016-05-28
    D1          2016-05-28 2016-05-28
    D1          2016-05-27 (null)
    D2          2016-06-01 2016-06-01
    D2          2016-05-31 2016-05-28
    D2          2016-05-30 2016-05-28
    D2          2016-05-29 2016-05-28
    D2          2016-05-28 2016-05-28
    D2          2016-05-27 (null)
    
    12 rows selected.
    

    Please note, "date" is an Oracle keyword and shouldn't be used as a column name; I used date_ instead (I added an underscore).