Search code examples
sqloracle-databaselaganalytic-functionslead

How to add a flag to the rows after a specific value within a key in Oracle SQL?


I have the following data:

Key Stage CreateDate
AAF 0     01-Jan-2018
AAF 0     02-Jan-2018
AAF 0     10-Jan-2018
AAF 20    20-Jan-2018
AAF 40    20-Mar-2018
AAF 0     01-May-2018
AAF 0     10-May-2018
AAF 0     20-May-2018
AAF 30    20-Jun-2018
AAF 0     20-Jul-2018   
AAF 100   20-Jul-2018       

I am basically trying to calculate the days spent at each stage.. I am currently taking the minimum date within each stage, and find the difference between the minimum date of the next stage:

select 
key,
stage,
cast(extract (day from max(next_dt) - min(createddate)) as number) as interval_days
from
(
select 
key,
stage,
createddate
lead(createddate,1) over (partition by  key order by createddate) next_dt
from  oppstages
)
group by key,stage 

As it can be seen, sometimes, the stage progresses from 0-40, but again goes back to 0. So the above logic doesn't work correctly and I am seeing a necessity to group the 0-40 as one category, and anything after 40 as next category and so on (if the stage decreases and restarts with new lesser stage number). The below query gives me the point where probability goes down, but I am not able flag to group the rows further.

select key,
stage,
createddate, 
next_dt,
next_prob,
case when   next_prob < stage  then 1   else 0 end as valid_flag,
from 
(
select 
key,
stage,
createddate,
lead(createddate,1) over (partition by  key order by createddate) next_dt, 
coalesce(lead(stage,1) over (partition by  key order by createddate),101) next_prob, 
from oppstages
) a

I expect this output so that I could group using flag to calculate the days spent at each instance:

Key Stage CreateDate    Flag
AAF 0     01-Jan-2018   1
AAF 0     02-Jan-2018   1
AAF 0     10-Jan-2018   1
AAF 20    20-Jan-2018   1
AAF 40    20-Mar-2018   1
AAF 0     01-May-2018   2
AAF 0     10-May-2018   2
AAF 0     20-May-2018   2
AAF 30    20-Jun-2018   2
AAF 10     20-Jul-2018   3
AAF 100   20-Jul-2018   3

thanks.


Solution

  • You can try to use lag window function get the Stage previous value.

    Then use CASE WHEN check PREVAL > STAGE do increase 1.

    CREATE TABLE T(
      Key varchar(50),
      Stage int,
      CreateDate date
    );
    
    
    
    INSERT INTO T VALUES ('AAF',0,TO_DATE('01-01-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',0,TO_DATE('02-01-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',0,TO_DATE('10-01-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',20,TO_DATE('20-01-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',40,TO_DATE('20-03-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',0,TO_DATE('01-05-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',0,TO_DATE('10-05-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',0,TO_DATE('20-05-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',30,TO_DATE('20-06-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',10,TO_DATE('20-07-2018','dd-mm-yyyy'));  
    INSERT INTO T VALUES ('AAF',100,TO_DATE('20-07-2018','dd-mm-yyyy'));  
    

    Query 1:

    SELECT t1.KEY,
           t1.STAGE,
          (SUM(CASE WHEN PREVAL > STAGE THEN 1 ELSE 0 END) over (partition by Key order by CreateDate) + 1)  Flag
    FROM (
      SELECT T.*,lag(Stage) over (partition by Key  order by CreateDate) preVAL
      FROM T 
    )t1
    

    Results:

    | KEY | STAGE | FLAG |
    |-----|-------|------|
    | AAF |     0 |    1 |
    | AAF |     0 |    1 |
    | AAF |     0 |    1 |
    | AAF |    20 |    1 |
    | AAF |    40 |    1 |
    | AAF |     0 |    2 |
    | AAF |     0 |    2 |
    | AAF |     0 |    2 |
    | AAF |    30 |    2 |
    | AAF |    10 |    3 |
    | AAF |   100 |    3 |