Search code examples
sqloracle-databaseanalytic-functions

Oracle Analytic functions - resetting a windowing clause


I have the following data set.

create table t1 (
  dept number,
  date1 date
);

Table created.

insert into t1 values (100, '01-jan-2013');
insert into t1 values (100, '02-jan-2013');
insert into t1 values (200, '03-jan-2013');
insert into t1 values (100, '04-jan-2013');
commit;

MY goal is to create a rank column that resets each time the department changes. The closest column that I can use for "partition by" clause is dept, but that won't give me the desired result.

SQL> select * from t1;

      DEPT DATE1
---------- ---------
       100 01-JAN-13
       100 02-JAN-13
       200 03-JAN-13
       100 04-JAN-13

select dept,  
       date1,
       rank () Over (partition by dept order by date1) rnk
from t1
order by date1;

      DEPT DATE1            RNK
---------- --------- ----------
       100 01-JAN-13          1
       100 02-JAN-13          2
       200 03-JAN-13          1
       100 04-JAN-13          3

The desired output is as follows. The last rnk=1 is becuase the Jan-04 record is the first record after the change.

      DEPT DATE1            RNK
---------- --------- ----------
       100 01-JAN-13          1
       100 02-JAN-13          2
       200 03-JAN-13          1
       100 04-JAN-13          1  <<<----------

Any pointers?


Solution

  • This is a little complicated. Instead of using rank() or the like, use lag() to see when something changes. Then do a cumulative sum of the flag.

    select dept, date1,
           CASE WHEN StartFlag = 0 THEN 1
                ELSE 1+StartFlag+NVL(lag(StartFlag) over (order by date1),0)
           END as rnk
    from (select t1.*,
                 (case when dept = lag(dept) over (order by date1)
                       then 1
                       else 0
                  end) as StartFlag
          from t1
         ) t1
    order by date1;
    

    Here is the SQLFiddle.

    EDIT:

    This is Gordon editing my own answer. Oops. The original query was 90% of the way there. It identified the groups where the numbers should increase, but did not assign the numbers within the groups. I would do this with another level of row_number() as in:

    select dept, date1,
           row_number() over (partition by dept, grp order by date1) as rnk
    from (select dept, date1, startflag,
                 sum(StartFlag) over (partition by dept order by date1) as grp
          from (select t1.*,
                       (case when dept = lag(dept) over (order by date1)
                             then 0
                             else 1
                        end) as StartFlag
                from t1
               ) t1
         ) t1
    order by date1;
    

    So, the overall idea is the following. First use lag() to determine where a group begins (that is, where there is a department change from one date to the next). Then, assign a "group id" to these, by doing a cumulative sum. These are the records that are to be enumerated. The final step is to enumerate them using row_number().