Search code examples
sqloracle-databaseanalytic-functions

Ranking base on two dates in Oracle


I have a query with the following result:

I thought I could use dense_rank to group user segment in a time interval. But it doesn't work.

CUST_ID EVENT_ID SEGMENT_ID SEGMENT_CODE       DATE_FROM              DATE_TO
100      1424      21            A         2011.01.05. 13:03:12   2011.01.06. 23:40:13
100      1566      21            A         2011.01.06. 23:40:13   2011.02.28. 11:48:52
100      1580      21            A         2011.02.28. 11:48:52   2012.04.30. 2:49:53
100      1601      45            Y         2012.04.30. 2:49:53    2012.05.29. 21:12:27
100      1663      45            Y         2012.05.29. 21:12:27   2012.05.30. 11:11:23
100      1710      45            Y         2012.05.30. 11:11:23   2012.08.01. 5:14:36
100      1875      114           H         2012.08.01. 5:14:36    2012.09.01. 20:26:42
100      1880      33            F         2012.09.01. 20:26:42   2012.09.03. 9:12:23
100      1901      21            A         2012.09.03. 9:12:23    2012.09.03. 9:12:23
100      1903      21            A         2012.09.03. 9:12:23    2012.10.25. 17:25:14
100      1966      223           R         2012.10.25. 17:25:14   2013.01.01. 1:12:55
100      2011      223           R         2013.01.01. 1:12:55    3500:12:31. 23:59:59

I'd like something like this:

CUST_ID EVENT_ID SEGMENT_ID SEGMENT_CODE       DATE_FROM              DATE_TO           DENSE_RANK
100      1424      21            A         2011.01.05. 13:03:12   2011.01.06. 23:40:13       1
100      1566      21            A         2011.01.06. 23:40:13   2011.02.28. 11:48:52       1
100      1580      21            A         2011.02.28. 11:48:52   2012.04.30. 2:49:53        1
100      1601      45            Y         2012.04.30. 2:49:53    2012.05.29. 21:12:27       2
100      1663      45            Y         2012.05.29. 21:12:27   2012.05.30. 11:11:23       2
100      1710      45            Y         2012.05.30. 11:11:23   2012.08.01. 5:14:36        2
100      1875      114           H         2012.08.01. 5:14:36    2012.09.01. 20:26:42       3
100      1880      33            F         2012.09.01. 20:26:42   2012.09.03. 9:12:23        4
100      1901      21            A         2012.09.03. 9:12:23    2012.09.03. 9:12:23        5
100      1903      21            A         2012.09.03. 9:12:23    2012.10.25. 17:25:14       5
100      1966      223           R         2012.10.25. 17:25:14   2013.01.01. 1:12:55        6
100      2011      223           R         2013.01.01. 1:12:55    3500:12:31. 23:59:59       6

Do you have any idea how can I solve this problem?


Solution

  • Based on your logic, dense_rank won't do. However, you can get a similar effect with lag() and a cumulative sum. The lag looks at the previous date value for the segment id/segment code, if they overlap, then the record is not the start of a grouping. Otherwise it is.

    The cumulative sum of the "isstarts" is what you are looking for:

    select t.*,
           sum(case when date_from - prev_date_to < 0.00001
                    then 0
                    else 1
               end) over (partition by segment_id order by SEGMENT_ID, SEGMENT_CODE
                         ) as YourRanking
    from (select t.*,
                 lag(date_to) over (partition by SEGMENT_ID, SEGMENT_CODE
                                    order by date_from
                                   ) as prev_date_to
          from t
         ) t
    

    Because this is date/time, the query doesn't use exact equality. Instead it looks for a small difference.