I have a query with the following result:
I thought I could use dense_rank
to group user segment in a time interval. But it doesn't work.
CUST_ID EVENT_ID SEGMENT_ID SEGMENT_CODE DATE_FROM DATE_TO
100 1424 21 A 2011.01.05. 13:03:12 2011.01.06. 23:40:13
100 1566 21 A 2011.01.06. 23:40:13 2011.02.28. 11:48:52
100 1580 21 A 2011.02.28. 11:48:52 2012.04.30. 2:49:53
100 1601 45 Y 2012.04.30. 2:49:53 2012.05.29. 21:12:27
100 1663 45 Y 2012.05.29. 21:12:27 2012.05.30. 11:11:23
100 1710 45 Y 2012.05.30. 11:11:23 2012.08.01. 5:14:36
100 1875 114 H 2012.08.01. 5:14:36 2012.09.01. 20:26:42
100 1880 33 F 2012.09.01. 20:26:42 2012.09.03. 9:12:23
100 1901 21 A 2012.09.03. 9:12:23 2012.09.03. 9:12:23
100 1903 21 A 2012.09.03. 9:12:23 2012.10.25. 17:25:14
100 1966 223 R 2012.10.25. 17:25:14 2013.01.01. 1:12:55
100 2011 223 R 2013.01.01. 1:12:55 3500:12:31. 23:59:59
I'd like something like this:
CUST_ID EVENT_ID SEGMENT_ID SEGMENT_CODE DATE_FROM DATE_TO DENSE_RANK
100 1424 21 A 2011.01.05. 13:03:12 2011.01.06. 23:40:13 1
100 1566 21 A 2011.01.06. 23:40:13 2011.02.28. 11:48:52 1
100 1580 21 A 2011.02.28. 11:48:52 2012.04.30. 2:49:53 1
100 1601 45 Y 2012.04.30. 2:49:53 2012.05.29. 21:12:27 2
100 1663 45 Y 2012.05.29. 21:12:27 2012.05.30. 11:11:23 2
100 1710 45 Y 2012.05.30. 11:11:23 2012.08.01. 5:14:36 2
100 1875 114 H 2012.08.01. 5:14:36 2012.09.01. 20:26:42 3
100 1880 33 F 2012.09.01. 20:26:42 2012.09.03. 9:12:23 4
100 1901 21 A 2012.09.03. 9:12:23 2012.09.03. 9:12:23 5
100 1903 21 A 2012.09.03. 9:12:23 2012.10.25. 17:25:14 5
100 1966 223 R 2012.10.25. 17:25:14 2013.01.01. 1:12:55 6
100 2011 223 R 2013.01.01. 1:12:55 3500:12:31. 23:59:59 6
Do you have any idea how can I solve this problem?
Based on your logic, dense_rank
won't do. However, you can get a similar effect with lag()
and a cumulative sum. The lag looks at the previous date value for the segment id/segment code, if they overlap, then the record is not the start of a grouping. Otherwise it is.
The cumulative sum of the "isstarts" is what you are looking for:
select t.*,
sum(case when date_from - prev_date_to < 0.00001
then 0
else 1
end) over (partition by segment_id order by SEGMENT_ID, SEGMENT_CODE
) as YourRanking
from (select t.*,
lag(date_to) over (partition by SEGMENT_ID, SEGMENT_CODE
order by date_from
) as prev_date_to
from t
) t
Because this is date/time, the query doesn't use exact equality. Instead it looks for a small difference.