I have the following dataset:
ukey id code create_date
1 1082 9053 2018-03-01 23:18:51.0000000
2 1082 9035 2018-03-01 23:19:21.0000000
3 1082 9053 2018-03-01 23:22:55.0000000
4 1082 9535 2018-03-01 23:23:30.0000000
5 1196 3145 2018-03-05 07:27:15.0000000
6 1196 3162 2018-03-05 07:27:50.0000000
7 1196 3175 2018-03-05 07:28:24.0000000
8 1196 3235 2018-03-05 07:28:57.0000000
9 1196 3295 2018-03-05 07:29:31.0000000
10 1196 3448 2018-03-05 07:30:04.0000000
11 1196 3465 2018-03-05 07:30:37.0000000
12 1196 4265 2018-03-05 07:31:09.0000000
13 1196 3495 2018-03-05 17:13:13.0000000
14 473 551 2018-03-02 16:43:52.0000000
15 473 590 2018-03-02 16:44:30.0000000
16 473 835 2018-03-02 16:45:02.0000000
I want another column in this dataset, say RN
. RN
column should have the same value until these two conditions are met:
id
is different or,datediff
between current and previous row is more than 60 secondsSo the final dataset should look like this:
ukey id code create_date RN
1 1082 9053 2018-03-01 23:18:51.0000000 1
2 1082 9035 2018-03-01 23:19:21.0000000 1
3 1082 9053 2018-03-01 23:22:55.0000000 1
4 1082 9535 2018-03-01 23:23:30.0000000 1
5 1196 3145 2018-03-05 07:27:15.0000000 2
6 1196 3162 2018-03-05 07:27:50.0000000 2
7 1196 3175 2018-03-05 07:28:24.0000000 2
8 1196 3235 2018-03-05 07:28:57.0000000 2
9 1196 3295 2018-03-05 07:29:31.0000000 2
10 1196 3448 2018-03-05 07:30:04.0000000 2
11 1196 3465 2018-03-05 07:30:37.0000000 2
12 1196 4265 2018-03-05 07:31:09.0000000 2
13 1196 3495 2018-03-05 17:13:13.0000000 3
14 473 551 2018-03-02 16:43:52.0000000 4
15 473 590 2018-03-02 16:44:30.0000000 4
16 473 835 2018-03-02 16:45:02.0000000 4
Notice the change in RN
from 2 to 3. The id
is same but the datediff
is more than 60 secs.
So far I have tried this query. My original dataset is in dbo.myTable
:
select
a.id as A_ID,
a.code as A_Code,
a.create_date as A_CreateDate,
a.RN as A_RN,
_.id as _ID,
_.code as _Code,
_.create_date as _CreateDate,
_.RN as _RN
from myTable a
cross apply
(
select * from myTable b
where a.id=b.id and (a.ukey=b.ukey-1 or a.ukey=1) and datediff(ss,a.create_date,b.create_date)<60
)_
order by a.id, a.create_date
This query is NOT giving me the expected result. My thought was to compare each row with the previous row and check for the two conditions.
This sounds like you want lag()
and a cumulative sum:
select t.*,
sum(case when prev_create_date > dateadd(second, -60, create_date)
then 0 else 1
end) over
(order by ukey) as rn
from (select t.*,
lag(create_date) over (partition by id order by create_date) as prev_create_date
from t
) t;