I have a telemetry table that has list of request id's, time at which certain activity happened for that request. Here is a simplified sequence of activities done for a request:
authUser -> processInput -> calculateRate -> returnResponse
and the E2E on average takes ~3 seconds.
At times, certain requests are stuck longer in an activity which I need to figure out (output would be list of req_id, activity that took longer than a threshold). The table contains millions of records so if I sort by Req_id and do a prev()
to compare delta by Req_id fails due to sorting memory limit.
Any better ways to write this query?
Sample input:
Req_Id | Req_time | Activity |
---|---|---|
Guid1 | 2022-12-01T01:01:01 | authUser |
Guid1 | 2022-12-01T01:01:01 | processInput |
Guid2 | 2022-12-01T01:01:01 | authUser |
Guid1 | 2022-12-01T01:01:02 | calculateRate |
Guid2 | 2022-12-01T01:01:03 | processInput |
Guid3 | 2022-12-01T01:01:03 | authUser |
Guid2 | 2022-12-01T01:01:04 | calculateRate |
Guid3 | 2022-12-01T01:01:04 | processInput |
Guid2 | 2022-12-01T01:01:05 | returnResponse |
.. | .. | .. |
.. | .. | .. |
Guid3 | 2022-12-01T01:01:20 | calculateRate |
Guid3 | 2022-12-01T01:01:21 | returnResponse |
Expected output:
input
| where delta_of_activity_duration > 5 second
Req_Id | Activity | Duration(sec) |
---|---|---|
Guid3 | calculateRate | 16 |
Use the partition operator to do the calculation per Req_Id.
datatable(Req_Id:string, Req_time:datetime, Activity:string)
[
"Guid1" ,datetime(2022-12-01T01:01:01) ,"authUser"
,"Guid1" ,datetime(2022-12-01T01:01:01) ,"processInput"
,"Guid2" ,datetime(2022-12-01T01:01:01) ,"authUser"
,"Guid1" ,datetime(2022-12-01T01:01:02) ,"calculateRate"
,"Guid2" ,datetime(2022-12-01T01:01:03) ,"processInput"
,"Guid3" ,datetime(2022-12-01T01:01:03) ,"authUser"
,"Guid2" ,datetime(2022-12-01T01:01:04) ,"calculateRate"
,"Guid3" ,datetime(2022-12-01T01:01:04) ,"processInput"
,"Guid2" ,datetime(2022-12-01T01:01:05) ,"returnResponse"
,"Guid3" ,datetime(2022-12-01T01:01:20) ,"calculateRate"
,"Guid3" ,datetime(2022-12-01T01:01:21) ,"returnResponse"
]
| partition hint.strategy=shuffle by Req_Id
(
order by Req_time asc
| extend duration = (Req_time - prev(Req_time))/1s
| where duration >= 5
)
Req_Id | Req_time | Activity | duration |
---|---|---|---|
Guid3 | 2022-12-01T01:01:20Z | calculateRate | 16 |