Search code examples
azure-data-explorerkqlmaterialized-views

Finding next record by username and timestamp in a materialized view without serializing data in Kusto


I have a table in Azure Data Explorer with a username, timestamp and status. I with to calculate the duration of each status of the same username and store the results in a materialized view.

Below example query shows the expected result, but materialized views does not support serialization of data (e.g. the sort statement).

Is there a smart way to work around this limitation in materialized views?

Example query:

let data=datatable (Username:string, Timestamp:datetime, Status:int)
[
  "Jimmy", datetime(2021-10-01 12:30:00), 1,
  "Joan", datetime(2021-10-01 12:34:56), 1,
  "Joan", datetime(2021-10-01 13:34:56), 2,
  "Jimmy", datetime(2021-10-01 14:34:56), 2,
  "Jimmy", datetime(2021-10-01 16:34:56), 1,
  "Joan", datetime(2021-10-01 18:34:34), 1,
  "Joan", datetime(2021-10-01 18:38:20), 2,
  "Joan", datetime(2021-10-01 18:45:16), 1,
];
data
| sort by Username, Timestamp asc
| extend NextUsername = next(Username)
| extend NextTimestamp = iif(Username == NextUsername, next(Timestamp), datetime(null))
| extend Duration = (NextTimestamp - Timestamp)
| project Username, Start = Timestamp, End = NextTimestamp, Status, Duration

Example result:

Username    Start                       End                         Status  Duration
Joan        2021-10-01 12:34:56.0000000 2021-10-01 13:34:56.0000000 1       01:00:00
Joan        2021-10-01 13:34:56.0000000 2021-10-01 18:34:34.0000000 2       04:59:38
Joan        2021-10-01 18:34:34.0000000 2021-10-01 18:38:20.0000000 1       00:03:46
Joan        2021-10-01 18:38:20.0000000 2021-10-01 18:45:16.0000000 2       00:06:56
Joan        2021-10-01 18:45:16.0000000                             1   
Jimmy       2021-10-01 12:30:00.0000000 2021-10-01 14:34:56.0000000 1       02:04:56
Jimmy       2021-10-01 14:34:56.0000000 2021-10-01 16:34:56.0000000 2       02:00:00
Jimmy       2021-10-01 16:34:56.0000000                             1   

Solution

  • No, materialized views support only summarization that either provide you with dedup, latest record, or timed binned aggregations. For this calculation, if you want to improve perf you should create your own scheduled process that performs this logic and stores the results in a table.