Imagine a data set with some time gaps between the records:
datatable(t:datetime , v: int)
[
datetime(2022-01-01 07:00), 3,
datetime(2022-01-01 07:15), 2,
datetime(2022-01-01 07:30), 4,
datetime(2022-01-01 07:45), 1,
datetime(2022-01-01 08:00), 5,
// GAP!
datetime(2022-01-01 10:15), 8,
datetime(2022-01-01 10:30), 3,
datetime(2022-01-01 10:45), 2,
// ALSO GAP!
datetime(2022-01-01 11:30), 1,
]
I'm trying to find a max value for each record within previous hour, excluding the current iteration hour. To visualize it, I want to achieve something like that:
t | v | prev_hr | max_v |
---|---|---|---|
2022-01-01 07:00 | 3 | 2022-01-01 06:00 | null |
2022-01-01 07:15 | 2 | 2022-01-01 06:15 | 3 |
2022-01-01 07:30 | 4 | 2022-01-01 06:30 | 3 |
2022-01-01 07:45 | 1 | 2022-01-01 06:45 | 4 |
2022-01-01 08:00 | 5 | 2022-01-01 07:00 | 4 |
2022-01-01 10:15 | 8 | 2022-01-01 09:15 | null |
2022-01-01 10:30 | 3 | 2022-01-01 09:30 | 8 |
2022-01-01 10:45 | 2 | 2022-01-01 09:45 | 8 |
2022-01-01 11:30 | 1 | 2022-01-01 10:30 | 3 |
I've tried modifying the approach suggested in How to create a window of arbitrary size in Kusto? (so using scan() operator) but had problems applying it to the above. Also, I feel like something like time_window_rolling_avg_fl() might be useful here, but it seems complex for a simple use-case like above.
I feel like what I want to achieve is relatively simple and obvious, but I am just missing it.
Check out the suggestion below. Inspired by aggregations over a sliding window.
datatable(t:datetime , v: int)
[
datetime(2022-01-01 07:00), 3,
datetime(2022-01-01 07:15), 2,
datetime(2022-01-01 07:30), 4,
datetime(2022-01-01 07:45), 1,
datetime(2022-01-01 08:00), 5,
// GAP!
datetime(2022-01-01 10:15), 8,
datetime(2022-01-01 10:30), 3,
datetime(2022-01-01 10:45), 2,
// ALSO GAP!
datetime(2022-01-01 11:30), 1,
]
| extend bin_t = bin(t, 1m)
| extend _range = range(bin_t, bin_t+1h, 1m)
| mv-expand _range to typeof(datetime)
| as T | join kind=inner T on $left.bin_t == $right._range
| project t, t1, v, v1
| summarize max_v = maxif(v1, t1 < t) by t, v
| project t, v, prev_h = t-1h, max_v
t | v | prev_h | max_v |
---|---|---|---|
2022-01-01 07:00:00.0000000 | 3 | 2022-01-01 06:00:00.0000000 | |
2022-01-01 07:15:00.0000000 | 2 | 2022-01-01 06:15:00.0000000 | 3 |
2022-01-01 07:30:00.0000000 | 4 | 2022-01-01 06:30:00.0000000 | 3 |
2022-01-01 07:45:00.0000000 | 1 | 2022-01-01 06:45:00.0000000 | 4 |
2022-01-01 08:00:00.0000000 | 5 | 2022-01-01 07:00:00.0000000 | 4 |
2022-01-01 10:15:00.0000000 | 8 | 2022-01-01 09:15:00.0000000 | |
2022-01-01 10:30:00.0000000 | 3 | 2022-01-01 09:30:00.0000000 | 8 |
2022-01-01 10:45:00.0000000 | 2 | 2022-01-01 09:45:00.0000000 | 8 |
2022-01-01 11:30:00.0000000 | 1 | 2022-01-01 10:30:00.0000000 | 3 |