I've got a bit of a potentially unusual requirement for a KQL query. It boils down to finding the difference of the first instance to the last instance of a value within a specified timespan.
Say we got the following schema with value
being incremented at random times:
"TableName": OdometerData,
"Schema": timestamp:datetime, value:real
Querying this is simple enough with for example:
OdometerData
| where timestamp between (ago(7d) .. now())
| summarize diff=(max(value) - min(value))
The issue is that this value
field will rollover and reset to 0 when it reaches a predetermined value. This means that the query above would not return the proper value if a rollover happened during the span of the query. Correct result would be to get a value that "continues" the value after the rollover instead of resetting to 0 and summarizing the difference.
My question is if there is some built-in functionality or perhaps a way to use a materialized view to keep a value that will continue to increment independent of the value
rollover? Or would this be better solved with pre-processing of the data before writing it into the database?
Thanks
This is indeed an interesting scenario. I don't believe there is anything built-in which would fix this entire use but since it's KQL there is usually at least one way of doing it.
Personally I'd approach this with windows functions. Namely serialise
, next
and row_cumsum
which does the incrementation you're asking for in your question. Also add in a case
statement for some logic.
An interesting possibly edge case I've commented in the code below is what happens if the value is the same across two sequential value
pairs? Do you assume it's gone full circle or assume it's not changed?
Since you also mention pre-processing before this data gets ingested that could absolutely be another sensible option too.
let RollOverPoint = 100; //When does our data rollover
let OdometerData = range timestamp from startofweek(ago(28d)) to endofday(now()) step 1h //Generate some data
| extend value = rand(RollOverPoint); //Assuming Odometer rolls over at x and flips to 0
OdometerData
| where timestamp between (ago(7d) .. now())
| order by timestamp asc
| serialize
| extend NextValue = next(value, 1)
| extend SumAmount = case (
NextValue == value, 0, //Deal with identical sequences
NextValue > value, toint(NextValue - value),
NextValue < value, toint(((RollOverPoint) - value) + NextValue),
0
)
| serialize TotalOdometer = row_cumsum(SumAmount)
| project timestamp, value, TotalOdometer
//| summarize arg_min(StartTime = timestamp, LowestValue = value), arg_max(EndTime = timestamp, TotalValue = TotalOdometer)
timestamp | value | TotalOdometer |
---|---|---|
2023-10-25T19:00:00.0000000Z | 20.0 | 80 |
2023-10-25T20:00:00.0000000Z | 0.0 | 102 |
2023-10-25T21:00:00.0000000Z | 22.0 | 156 |
2023-10-25T22:00:00.0000000Z | 76.0 | 188 |
2023-10-25T23:00:00.0000000Z | 8.0 | 249 |
2023-10-26T00:00:00.0000000Z | 69.0 | 338 |
2023-10-26T01:00:00.0000000Z | 58.0 | 372 |
2023-10-26T02:00:00.0000000Z | 92.0 | 380 |
2023-10-26T03:00:00.0000000Z | 0.0 | 435 |
2023-10-26T04:00:00.0000000Z | 55.0 | 500 |
etc | etc | etc |
Or example output with the summarize
above.
StartTime | LowestValue | EndTime | TotalValue |
---|---|---|---|
2023-10-25T20:00:00.0000000Z | 61 | 2023-11-01T19:00:00.0000000Z | 8337 |