Search code examples
azure-application-insightsazure-data-explorerazure-analyticskql

How to access the range-step value within `toscalar()` statement used within `range()` statement


Am using a Kusto query to create a timechart within Azure AppInsights, to visualize when our webservice is within its SLO (and when it isn't) using one of Google's examples of measuring if a webservice is within its error budget:

SLI = The proportion of sufficiently fast requests, as measured from the load balancer metrics. “Sufficiently fast” is defined as < 400 ms.
SLO = 90% of requests < 400 ms

Measured as:
count of http_requests with a duration less than or equal to "0.4" seconds
divided by count of all http_requests

Assuming 10-minute inspection intervals over a 7-day window, here is my code:

let fastResponseTimeMaxMs = 400.0;
let errorBudgetThresholdForFastResponseTime = 90.0;
//
let startTime = ago(7days);
let endTime = now();
let timeStep = 10m;
//
let timeRange = range InspectionTime from startTime to endTime step timeStep;
timeRange
    | extend RespTimeMax_ms = fastResponseTimeMaxMs
    | extend ActualCount = toscalar
    (
        requests
            | where timestamp > InspectionTime - timeStep
            | where timestamp <= InspectionTime
            | where success == "True"
            | where duration <= fastResponseTimeMaxMs
            | count 
    )
    | extend TotalCount = toscalar
    (
        requests 
            | where timestamp > InspectionTime - timeStep
            | where timestamp <= InspectionTime
            | where success == "True"
            | count
    )
    | extend Percentage = round(todecimal(ActualCount * 100) / todecimal(TotalCount), 2)
    | extend ErrorBudgetMinPercent = errorBudgetThresholdForFastResponseTime
    | extend InBudget = case(Percentage >= ErrorBudgetMinPercent, 1, 0)

Sample query output of what I wish to achieve:

InspectionTime [UTC]     RespTimeMax_ms  ActualCount  TotalCount  Percentage  ErrorBudgetMinPercent  InBudget
2019-05-23T21:53:17.894  400             8,098        8,138       99.51       90                     1  
2019-05-23T22:03:17.894  400             8,197        9,184       89.14       90                     0  
2019-05-23T22:13:17.894  400             8,002        8,555       93.54       90                     1  

The error I'm getting is:

'where' operator: Failed to resolve scalar expression named 'InspectionTime'

I've tried todatetime(InspectionTime), fails with same error.

Replacing InspectionTime with other objects of type datetime gets this code to execute OK, but not with the datetime values that I want. By example, using this snippet executes OK, when used within my code sample above:

   | extend ActualCount = toscalar
    (
        requests
            | where timestamp > startTime   // instead of 'InspectionTime - timeStep'
            | where timestamp <= endTime    // instead of 'InspectionTime'
            | where duration <= fastResponseTimeMaxMs
            | count   
    )

To me it seems that using InspectionTime within toscalar(...) is the crux of this problem, since I'm able to use InspectionTime within similar queries using range(...) that don't nest it within toscalar(...).

Note: I don't want a timechart chart of request.duration, since that doesn't tell me if the count of requests above my threshold (400ms) exceed our error budget according to the formula defined above.


Solution

  • your query is invalid as you can't reference the InspectionTime column in the subquery that you're running in toscalar().

    if I understand your desired logic correctly, the following query might work or give you a different direction (if not - you may want to share a sample input dataset using the datatable operator, and specify the desired result that matches it)

    let fastResponseTimeMaxMs = 400.0;
    let errorBudgetThresholdForFastResponseTime = 90.0;
    //
    let startTime = ago(7days);
    let endTime = now();
    let timeStep = 10m;
    //
    requests
    | where timestamp > startTime and timestamp < endTime
    | where success == 'True'
    | summarize TotalCount = count(), ActualCount = countif(duration <= fastResponseTimeMaxMs) by bin(timestamp, timeStep)
    | extend Percentage = round(todecimal(ActualCount * 100) / todecimal(TotalCount), 2)
    | extend ErrorBudgetMinPercent = errorBudgetThresholdForFastResponseTime
    | extend InBudget = case(Percentage >= ErrorBudgetMinPercent, 1, 0)