How to search Splunk for transaction types that have a median latency above 3 seconds


I have a table that shows latency data. Now I want to write a query for an alert that fires when requests (method + URI) have a median latency higher than 3000 ms (3 s).

The query I use for that latency table is:

index=ms-app  environment=prod AND "*"
| eval uri=replace(mvindex(split('request.uri', "?"), 0), "\/\d+[-+\w]+", "/:n"), methodOverride='request.headers.X-HTTP-Method-Override'
| eval methodOverrideStr = if(isnull(methodOverride) OR methodOverride=="null", "", "(" + methodOverride + ")")
| eval request = 'request.method' + methodOverrideStr + " " + uri + " " + 'response.httpStatusCode'
| stats
min(stats.overallResponseTimeInMilliSeconds) as "Min",
avg(stats.overallResponseTimeInMilliSeconds) as avg_latency,
max(stats.overallResponseTimeInMilliSeconds) as "Max",
median(stats.overallResponseTimeInMilliSeconds) as "Median",
perc95(stats.overallResponseTimeInMilliSeconds) as "95th %",
count(request) as "# req total", count(eval('stats.overallResponseTimeInMilliSeconds' > 3000)) as "#>3s",
count(eval('stats.overallResponseTimeInMilliSeconds' > 5000)) as "#>5s",
count(eval('stats.overallResponseTimeInMilliSeconds' > 10000)) as "#>10s" by request
| eval "Avg" = round(avg_latency, 0)
| table request, "Median"

This produces a table displaying the median latencies by method + URI. For example:

  • POST /first-endpoint 1000
  • GET /second-endpoint 2000
  • DELETE /third-endpoint 1500
  • POST /fourth-endpoint 4000
  • GET /fifth-endpoint 4500

Now I am trying to create a query that shows only the method + URI combinations with a median latency above 3 s, so that I can create an alert that reports which endpoints have high latency. This is what I tried:

index=ms-app  environment=prod AND "*"
| eval uri=replace(mvindex(split('request.uri', "?"), 0), "\/\d+[-+\w]+", "/:n"), methodOverride='request.headers.X-HTTP-Method-Override'
| eval methodOverrideStr = if(isnull(methodOverride) OR methodOverride=="null", "", "(" + methodOverride + ")")
| eval request = 'request.method' + methodOverrideStr + " " + uri + " " + 'response.httpStatusCode'
| stats
median(stats.overallResponseTimeInMilliSeconds) as "Median"
| table request, "Median" > 3000

Which should display this:

  • POST /fourth-endpoint 4000
  • GET /fifth-endpoint 4500

However, it just shows the same results as the first query.


Solution

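  • Use the where command to filter results based on field values. The table command only selects which fields to display; it does not filter rows, so "Median" > 3000 has no filtering effect there.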

    ... | where Median > 3000
    | table request, Median
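
For reference, here is a sketch of how the complete search might look once the filter is in place. It reuses the eval steps from the question unchanged and assumes the stats call keeps the by request clause from the first query; without it, stats returns a single overall median and the request field is lost. Median is written without double quotes in where, because where treats a double-quoted value as a string literal rather than a field name.

```
index=ms-app  environment=prod AND "*"
| eval uri=replace(mvindex(split('request.uri', "?"), 0), "\/\d+[-+\w]+", "/:n"), methodOverride='request.headers.X-HTTP-Method-Override'
| eval methodOverrideStr = if(isnull(methodOverride) OR methodOverride=="null", "", "(" + methodOverride + ")")
| eval request = 'request.method' + methodOverrideStr + " " + uri + " " + 'response.httpStatusCode'
| stats median(stats.overallResponseTimeInMilliSeconds) as Median by request
| where Median > 3000
| table request, Median
```

With the sample data from the question, this should leave only the POST /fourth-endpoint and GET /fifth-endpoint rows. An alert built on this search could then trigger whenever the number of results is greater than zero.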