prometheus, grafana, grafana-variable

One graph from many metrics in one query


I have a large number of custom metrics like:

Request_GetData_XXXX_duration_seconds_sum
Request_GetData_XXXX_duration_seconds_count
Request_GetData_YYYY_duration_seconds_sum
Request_GetData_YYYY_duration_seconds_count

Their number and names (XXXX and YYYY) may change, but they always come in pairs ending in _sum and _count. In Grafana, I plot the average like this:

sum by (container) (
  rate(Request_GetData_XXXX_duration_seconds_sum{}[5m]) / 
  rate(Request_GetData_XXXX_duration_seconds_count{}[5m]) )

Everything works.

I made a drop-down menu with a query variable: the query {__name__=~"Request_GetData_.*_seconds_count$"} combined with the Regex /(.+)_count{/ gives me the list of metrics.

Result:

  • Request_GetData_XXXX_duration_seconds
  • Request_GetData_YYYY_duration_seconds
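
To illustrate what that Regex does to each returned series, here is a minimal Python sketch. The `container` label values and the `variable_values` helper are made up for the example, and dropping duplicates is an assumption about how the drop-down ends up with one entry per metric.

```python
import re

# Series roughly as the variable query might return them (label values are made up).
series = [
    'Request_GetData_XXXX_duration_seconds_count{container="a"}',
    'Request_GetData_YYYY_duration_seconds_count{container="a"}',
    'Request_GetData_XXXX_duration_seconds_count{container="b"}',
]

# Same pattern as the Regex field, without the surrounding slashes.
# The brace is escaped here only to make it an explicit literal.
pattern = re.compile(r'(.+)_count\{')

def variable_values(series_list):
    """Return the first capture group of every match, without duplicates."""
    values = []
    for s in series_list:
        m = pattern.search(s)
        if m and m.group(1) not in values:
            values.append(m.group(1))
    return values

print(variable_values(series))
```

The capture group keeps everything before `_count{`, which is why the same value can later be suffixed with either `_sum` or `_count`.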

In the Grafana panel query I then write:

sum by (container) (rate(${myVar_metrics}_sum{}[5m]) / rate(${myVar_metrics}_count{}[5m]))

This works too. But what I actually need is a single graph showing all of the metrics at once: Request_GetData_XXXX_duration_seconds, Request_GetData_YYYY_duration_seconds, and so on.

How can this be done? Enabling the Multi-value and Include All options on the variable breaks the graph.


Solution

  • First of all, consider converting your metrics to follow Prometheus best practices. I would expect something like this:

    mysystem_request_duration_seconds_sum{type="XXXX"}
    mysystem_request_duration_seconds_count{type="XXXX"}
    mysystem_request_duration_seconds_sum{type="YYYY"}
    mysystem_request_duration_seconds_count{type="YYYY"}
    

    This way you wouldn't need all this hassle with metric names.
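
As a rough illustration of that renaming, the sketch below maps an old-style name onto the suggested name plus a `type` label. The `to_labeled` helper is hypothetical, and `mysystem` and `type` are just the placeholder names from the suggestion above; in practice the change would be made where the metrics are produced.

```python
import re

# Old naming scheme: the request type is embedded in the metric name.
OLD_NAME = re.compile(r'Request_GetData_(.+)_duration_seconds_(sum|count)')

def to_labeled(old_name):
    """Split an old metric name into (new metric name, labels).

    Returns None when the name does not follow the old scheme.
    """
    m = OLD_NAME.fullmatch(old_name)
    if m is None:
        return None
    request_type, suffix = m.groups()
    return f'mysystem_request_duration_seconds_{suffix}', {'type': request_type}

print(to_labeled('Request_GetData_XXXX_duration_seconds_sum'))
print(to_labeled('Request_GetData_YYYY_duration_seconds_count'))
```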

    In this case your query would look like this:

    sum by (container) (
    rate(mysystem_request_duration_seconds_sum{type="$myVar"}[5m]) / 
    rate(mysystem_request_duration_seconds_count{type="$myVar"}[5m]) )
    

    For your current situation you'll need to employ a couple of tricks:

    1. A bare metric name cannot be a regex, so to select several metrics by name you need to match on the __name__ label: {__name__=~"${myVar_metrics}_sum"}.
    2. To divide metrics, they need to have matching labels.
      You can add a label derived from the metric name using label_replace:
    label_replace({__name__=~"${myVar_metrics}_sum"} ,"foo", "$1", "__name__", "(${myVar_metrics})_sum")
    
    3. A range selector can only be applied to a metric selector. Since label_replace is a function and not a metric selector, you need to use subquery syntax. Basically, the only difference is a colon added inside the range selector:
    rate(label_replace({__name__=~"${myVar_metrics}_sum"} ,"foo", "$1", "__name__", "(${myVar_metrics})_sum")[5m : ])
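
To see why the extra label matters, here is a small Python sketch that roughly emulates label_replace on plain dictionaries. It is a simplification: only `$1` is supported in the replacement, the `match_key` helper is hypothetical, and the label values are made up. The point is that after the `foo` label is added, each _sum series lines up with exactly one _count series.

```python
import re

def label_replace(labels, dst, replacement, src, regex):
    """Rough emulation of PromQL label_replace for one series' label set."""
    m = re.fullmatch(regex, labels.get(src, ''))  # PromQL regexes are fully anchored
    if m is None:
        return dict(labels)  # no match: the series is returned unchanged
    out = dict(labels)
    out[dst] = replacement.replace('$1', m.group(1))  # only $1 is handled here
    return out

def match_key(labels):
    """Labels that take part in matching the two sides of a division.
    The metric name is excluded (rate() drops it anyway)."""
    return frozenset((k, v) for k, v in labels.items() if k != '__name__')

# What a multi-value ${myVar_metrics} is assumed to expand to.
prefix = '(Request_GetData_XXXX_duration_seconds|Request_GetData_YYYY_duration_seconds)'

x_sum = {'__name__': 'Request_GetData_XXXX_duration_seconds_sum', 'container': 'a'}
x_count = {'__name__': 'Request_GetData_XXXX_duration_seconds_count', 'container': 'a'}
y_count = {'__name__': 'Request_GetData_YYYY_duration_seconds_count', 'container': 'a'}

# Without "foo", the XXXX sum cannot be told apart from the YYYY count.
print(match_key(x_sum) == match_key(y_count))

s = label_replace(x_sum, 'foo', '$1', '__name__', f'({prefix})_sum')
c = label_replace(x_count, 'foo', '$1', '__name__', f'({prefix})_count')
o = label_replace(y_count, 'foo', '$1', '__name__', f'({prefix})_count')

# With "foo", XXXX sum pairs with XXXX count only.
print(match_key(s) == match_key(c), match_key(s) == match_key(o))
```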
    

    Resulting query would look like this:

    sum by (container) (
      rate(
        label_replace(
          {__name__=~"${myVar_metrics}_sum"},
          "foo", "$1",
          "__name__", "(${myVar_metrics})_sum"
        )[5m:]
      )
      /
      rate(
        label_replace(
          {__name__=~"${myVar_metrics}_count"},
          "foo", "$1",
          "__name__", "(${myVar_metrics})_count"
        )[5m:]
      )
    )
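
For reference, the sketch below shows what the selector is expected to expand to once several values are selected. It assumes Grafana formats a multi-value variable for Prometheus as a regex alternation like `(a|b)`; the `interpolate` and `name_selector` helpers are hypothetical, and any escaping Grafana applies to special characters is left out.

```python
def interpolate(values):
    """Assumed formatting of a variable: a single value as is, several as (a|b|c)."""
    if len(values) == 1:
        return values[0]
    return '(' + '|'.join(values) + ')'

def name_selector(values, suffix):
    """Build the __name__ matcher used in the query above."""
    return '{__name__=~"' + interpolate(values) + '_' + suffix + '"}'

selected = [
    'Request_GetData_XXXX_duration_seconds',
    'Request_GetData_YYYY_duration_seconds',
]

# The original form, ${myVar_metrics}_sum{}, would expand to something that
# is not a valid metric name once two values are selected:
print(interpolate(selected) + '_sum{}')

# The __name__ form stays a valid selector:
print(name_selector(selected, 'sum'))
```

This is also the likely reason the original query breaks with Multi-value enabled: the alternation only makes sense inside a regex matcher.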