Tags: stackdriver, google-cloud-stackdriver, google-cloud-monitoring, monitoring-query-language

Divide two metrics in Google Cloud Monitoring / MQL


How do I compute the difference and ratio between two Stackdriver metrics in MQL?

There are two parts to this question, but I would be grateful for help with either one:

  1. Compute the difference between the two time series
  2. Compute the ratio between the two time series. Bonus, if possible: the case where the denominator is null should be handled gracefully.

This is what I have so far, but it does not yield the expected result (the resulting time series is always zero):

fetch global
| { t_0:
      metric 'custom.googleapis.com/http/server/requests/count'
      | filter
        (metric.service == 'service-a' && metric.uri =~ '/api/started')
      | align next_older(1m);
    t_1:
      metric 'custom.googleapis.com/http/server/requests/count'
      | filter
        (metric.service == 'service-a' && metric.uri =~ '/api/completed')
      | align next_older(1m)
  }
| outer_join 0
| div

Obviously, the code has been anonymized. What I want to accomplish is to track whether there is a difference between processes that have been started vs completed.

EDIT / ADDITIONAL INFO 2021-11-18

I used the projects.timeSeries v3 API for further debugging. Apparently the outer_join operation assumes that the two time series have the same labels, which is not the case in my example.

Does anybody know how to delete the labels so that I can perform the join and aggregation?

EDIT / ADDITIONAL INFO 2021-11-19

The aggregation now works as I managed to delete the labels using the map drop[...] maplet.

The challenge was indeed the labels, which are generated by the Spring Boot implementation of Micrometer. Because the labels differ between the two time series, the result was always empty for join and contained only the second time series for outer_join.


Solution

  • This is what I have so far. The aggregation now works because the labels are deleted. I will update this example when I know more.

    fetch global
    | { 
        t_0:
          metric 'custom.googleapis.com/http/server/requests/count'
          | filter
            (metric.service == 'service-a' && metric.uri =~ '/api/started')
          | every (1m)
          | map drop [resource.project_id, metric.status, metric.uri, metric.exception, metric.method, metric.service, metric.outcome]
        ; t_1:
          metric 'custom.googleapis.com/http/server/requests/count'
          | filter
            (metric.service == 'service-a' && metric.uri =~ '/api/completed')
          | every (1m)
          | map drop [resource.project_id, metric.status, metric.uri, metric.exception, metric.method, metric.service, metric.outcome]
      }
    | within   d'2021/11/18-00:00:00', d'2021/11/18-15:15:00'
    | outer_join 0
    | value val(0)-val(1)
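
    For the ratio (part 2 of the question), a minimal, untested sketch is to keep the same two inputs and change only the last two operations. It rests on two assumptions: join (an inner join) replaces outer_join 0, so that timestamps without a "completed" point are dropped instead of being given a denominator of 0, and div divides the first input (started) by the second (completed). How a denominator that is present but equal to 0 is treated is not covered here and should be checked against the MQL reference. The fixed within window from the example above is omitted.

    fetch global
    | {
        t_0:
          # numerator: requests that were started
          metric 'custom.googleapis.com/http/server/requests/count'
          | filter
            (metric.service == 'service-a' && metric.uri =~ '/api/started')
          | every (1m)
          | map drop [resource.project_id, metric.status, metric.uri, metric.exception, metric.method, metric.service, metric.outcome]
        ; t_1:
          # denominator: requests that were completed
          metric 'custom.googleapis.com/http/server/requests/count'
          | filter
            (metric.service == 'service-a' && metric.uri =~ '/api/completed')
          | every (1m)
          | map drop [resource.project_id, metric.status, metric.uri, metric.exception, metric.method, metric.service, metric.outcome]
      }
    # assumption: inner join, so points missing on either side are omitted
    | join
    # started / completed
    | div

    As in the difference query, dropping the labels is what allows the two inputs to match. If either filter matches more than one time series, the labels would have to be aggregated away rather than dropped.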