google-cloud-platform monitoring

Alert if same value is received for more than 1 hour in Google Cloud


I created a Cloud Function in Google Cloud that runs every 10 minutes, sends a query to a PostgreSQL database, and returns a value. I then created an alert policy using this MQL query:

fetch cloud_function
| metric 'logging.googleapis.com/user/FUNCTION-NAME'
| align delta(10m)
| group_by [], [value_mean: mean(value.NAME)]
| absent_for (1h)

I want to receive an alert when there is no value, or when the value stays the same for longer than 1 hour. The query above did not send an alert even though the value had been the same for over 12 hours. I may have wrongly assumed that absent_for also covers the case where the value stays the same, when it only covers missing data.

How can I change my query so that I am notified when the value remains the same for longer than 1 hour?
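To make the requirement concrete, here is a minimal Python sketch of the condition being asked for, written outside MQL. It is only an illustration of the logic, not something Cloud Monitoring runs: the function name `should_alert` and the `(timestamp, value)` sample format are assumptions made for this example.

```python
from datetime import datetime, timedelta


def should_alert(samples, now, window=timedelta(hours=1)):
    """Return True if the window holds no samples, or only identical values.

    `samples` is a list of (timestamp, value) pairs (hypothetical format).
    """
    # Keep only the values reported inside the look-back window.
    recent = [value for ts, value in samples if now - window <= ts <= now]
    if not recent:
        return True  # no value at all: the "absent" case
    # At least two samples are needed before calling the value "unchanged".
    return len(recent) > 1 and len(set(recent)) == 1


now = datetime(2023, 10, 11, 12, 0)
# One sample every 10 minutes over the last hour (7 samples).
flat = [(now - timedelta(minutes=10 * i), 800000) for i in range(7)]
moving = [(now - timedelta(minutes=10 * i), 800000 - i) for i in range(7)]
```

With these inputs, `should_alert(flat, now)` and `should_alert([], now)` are both true, while `should_alert(moving, now)` is false. The two cases (missing data, unchanged data) are separate checks, which is why a query built only on absent_for cannot catch the second one.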

EDIT 11/10/2023: I changed my query to:

{
    fetch cloud_function
    | metric 'logging.googleapis.com/user/FUNCTION-NAME'
    | group_by [], mean(val())
    | align next_younger(10m)
    ;
    fetch cloud_function
    | metric 'logging.googleapis.com/user/FUNCTION-NAME'
    | group_by [], mean(val())
    | align next_older(10m)
}
| outer_join 0,0
| sub
| condition gt(val(), 0)

The problem now is that the most recent value shown is -800K, which prevents the condition from triggering. The earlier values are correct and show only the differences (2, 5, 0, etc.). So even when the graph showed differences of 0, no alert was triggered.


Solution

  • I finally found the solution:

    fetch cloud_function
    | { t_0:
          metric 'logging.googleapis.com/user/FUNCTION-NAME'
          | align delta()
          | group_by 10m, [value_Collector_mean: mean(value.Height)]
    
      ; t_1:
          metric 'logging.googleapis.com/user/FUNCTION-NAME'
          | align delta()
          | group_by 20m,
              [value_BTC_current_db_max_block_height_mean:
                 mean(value.Height)] }
    | join
    | value t_0 - t_1
    | condition eq(val(),0)
    

    I also set the Retest window to 30 minutes so that I am not alerted every time the difference equals 0. Hope this helps someone else.
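The idea behind the query above can be illustrated with a small Python sketch: if the mean over the last 10 minutes equals the mean over the last 20 minutes, the value has not moved between the two most recent samples. This is a simplified model that assumes one sample every 10 minutes and ignores the delta alignment step; `window_mean_difference` and its parameters are hypothetical names used only for illustration.

```python
from statistics import mean


def window_mean_difference(values, short=1, long=2):
    """Mean of the last `short` samples minus mean of the last `long` samples.

    `values` is ordered oldest first, one sample per 10 minutes (assumption),
    so short=1 models the 10m window and long=2 models the 20m window.
    """
    return mean(values[-short:]) - mean(values[-long:])


flat_diff = window_mean_difference([800000, 800000, 800000])
moving_diff = window_mean_difference([800000, 800002, 800005])
```

Here `flat_diff` is 0, which is the case `condition eq(val(),0)` is meant to catch, and `moving_diff` is non-zero. A single zero difference only shows that two consecutive samples matched, so a retest window is what turns this into "unchanged for a sustained period".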