Tags: google-cloud-platform, google-cloud-functions, google-cloud-logging, google-cloud-monitoring

GCP Alert Filters Don't Affect Open Incidents


I have configured an alert to send an email when the sum of Cloud Function executions that finished with a status other than 'error' or 'ok' is above 0 (grouped by function name).

The way I defined the alert is:

[Image: alert definition]

And the secondary aggregator is delta.

The problem is that once the alert is open, the filters don't seem to matter any more: the incident stays open because it sees that the Cloud Function is triggered and finishes with any status (even 'ok' keeps it open, as long as the function is triggered often enough).

At the moment, the only solution I can think of is to define a log-based metric that does the counting itself, and base the alert on that custom metric instead of the built-in one.
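As a sketch of that workaround, a log-based metric could count only the "bad" finishes. The metric name and the exact filter strings below are assumptions (Cloud Functions log lines of the form "Function execution took ... finished with status: ..."); adjust them to match your actual logs:

```shell
# Hypothetical sketch: create a log-based metric that counts Cloud Function
# executions finishing with a status other than 'ok' or 'error'.
# The metric name and textPayload patterns are assumptions.
gcloud logging metrics create bad_function_finishes \
  --description="Function executions finishing with an unexpected status" \
  --log-filter='resource.type="cloud_function" AND textPayload:"finished with status" AND NOT textPayload:"status: ok" AND NOT textPayload:"status: error"'
```

An alert on this metric only receives data points when a matching log line actually occurs, which is the behavior the built-in metric's filters fail to provide here.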

Is there something that I'm missing?

Edit:

Adding another image to show what I think might be the problem: [Image: incident]

From the image above we can see that the graph won't go down to 0 but stays at 1, which is not how other incidents normally behave.


Solution

  • According to the official documentation:

    "Monitoring automatically closes an incident when it observes that the condition is no longer met or when 7 days have passed without an observation that the condition is still being met."

    That made me think that there are times where the condition is not relevant to make it close the incident. Which is confirmed here:

    "If measurements are missing (for example, if there are no HTTP requests for a couple of minutes), the policy uses the last recorded value to evaluate conditions."

The lack of HTTP requests isn't a reason to close the incident, since the policy keeps using the last recorded value (the one that triggered the alert).

So, using alerts on HTTP-request-style metrics is fine, but you need to close the incidents yourself. If you want them to close automatically, it would be better to base the alert on a custom (log-based) metric instead.
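To illustrate that approach, an alert policy can be created against a log-based metric via the Monitoring API. This is a hedged sketch: the metric name `bad_function_finishes`, the threshold, and the alignment period are all assumptions to adapt to your setup:

```shell
# Hypothetical sketch: alert when the custom log-based metric
# 'bad_function_finishes' records anything, grouped by function name.
cat > policy.json <<'EOF'
{
  "displayName": "Function finished with unexpected status",
  "combiner": "OR",
  "conditions": [{
    "displayName": "bad finishes > 0",
    "conditionThreshold": {
      "filter": "metric.type=\"logging.googleapis.com/user/bad_function_finishes\" resource.type=\"cloud_function\"",
      "comparison": "COMPARISON_GT",
      "thresholdValue": 0,
      "duration": "0s",
      "aggregations": [{
        "alignmentPeriod": "300s",
        "perSeriesAligner": "ALIGN_DELTA",
        "groupByFields": ["resource.label.function_name"]
      }]
    }
  }]
}
EOF
gcloud alpha monitoring policies create --policy-from-file=policy.json
```

Because the metric's data points only exist when matching log entries occur, the condition stops being met once the bad finishes stop, and Monitoring can close the incident on its own.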