google-cloud-platform, google-cloud-stackdriver, health-monitoring

Reduce alert noise in GCP Stackdriver


We have set up alerts in our GCP environments. GCP Stackdriver raises alerts based on parameters that we configured, at both the infrastructure level and the application level.

The issue is that we get too many alerts when a problem is not resolved quickly. For example, if a Compute Engine instance is down and we are already investigating, we still keep getting alerts. We are looking for a way to reduce alert noise so that, once we acknowledge an issue, the alert frequency drops until the issue is resolved (for example, one mail every three hours rather than one every 10 minutes, or a single mail after the problem is fixed).


Solution

  • Posting this as an answer for better usability.

    When an alert is triggered, you will receive notifications every 10 minutes or so until you acknowledge the incident.

    Once you acknowledge it, the notifications stop, but the incident stays open until you close it.

    You can also silence the incident; however, doing so may also close other incidents that were triggered by the same condition.

    You may also want to look at the alerting behavior docs, since they may prove useful in such cases.
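
To make the behavior described above concrete, here is a small Python model of it. This is only a sketch and does not call any Cloud Monitoring API: the `Incident` class, the `notification_times` function, and the fixed 10-minute interval are hypothetical names and values chosen to mirror the description, so the real timing may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Incident:
    """Toy model of an incident (hypothetical, not a Cloud Monitoring object)."""
    opened_at: datetime
    acknowledged_at: Optional[datetime] = None
    closed_at: Optional[datetime] = None

    def is_open(self, now: datetime) -> bool:
        # Acknowledging does not close the incident; only closing does.
        return self.closed_at is None or now < self.closed_at


def notification_times(
    incident: Incident,
    until: datetime,
    interval: timedelta = timedelta(minutes=10),  # assumed repeat interval
) -> List[datetime]:
    """Return the times up to `until` at which a notification would be sent.

    Notifications repeat every `interval` starting at `opened_at` and stop
    as soon as the incident is acknowledged or closed.
    """
    stops = [t for t in (incident.acknowledged_at, incident.closed_at) if t is not None]
    cutoff = min(stops) if stops else None

    times: List[datetime] = []
    t = incident.opened_at
    while t <= until and (cutoff is None or t < cutoff):
        times.append(t)
        t += interval
    return times


opened = datetime(2024, 1, 1, 9, 0)
three_hours_later = opened + timedelta(hours=3)

# Acknowledged 35 minutes in: notifications at 09:00, 09:10, 09:20, 09:30.
acked = Incident(opened_at=opened, acknowledged_at=opened + timedelta(minutes=35))
print(len(notification_times(acked, until=three_hours_later)))  # 4
print(acked.is_open(three_hours_later))  # True: still open until closed

# Never acknowledged: one notification every 10 minutes for the full 3 hours.
ignored = Incident(opened_at=opened)
print(len(notification_times(ignored, until=three_hours_later)))  # 19
```

In this model, acknowledging early is what cuts the mail volume (4 notifications instead of 19 over three hours), while the incident itself stays open until someone closes it.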