I'm trying to create an alerting policy for a Gauge type metric in Google Cloud Platform that is expected to trigger when:
Using Terraform, I came up with the following definition:
resource "google_monitoring_alert_policy" "alert_policy_prometheus_metric" {
  display_name = "Metric check failed"

  conditions {
    display_name = "Metric violation"

    condition_threshold {
      filter                  = "resource.type = \"prometheus_target\" AND resource.labels.cluster = \"${var.cluster_name}\" AND metric.type = \"prometheus.googleapis.com/elasticsearch_cluster_health_status/gauge\" AND metric.labels.color = \"green\""
      evaluation_missing_data = "EVALUATION_MISSING_DATA_ACTIVE"
      comparison              = "COMPARISON_LT"
      duration                = "60s"
      threshold_value         = 1

      trigger {
        count = 1
      }
    }
  }

  conditions {
    display_name = "Metric absent"

    condition_absent {
      duration = "120s"
      filter   = "resource.type = \"prometheus_target\" AND resource.labels.cluster = \"${var.cluster_name}\" AND metric.type = \"prometheus.googleapis.com/elasticsearch_cluster_health_status/gauge\" AND metric.labels.color = \"green\""
    }
  }

  combiner = "OR"

  notification_channels = [
    var.monitoring_email_group_name
  ]
}
This does work, however it creates two separate incidents when the following happens:
This is surprising to me, as I use OR as the combiner for the conditions. Is there anything I can do to merge the two conditions into a single incident?
When the multi-condition trigger is set to “Any condition is met” (i.e., combiner value is "OR") and condition 1 or 2 is met, only a single incident is created.
When the multi-condition trigger is set to “All conditions are met” and both conditions are met, two incidents are created: one for the event that crossed the threshold for condition 1, and one for the event that crossed the threshold for condition 2.
When the multi-condition trigger is set to “All conditions are met” and only one of the two conditions is met, no incident is created.
I found in the documentation that it is currently not possible to reduce the number of incidents to one when a policy contains multiple conditions. The document states:
“You can't configure Cloud Monitoring to create a single incident and send a single notification when the policy contains multiple conditions.”
This feature may be implemented for GCP alerting and incidents in the future. A similar request was raised by another GCP customer, and you can track it in this issue tracker.
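Until then, one possible workaround, given the limitation quoted above, is to avoid having two conditions at all. The policy in the question already sets evaluation_missing_data = "EVALUATION_MISSING_DATA_ACTIVE" on the threshold condition, which tells Cloud Monitoring to treat missing data as a violation, so the separate "Metric absent" condition may be redundant. The following is only an untested sketch under that assumption; the resource name alert_policy_single_condition and the display names are hypothetical, and the variables are the ones from the question:

```hcl
# Hypothetical single-condition variant of the policy from the question.
# Untested sketch: it assumes that treating missing data as a violation
# is an acceptable replacement for the separate condition_absent block.
resource "google_monitoring_alert_policy" "alert_policy_single_condition" {
  display_name = "Metric check failed"

  # combiner is still a required argument, even with a single condition.
  combiner = "OR"

  conditions {
    display_name = "Metric violation or missing data"

    condition_threshold {
      filter = "resource.type = \"prometheus_target\" AND resource.labels.cluster = \"${var.cluster_name}\" AND metric.type = \"prometheus.googleapis.com/elasticsearch_cluster_health_status/gauge\" AND metric.labels.color = \"green\""

      # Violation when the gauge drops below 1 for the full duration.
      comparison      = "COMPARISON_LT"
      threshold_value = 1
      duration        = "60s"

      # Missing data is evaluated as if the condition were met.
      evaluation_missing_data = "EVALUATION_MISSING_DATA_ACTIVE"

      trigger {
        count = 1
      }
    }
  }

  notification_channels = [
    var.monitoring_email_group_name
  ]
}
```

The trade-off is that the separate 120s absence window from the original policy is lost: missing data is handled by the threshold condition and its 60s duration instead, so it is worth checking in a test project that incidents open and close the way you expect.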
Note: Also take a look at the “All conditions are met even for different resources for each condition” option (i.e., combiner value is "AND").
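For reference, here is how the console options discussed above map onto the combiner argument in Terraform, as far as I understand the API documentation. This is a fragment to drop into the existing resource, not a complete policy, and the mapping of “All conditions are met” to AND_WITH_MATCHING_RESOURCE is my reading of the docs rather than something confirmed in this thread:

```hcl
# Fragment of a google_monitoring_alert_policy resource; pick exactly one value.

# "Any condition is met"
combiner = "OR"

# "All conditions are met even for different resources for each condition"
# combiner = "AND"

# "All conditions are met" (all conditions met on the same resource)
# combiner = "AND_WITH_MATCHING_RESOURCE"
```

None of these values changes the limitation quoted above; they only control when the policy as a whole is considered to be violated.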