Search code examples
amazon-cloudwatchamazon-cloudwatchlogsamazon-cloudwatch-metrics

AWS Cloudwatch alarm set to NonBreaching (or notBreaching) is not triggering, based on a log filter


With the following Metric and Alarm combination

  • Metric
    • Comes from a Cloudwatch log filter (when a match is found on the log)
    • Metric value: "1"
    • Default value: None
    • Unit: Count
  • Alarm
    • Statistic: Sum
    • Period: 1 minute
    • Treat missing data as: notBreaching
    • Threshold: [Metric] > 0 for 1 datapoints within 1 minute

The alarm goes to: State changed to OK at 2018/12/17.

Reason: Threshold Crossed: no datapoints were received for 1 period and 1 missing datapoint was treated as [NonBreaching].

And then it doesn't trigger, even though I force the metric > 0

Why is the alarm stuck in OK? How can the alarm become triggered again?


Solution

  • Solution

    Remove the "Unit" property from the stack template Alarm config.

    The source of the problem was actually the "Unit" property. This being set to "Count" actually made the alarm become stuck :(

    Ensure the stack is producing the same result as a manual alarm setup by checking with the describe-alarms API.