google-cloud-platform google-cloud-pubsub google-cloud-monitoring

Google Cloud Monitoring: Add an alert if Publish succeeds and subscribe fails


I want to add an alert in Google Cloud Monitoring that, for a given topic and subscription, tells me when messages are being published to the topic but the subscription is not acknowledging them at the same or a similar rate over a given time frame.

How can this be achieved using alerts in Google Cloud Monitoring (Stackdriver)?

I have tried an approach where I have 2 conditions to satisfy:

  1. Publish operations > 0.016/sec for 2 minutes (meaning at least one publish per minute)
  2. Subscribe acknowledgments < 0.001/sec for 2 minutes (meaning no acknowledgments at all in 2 minutes)

Then, alert.

What happens is that during low load, if there are no publishes for a span of, say, 3 minutes and then a publish happens, conditions 1 and 2 both evaluate to true and the devs are alerted to a failure that is not one.

So, what is the right way of designing such alerts?

If my approach is close to what I want, the next questions that come to mind are:

  1. Is there a way to start counting the two minutes from the moment a publish happens, and then check whether the acknowledgment condition holds?
  2. Or is there a way to make the alert wait 2-3 minutes to see whether the incident resolves before notifying the devs?
  3. Or is there a way to count how often these conditions are met and alert only if that happens more than 5 or 10 times in a span of 15 minutes or so?
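To make the third idea concrete, here is a minimal Python sketch of the logic I have in mind. It is not a Cloud Monitoring feature or API; the class name `OccurrenceAlert`, its parameters, and the idea of feeding it one evaluation result at a time are all assumptions made purely for illustration.

```python
from collections import deque


class OccurrenceAlert:
    """Hypothetical helper: fire only after repeated violations in a time span."""

    def __init__(self, max_occurrences=5, span_sec=15 * 60):
        self.max_occurrences = max_occurrences
        self.span_sec = span_sec
        self._hits = deque()  # timestamps (seconds) at which both conditions held

    def record(self, timestamp, condition_met):
        """Record one evaluation; return True if an alert should fire."""
        if condition_met:
            self._hits.append(timestamp)
        # Drop violations that have fallen out of the span.
        while self._hits and self._hits[0] <= timestamp - self.span_sec:
            self._hits.popleft()
        return len(self._hits) > self.max_occurrences
```

With one evaluation per minute, a single spurious violation after an idle period would never fire on its own; only a sixth violation within the same 15 minutes would.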

Sorry for the long post; any kind of help is appreciated.


Solution

  • Rates for these operations are calculated over a time window of 2-3 minutes, so whenever there are no operations for 2 minutes or longer, this issue recurs. This is described in the documentation on partial metrics, which also lists workarounds.

    You can try creating your own custom metrics.
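As a rough sketch of what such a custom metric could compute (one possible choice, not something taken from the documentation), the function below counts publishes that are older than a grace period and are not yet matched by an acknowledgment. The name `unacked_backlog`, the `grace_sec` parameter, and the use of plain timestamp lists are assumptions for illustration; reporting the value to Cloud Monitoring is not shown.

```python
def unacked_backlog(publish_times, ack_times, now, grace_sec=120):
    """Publishes older than grace_sec minus all acks seen so far, floored at 0.

    All times are in seconds on the same clock. A publish that just happened
    is ignored until grace_sec has passed, so a lone publish after an idle
    period does not count against the subscriber immediately.
    """
    due = sum(1 for t in publish_times if t <= now - grace_sec)
    acked = sum(1 for t in ack_times if t <= now)
    return max(0, due - acked)
```

An alert on this value being above 0 would start its two minutes at the publish itself, which is what the first follow-up question asks for. The sketch assumes counts are accumulated from a common starting point; a real implementation would need to bound or reset them.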