We have a job that pulls messages off a Kafka topic. The job runs hourly and it's important that the job complete before the next hour arrives.
I'm trying to set up an alert that will tell me that the consumergroup_lag isn't going to hit zero before the next hour occurs. I'm not really sure how to do this in PromQL.
So far, I have
predict_linear(kafka_consumergroup_group_lag{topic="foo",group="bar"}[10m], 3600) > 0
but that always looks 60 minutes ahead rather than only as far as the next hour boundary. I've looked at the hour() function and at joining, but I'm not putting two and two together. I don't even know whether what I want is possible in PromQL. Basically, I want to replace the 3600 in the above query with something like the little shell bit below:
$(( $(date -d "$(date -d 'next hour' +'%Y-%m-%d %H:00:00%z')" +%s) - $(date +%s) ))
I know there will be issues around daylight saving time changes, but I'm not too worried about that right now.
You could try using the predict_linear function with its second argument (how many seconds ahead to predict) set to the number of seconds remaining until the next hour. That lets you check whether the lag is predicted to be gone by the time the next hour arrives. Something like:
predict_linear(
  kafka_consumergroup_group_lag{topic="my-topic",consumergroup="group"}[10m],
  3600 - (time() % 3600)
) > 0
The expression 3600 - (time() % 3600) evaluates to the number of seconds until the next hour, since time() is the evaluation timestamp in seconds since the Unix epoch. It won't land exactly on the :00 mark, but it should give you a good approximation.
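To convince yourself that the modulo trick matches the shell snippet from the question, here is a minimal sketch of the same arithmetic in bash. The function name secs_to_next_hour is made up for illustration, and the optional argument (a Unix timestamp) exists only so the arithmetic can be checked against fixed values instead of the current clock.

```shell
#!/usr/bin/env bash

# Hypothetical helper: seconds remaining until the next hour boundary.
# Takes an optional Unix timestamp; defaults to the current time.
secs_to_next_hour() {
  local now=${1:-$(date +%s)}
  # Same arithmetic as 3600 - (time() % 3600) in the PromQL expression.
  echo $(( 3600 - now % 3600 ))
}

# Print the value for the current time.
secs_to_next_hour
```

Note that exactly on an hour boundary this gives 3600 rather than 0, so the prediction then looks a full hour ahead to the following boundary.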