I'm using Prometheus to detect pauses in inbound traffic to an API endpoint. The query below seems to be working nicely: it alerts if the HTTP request count for a given endpoint has not changed over a one-hour time period:
sum by(path, custom_identifier, kubernetes_namespace) (
  delta(http_request_duration_seconds_count{
    status_code="201",
    method="POST",
    path="/api/v1/path/to/endpoint"}[1h])
) == 0
What I want, though, is for this alert to resolve immediately upon resumption of traffic. As currently written, it won't resolve until traffic has been restored for the same time period (1h).
Is this a thing that Prometheus can do? Is there a different method of accomplishing this goal?
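Keep the range in the expression short and move the one-hour requirement into the rule's "for" clause. The alert still fires only after the expression has been true for a full hour, but it resolves at the first evaluation where the short window sees traffic again:
- alert: AlertName
  # Recovers about 1 minute after traffic resumes.
  expr: |
    sum by(path, custom_identifier, kubernetes_namespace) (
      delta(http_request_duration_seconds_count{
        status_code="201",
        method="POST",
        path="/api/v1/path/to/endpoint"}[1m])
    ) == 0
  for: 1h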
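One way to check the firing and recovery timing before deploying is a promtool rule unit test. The following is a minimal sketch, not a definitive setup. It assumes the rule above is saved inside a rule group in a hypothetical file named alerts.yml, that scrapes arrive every 15s (the 1m range needs at least two samples to produce a value), and that the series carries made-up label values custom_identifier="example" and kubernetes_namespace="default".

```yaml
# test.yml (hypothetical name); run with: promtool test rules test.yml
rule_files:
  - alerts.yml   # hypothetical file containing the AlertName rule in a rule group

evaluation_interval: 1m

tests:
  - interval: 15s   # assumed scrape interval for the synthetic samples
    input_series:
      # The counter stays flat for 75 minutes, then starts increasing again.
      - series: 'http_request_duration_seconds_count{status_code="201", method="POST", path="/api/v1/path/to/endpoint", custom_identifier="example", kubernetes_namespace="default"}'
        values: '5+0x300 6+1x20'
    alert_rule_test:
      # 30 minutes without traffic: the alert is only pending, not firing.
      - eval_time: 30m
        alertname: AlertName
        exp_alerts: []
      # More than an hour without traffic: the alert should be firing.
      - eval_time: 70m
        alertname: AlertName
        exp_alerts:
          - exp_labels:
              path: /api/v1/path/to/endpoint
              custom_identifier: example
              kubernetes_namespace: default
      # Traffic resumed shortly after 75m: the alert should no longer be firing.
      - eval_time: 77m
        alertname: AlertName
        exp_alerts: []
```

If the real scrape interval is 60s or longer, a 1m range may contain fewer than two samples and the expression would return nothing, so the range would need to be widened (for example to a few scrape intervals), at the cost of a correspondingly slower recovery.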