apache-flink prometheus prometheus-alertmanager promql

Flink Watermark latency with PromQL


So I want to alert when my watermark falls behind.

I want to use the metrics reported by Flink's task managers. Something like this, but it doesn't work the way I'd like:

    (timestamp(flink_taskmanager_job_task_operator_currentInputWatermark{task_name=~"my_window.*"})-(4*60*60*1000))-flink_taskmanager_job_task_operator_currentInputWatermark{task_name=~"my_window.*"}

In words: I'd like the difference between the current time (the time when the metric was reported) and the watermark timestamp.

The (4*60*60*1000) term is meant to convert to EDT. Is there a better way to do this?


Solution

  • OK, so the query above was almost right. My mistake was subtracting 4h to shift to EDT when no shift was needed. This is the query that works:

    timestamp(flink_taskmanager_job_task_operator_currentInputWatermark{task_name="my_window",job_name="session"})*1000-flink_taskmanager_job_task_operator_currentInputWatermark{task_name="my_window",job_name="session"}
    

    flink_taskmanager_job_task_operator_currentInputWatermark reports its value in milliseconds, but timestamp() returns seconds, hence the *1000 conversion.
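
The unit conversion in that query can be checked with plain arithmetic. The following is only a minimal sketch in Python, not part of the original answer: the function name `watermark_lag_ms` and both sample values are hypothetical, chosen so the watermark sits exactly 60 seconds behind the sample time.

```python
# Hypothetical sample values, chosen only to illustrate the unit mismatch.
sample_time_s = 1_700_000_000       # what PromQL timestamp() yields: Unix seconds
watermark_ms = 1_699_999_940_000    # the Flink watermark metric: Unix milliseconds


def watermark_lag_ms(sample_time_s, watermark_ms):
    """Mirror of the query: timestamp(metric) * 1000 - metric."""
    return sample_time_s * 1000 - watermark_ms


# With the *1000 conversion both operands are in milliseconds.
lag_ms = watermark_lag_ms(sample_time_s, watermark_ms)
print(lag_ms)  # lag in milliseconds

# Without the conversion, seconds are subtracted from milliseconds and the
# result is a huge negative number that no alert threshold could use.
mixed_units = sample_time_s - watermark_ms
print(mixed_units)
```

For alerting, the same expression would be compared against a threshold in milliseconds, for example `> 60000` for a one-minute lag (the threshold is an assumption; pick whatever suits the job).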