prometheus grafana promql

Last known timestamp of a metric sample?


How do I get the timestamp of the last known sample of a metric? I want to display a table that lists the last time a specific service was running before it disappeared (died or whatever). The user could be viewing the dashboard with a time window of, say, Last hour, and I still want this timestamp displayed even if the service died 5 days ago.

For the metric, I could perhaps use something like process_start_time_seconds or up, but I guess any process metric would do as long as I get the timestamp. For example, here the service last existed around 11/23. I want the table to have a Last Seen column that displays that timestamp. There could be other rows in the table for other services.

[Image: screenshot showing the metric last present around 11/23]


Solution

  • This is possible with a combination of the max_over_time() and timestamp() functions, though it can pull quite a lot of historical data, so be careful.

    max_over_time( 
      timestamp(
        my_metric{my_label="somevalue"}
      )[30d:5m] # subquery: evaluate every 5 minutes over the last 30 days
    )
    

    How it works:

    1. timestamp() returns the timestamp of each sample (in seconds since the Unix epoch) instead of its value.
    2. The [30d:5m] notation (called a subquery) evaluates the preceding expression every 5 minutes over the last 30 days and turns the results into a range vector.
    3. max_over_time() takes the maximum value in that range vector; since the values are timestamps, the maximum is the "last seen" time.
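
To see why the maximum timestamp works as "last seen", the three steps can be mimicked in plain Python. This is only an illustrative model, not how Prometheus is implemented: the sample timestamps, the helper names `timestamp_at` and `last_seen`, and the simplified lookback handling are assumptions made for the example, and staleness markers are ignored.

```python
from typing import List, Optional

# Hypothetical scrape timestamps (in seconds) of one series; it stops at t=900.
samples = [0, 300, 600, 900]

# Assumed lookback window of 5 minutes (Prometheus's default lookback delta).
LOOKBACK = 300


def timestamp_at(eval_time: int, sample_times: List[int]) -> Optional[int]:
    """Model of timestamp(metric): the time of the newest sample
    that is still visible at eval_time, or None if there is none."""
    visible = [t for t in sample_times if eval_time - LOOKBACK < t <= eval_time]
    return max(visible) if visible else None


def last_seen(now: int, range_s: int, step_s: int,
              sample_times: List[int]) -> Optional[int]:
    """Model of max_over_time(timestamp(metric)[range:step]) evaluated at `now`."""
    start = now - range_s
    # Subquery steps are aligned to multiples of the step (ceiling of start).
    first = -(-start // step_s) * step_s
    points = [timestamp_at(t, sample_times) for t in range(first, now + 1, step_s)]
    seen = [p for p in points if p is not None]
    return max(seen) if seen else None


# Range covers the series: the last sample time is reported.
print(last_seen(now=3600, range_s=3600, step_s=300, sample_times=samples))
# Range too short to reach the series: nothing is returned.
print(last_seen(now=3600, range_s=600, step_s=300, sample_times=samples))
```

In this model the first call reports the time of the final sample, while the second call returns nothing because the range no longer reaches back to when the series existed.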

    How long this keeps working after a metric disappears (if at all) is determined by the subquery range. If the metric disappeared more than 30d ago, you won't see it. If the metric was short-lived and you set a step larger than 5m, you might also miss it. A step of 5m or less should catch all metrics, but over a large range it can pull a lot of data, so you might want to tune both values.
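
For a rough feel of the cost side of that trade-off, the number of inner evaluations per series is approximately the range divided by the step. The sketch below is back-of-the-envelope arithmetic only; the helper name `subquery_points` is made up for illustration, and the real cost also depends on how many series match the selector.

```python
DAY = 24 * 60 * 60  # seconds in a day


def subquery_points(range_seconds: int, step_seconds: int) -> int:
    """Approximate number of inner evaluations per series for [range:step]."""
    return range_seconds // step_seconds


# Compare a few candidate steps over the same 30-day range.
for step_minutes in (1, 5, 60):
    points = subquery_points(30 * DAY, step_minutes * 60)
    print(f"[30d:{step_minutes}m] -> about {points} evaluations per series")
```

With the 30d range from the answer, a 5m step means roughly 8640 evaluations for every matching series, so narrowing the label selector helps as much as adjusting the range or step.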