
How do I query Prometheus for the time series that was updated last?


I have 100 instances of a service that use one database. I want them to export a Prometheus metric with the number of rows in a specific table of this database.

To avoid hitting the database with 100 queries at the same time, I periodically elect one of the instances to do the measurement and set a Prometheus gauge to the number obtained. Different instances may be elected at different times. Thus, each of the 100 instances may have its own value of the gauge, but only one of them is “current” at any given time.

What is the best way to pick only this “current” value from the 100 gauges?

My first idea was to export two gauges from each instance: the actual measurement and its timestamp. Then perhaps I could take max(timestamp) and combine it with the actual metric using the and operator. But I can’t figure out how to do this in PromQL, because max drops the instance label, which is the label I would need to match on with and on(instance).

My second idea was to reset the gauge to −1 (or some other sentinel value) some time after the measurement. But this looks brittle: unless everything is tightly synchronized, the “current” gauge could be reset before or after the “new” one is set, causing gaps or overlaps. The same concerns apply to explicitly deleting the metric and to exporting it with an explicit timestamp (to induce staleness).


Solution

  • I figured out the first idea (not tested yet):

    avg(my_rows_count and on(instance) topk(1, my_rows_count_timestamp))
    

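    avg could just as well be max or min; it only serves to drop the instance label from the final result. topk, unlike max, keeps the labels of the series it selects, which is what makes the and on(instance) match possible.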
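
To make the three steps of the query above easier to follow, here is a minimal Python sketch that mimics them on in-memory data. It is only a model of the semantics, not how Prometheus evaluates the expression. The instance names and sample values are made up for illustration, the helper functions are hypothetical, and the model assumes that instance is the only label on both metrics.

```python
from statistics import mean

# Hypothetical scrape results, keyed by the `instance` label.
my_rows_count = {"svc-1": 1200.0, "svc-2": 1350.0, "svc-3": 1275.0}
my_rows_count_timestamp = {
    "svc-1": 1700000000.0,
    "svc-2": 1700000600.0,  # most recent measurement
    "svc-3": 1700000300.0,
}


def topk1(series):
    """Model of topk(1, ...): keep the one series with the largest value.

    Unlike max(), the surviving series keeps its instance label.
    """
    instance = max(series, key=series.get)
    return {instance: series[instance]}


def and_on_instance(left, right):
    """Model of `left and on(instance) right`.

    Keeps the samples of `left` whose instance also appears in `right`;
    the values of `right` are ignored.
    """
    return {inst: value for inst, value in left.items() if inst in right}


def current_rows_count(counts, timestamps):
    """Model of avg(counts and on(instance) topk(1, timestamps))."""
    newest = topk1(timestamps)
    matched = and_on_instance(counts, newest)
    # Averaging a single sample returns it unchanged and drops the label.
    return mean(matched.values())


print(current_rows_count(my_rows_count, my_rows_count_timestamp))
```

With this sample data the newest timestamp belongs to svc-2, so the result is that instance's row count. If two instances reported exactly the same timestamp, this model would pick one of them, so the final aggregation would still see a single sample.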