
Daily and monthly availability percentages via Prometheus? (or monthly uptime percentage)


We need availability figures for one or more Docker services.

We use Prometheus to collect metrics in a Docker Swarm environment. Works fine!

How can we get daily and/or monthly availability percentages from these metrics?


Solution

  • The suggestion in the other answer may be a good step forward. Thank you @vinodk.

    After asking many developers about this, I finally found a good solution.

    The solution is:

    • Use Prometheus as the data source for the availability measurements. Prometheus is needed anyway for regular monitoring.
    • To keep the monitoring data from being lost when a Docker container or node crashes, map the Prometheus data volume to external file storage, e.g. S3.
    • Use a data visualisation tool that performs a SINGLE but very powerful Prometheus API call.

    As an example, this Prometheus API call returns the daily up time (availability) as a percentage over a period of several days. Use UTC (Z) timestamps, and keep the range selector equal to the step (1d) so that each sample averages a full day rather than only its last hour.

    http://prometheus-server/prometheus/api/v1/query_range?query=avg_over_time(up[1d])*100&start=2019-12-15T06:00:00.000Z&end=2019-12-17T16:59:59.000Z&step=1d
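
For anyone who wants to script this instead of using a visualisation tool, here is a minimal Python sketch of the same call. It only builds the URL and parses the JSON that `query_range` returns, so the HTTP request itself is left to you. The base URL is taken from the example above; the function names are made up for illustration, and the sketch uses a `1d` range so that each sample covers a full day. It also assumes your targets carry an `instance` label, which depends on your scrape config.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed base URL, copied from the example call; adjust to your setup.
BASE_URL = "http://prometheus-server/prometheus"


def build_daily_availability_url(start, end, base_url=BASE_URL):
    """Build a query_range URL that yields one availability sample per day."""
    params = {
        "query": "avg_over_time(up[1d]) * 100",  # percentage of successful scrapes per day
        "start": start,  # RFC 3339 timestamp in UTC
        "end": end,
        "step": "1d",
    }
    # urlencode escapes brackets, spaces and '*' so the query survives the URL
    return f"{base_url}/api/v1/query_range?{urlencode(params)}"


def parse_daily_availability(payload):
    """Turn a query_range JSON response into {instance: {date: percent}}."""
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {payload}")
    report = {}
    for series in payload["data"]["result"]:
        # Hypothetical choice: key the report by the "instance" label
        name = series["metric"].get("instance", "unknown")
        days = report.setdefault(name, {})
        for timestamp, value in series["values"]:
            day = datetime.fromtimestamp(float(timestamp), tz=timezone.utc).date()
            days[day.isoformat()] = round(float(value), 2)
    return report
```

Fetch the URL with any HTTP client, decode the body as JSON, and pass the resulting dict to `parse_daily_availability`.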
    

    What about alternatives?

    1) Store the results of Prometheus in a remote database and query it separately. That is a bit of overkill for just retrieving the availability percentages for a month.

    2) You could use the tool 'Telegraf', which has a smaller memory footprint. It can store the data in InfluxDB. I like the Prometheus solution / architecture / standard for monitoring, so this option is not the right one in my case.

    3) We use Spring Boot for our applications. Spring Boot Admin is not applicable because it only looks at the current state and does not persist the results.
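
Coming back to the Prometheus approach: the monthly figure from the question can probably be obtained in a similar way with an instant query (`/api/v1/query`) that looks back over the whole month. The Python sketch below only builds that URL. The base URL is the one from the daily example, the function name is hypothetical, and it assumes Prometheus still holds a full month of data.

```python
import calendar
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed base URL, copied from the daily example; adjust to your setup.
BASE_URL = "http://prometheus-server/prometheus"


def build_monthly_availability_url(year, month, base_url=BASE_URL):
    """Build an instant-query URL for the availability of one calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    # Evaluate at the first instant of the following month (UTC) and look
    # back over exactly the number of days the month has.
    if month == 12:
        end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        end = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    params = {
        "query": f"avg_over_time(up[{days_in_month}d]) * 100",
        "time": end.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return f"{base_url}/api/v1/query?{urlencode(params)}"
```

For example, `build_monthly_availability_url(2019, 12)` asks for the average of `up` over the 31 days before 2020-01-01T00:00:00Z. Note that Prometheus keeps only 15 days of data by default, so the retention time has to be raised for a monthly query to see the whole month.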