kubernetes prometheus high-availability prometheus-operator kube-prometheus-stack

How does kube-prometheus-stack implement HA with replicas?

If we configure kube-prometheus-stack with replicas = 2, two Prometheus instances are started. However, no Thanos Query is deployed for aggregation. How, then, does kube-prometheus-stack aggregate and deduplicate the data from these two instances?

Solution

There is no aggregation or state sharing. What you get is basically two independent Prometheus instances that scrape the same targets. See https://prometheus-operator.dev/docs/operator/high-availability/

The Prometheus instances scrape the same targets and evaluate the same rules, hence they will have the same data in memory and on disk, with a slight twist that given their different external label, the scrapes and evaluations won’t happen at exactly the same time. As a consequence, query requests executed against each Prometheus instance may return slightly different results.
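To make the "slightly different results" point concrete, here is a minimal sketch of two replicas scraping the same steadily increasing counter at different offsets. The function name, the 30-second interval, and the counter rate are all made up for illustration; this is a toy model, not how Prometheus itself schedules scrapes.

```python
def last_scraped_value(query_time: float,
                       scrape_offset: float,
                       scrape_interval: float = 30.0,
                       rate_per_second: float = 2.0) -> float:
    """Return the counter value a replica reports at query_time.

    The replica only knows the value from its most recent scrape,
    and each replica starts scraping at its own offset (hypothetical model).
    """
    if query_time < scrape_offset:
        raise ValueError("no scrape has happened yet")
    # Number of full intervals completed since this replica's first scrape.
    scrapes_done = int((query_time - scrape_offset) // scrape_interval)
    last_scrape_time = scrape_offset + scrapes_done * scrape_interval
    # The counter grows linearly, so its value at the scrape is rate * time.
    return rate_per_second * last_scrape_time


# Same query instant, same target, different scrape offsets.
replica_a = last_scraped_value(query_time=100.0, scrape_offset=0.0)
replica_b = last_scraped_value(query_time=100.0, scrape_offset=7.0)
print(replica_a, replica_b)  # 180.0 194.0
```

Both replicas hold valid data, but a dashboard that alternates between them on refresh would see the value jump back and forth.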

Alert deduplication happens on the Alertmanager side. So running two or more Prometheus instances provides only high availability, not scaling.

Running multiple Prometheus instances avoids having a single point of failure but it doesn’t help scaling out Prometheus in case a single Prometheus instance can’t handle all the targets and rules.
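As a rough illustration of the alert deduplication mentioned above: Alertmanager identifies an alert by its label set, so identical alerts sent by both replicas collapse into one. The sketch below is a simplified model, not Alertmanager's actual implementation. It assumes the replica-identifying label is called `prometheus_replica` and is removed before alerts are compared; the helper names are hypothetical.

```python
from typing import Dict, Iterable, List

# Assumed name of the label that distinguishes the replicas.
REPLICA_LABEL = "prometheus_replica"


def strip_replica_label(labels: Dict[str, str]) -> Dict[str, str]:
    """Drop the replica label so alerts from both replicas look identical."""
    return {k: v for k, v in labels.items() if k != REPLICA_LABEL}


def deduplicate_alerts(alerts: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep one alert per distinct label set (simplified model)."""
    unique = {}
    for labels in alerts:
        stripped = strip_replica_label(labels)
        # A frozenset of label pairs serves as an order-independent identity.
        key = frozenset(stripped.items())
        unique.setdefault(key, stripped)
    return list(unique.values())


incoming = [
    {"alertname": "HighLatency", "job": "api", "prometheus_replica": "prometheus-0"},
    {"alertname": "HighLatency", "job": "api", "prometheus_replica": "prometheus-1"},
    {"alertname": "DiskFull", "job": "node", "prometheus_replica": "prometheus-0"},
]
result = deduplicate_alerts(incoming)
print(len(result))  # 2
```

If the replica label were kept on the alerts, the two `HighLatency` alerts would have different label sets and would be treated as separate alerts.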

As for Grafana or any other visualization tool, you should configure sticky sessions so that each client queries a single instance at a time.

For dashboarding, sticky sessions (using sessionAffinity on the Kubernetes Service) should be used, to get consistent graphs when refreshing or you can use something like Thanos Querier to federate the data.