I've installed Prometheus into my Kubernetes cluster using Helm, as follows:
helm list
NAME        NAMESPACE   REVISION  UPDATED                               STATUS    CHART              APP VERSION
prometheus  prometheus  9         2021-09-07 08:54:54.262013 +0100 +01  deployed  prometheus-14.6.0  2.26.0
I am trying to apply external_labels in values.yaml to identify the time series sent to Alertmanager. I've used the Prometheus docs to get what I believe to be the correct config, as below:
alertmanagerFiles:
  alertmanager.yml:
    global:
      external_labels:
        environment: 'perf'
My installation goes OK:
helm upgrade --install prometheus .
However, my prometheus-server pod is crashing due to the following error:
level=error ts=2021-09-06T18:49:25.059Z caller=coordinator.go:124 component=configuration msg="Loading configuration file failed" file=/etc/config/alertmanager.yml err="yaml: unmarshal errors:\n line 2: field external_labels not found in type config.plain"
Many of the answers here point to indentation issues, but I can't see what I am doing wrong. From the Prometheus docs:
global:
  # The labels to add to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    [ <labelname>: <labelvalue> ... ]
I have been scratching my head on this for a week or two - would appreciate a second pair of more experienced eyes, thank you! 🙏
I have managed to get this working. Firstly, I was putting the configuration in completely the wrong place. I figured this out when looking at the GitHub page for Prometheus Alertmanager: I could not see the field defined in the 'good config test', so it had to be configured elsewhere.
Indeed, the Prometheus configuration page says so: external_labels belongs to the Prometheus server's own config, not Alertmanager's. So I added a section under ## Prometheus server ConfigMap entries:
serverFiles:
  prometheus.yml:
    global:
      external_labels:
        environment: perf
This did not work either; the pod was still crashing. It turns out this should be set in the part of values.yaml that configures the prometheus-server container itself, under the top-level server key, where the default global values are also defined. So I added external_labels to that section:
server:
  global:
    scrape_interval: 1m
    scrape_timeout: 10s
    evaluation_interval: 1m
    external_labels:
      environment: perf
After running helm upgrade --install prometheus . again, I can now see the correct config in the output of kubectl get cm prometheus-server -o yaml, and my PagerDuty alerts now show the environment name in the Summary.
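For anyone checking their own ConfigMap, this is roughly what I would expect to find in its prometheus.yml entry. It is a sketch, not copied output: it assumes the chart copies server.global as-is into the global block of the generated file, and it leaves out all other sections.

```yaml
# Sketch of the global block in the rendered prometheus.yml
# (assumption: the chart writes server.global here unchanged).
global:
  evaluation_interval: 1m
  external_labels:
    environment: perf   # the label that should end up on outgoing alerts
  scrape_interval: 1m
  scrape_timeout: 10s
```

If external_labels is missing from this block, the value was most likely set under a different key in values.yaml.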
A little side tip: to test alerts without having to kill pods, create OOMs, etc., create an alert whose expr: constantly fires (e.g. kube_pod_container_status_restarts_total > 3). I did this by accident, but it proved to be quite useful.
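A deliberate version of that tip could look like the sketch below. It assumes the chart reads alerting rules from serverFiles under the alerting_rules.yml key; the group name, alert name, and severity label are made up for illustration. I have used vector(1) as the expression because it always returns a sample, so the alert is active for as long as the rule is loaded.

```yaml
serverFiles:
  alerting_rules.yml:            # assumption: the chart's key for alerting rules
    groups:
      - name: test-alerts        # hypothetical group name
        rules:
          - alert: AlwaysFiring  # hypothetical alert name
            expr: vector(1)      # always returns one sample, so the alert never resolves
            labels:
              severity: test     # hypothetical label; route it to a harmless receiver
            annotations:
              summary: Test alert to check routing and external labels
```

Remove the rule once you have confirmed the alert arrives with the expected labels, otherwise it will keep notifying.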