
Alertmanager configuration does not get updated when values.yml has changed


I'm trying to configure Alertmanager with Mattermost. For the whole monitoring and alerting system we're using the Helm rancher-monitoring charts. When using the default values.yml file from this version of the chart, everything is deployed successfully. After enabling the Alertmanager in values.yml and editing its configuration, the Alertmanager pod also starts successfully. But the Alertmanager configuration still contains the default values shown below:

    global:
      resolve_timeout: 5m
      http_config: {}
      smtp_hello: localhost
      smtp_require_tls: true
      pagerduty_url: https://events.pagerduty.com/v2/enqueue
      opsgenie_api_url: https://api.opsgenie.com/
      wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
      victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
    route:
      receiver: "null"
    receivers:
    - name: "null"
    templates: []

But I want this config:

    global:
      resolve_timeout: 5m
    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'mattermost-notifications'
    receivers:
    - name: 'mattermost-notifications'
      slack_configs:
      - send_resolved: true
        text: '{{ template "slack.rancher.text" . }}'
        api_url: https://*******/plugins/alertmanager/api/webhook?token=*********
    templates:
    - /etc/alertmanager/config/*.tmpl
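
For reference, in charts based on kube-prometheus-stack this block normally sits under the alertmanager.config key of the values file; the nesting below is a sketch of the assumed layout, not copied from the chart:

    alertmanager:
      enabled: true
      config:
        global:
          resolve_timeout: 5m
        route:
          receiver: 'mattermost-notifications'
        # ... receivers and templates as shown above ...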

Does anybody have any ideas?


Solution

  • The Helm chart template for the Alertmanager secret checks whether the secret already exists; if it does, the secret is not overwritten.

    {{- if (not (lookup "v1" "Secret" (include "kube-prometheus-stack.namespace" .) $secretName)) }}
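
    In context, this guard wraps the entire Secret manifest, so on upgrade the whole object is skipped whenever the lookup finds an existing secret. A simplified sketch (helper names and the data field are assumptions, not copied from the chart):

        {{- if (not (lookup "v1" "Secret" (include "kube-prometheus-stack.namespace" .) $secretName)) }}
        apiVersion: v1
        kind: Secret
        metadata:
          name: {{ $secretName }}
          namespace: {{ include "kube-prometheus-stack.namespace" . }}
        data:
          alertmanager.yaml: {{ "<rendered config>" | b64enc }}
        {{- end }}

    Note that Helm's lookup function returns an empty result during helm template and --dry-run, so the condition only takes effect against a live cluster.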
    

    So for now you'll have to delete the secret that was automatically created by the Helm chart. The secret is called alertmanager-monitoring-rancher-monitor-alertmanager and contains the alertmanager.yaml data. After deleting this secret, the new configuration is applied successfully.
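
    The steps would look roughly like the following; the secret name depends on your release name, and <monitoring-namespace> is a placeholder for wherever rancher-monitoring is installed:

        # Inspect the current (default) config stored in the secret
        kubectl get secret alertmanager-monitoring-rancher-monitor-alertmanager \
          -n <monitoring-namespace> -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d

        # Delete the secret so the chart's lookup guard no longer finds it
        kubectl delete secret alertmanager-monitoring-rancher-monitor-alertmanager \
          -n <monitoring-namespace>

        # Re-apply the chart so the secret is recreated from your values.yml
        helm upgrade rancher-monitoring rancher/rancher-monitoring \
          -n <monitoring-namespace> -f values.yml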

    @domruf has opened an issue on their GitHub repository, so hopefully this issue will be fixed soon.