kubernetes, prometheus-alertmanager, opsgenie

Alertmanager does not load webhook_config


I want to create a new receiver and route for Alertmanager to send heartbeats to OpsGenie.

I tried to achieve this by defining opsgenie_config, but I wasn't able to send pings to the heartbeat in OpsGenie (I am able to send alerts to OpsGenie with the same API key).

Another method I found was to use webhook_config (as suggested in #444). My manifest looks like this:

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: opsgenie-webhook
  labels:
    managedBy: team-sre
spec:
  receivers:
  - name: heartbeat
    webhookConfigs:
    - httpConfig:
        basicAuth:
          password:
            name: opsgenie-api-key
            key: address
      url: https://api.opsgenie.com/v2/heartbeats/sre-test-cluster/ping
  route:
    groupWait: 0s
    repeatInterval: 1m
    groupInterval: 1m
    matchers:
    - name: alertname
      value: Watchdog
    receiver: heartbeat

When I apply the manifest, the receiver and route it describes are not loaded into Alertmanager. The logs show no error, but also no message indicating that the sidecar tried to load the new AlertmanagerConfig.

Has anyone experienced the same problem and knows how to fix it?


Solution

  • I found the solution in GitHub issue #3970. For basicAuth to be accepted, both username and password must be provided. A handy workaround is to set the username to a single colon (:), which is Og== in base64. The manifests should be defined as follows:

    apiVersion: monitoring.coreos.com/v1alpha1
    kind: AlertmanagerConfig
    metadata:
      labels:
        managedBy: team-sre
      name: alertmanager-opsgenie-config
      namespace: monitoring
    spec:
      receivers:
      - name: deadmansswitch
        webhookConfigs:
          # URL of the specific heartbeat; replace <heartbeat-name> with the heartbeat name
          - url: 'https://api.opsgenie.com/v2/heartbeats/<heartbeat-name>/ping'
            sendResolved: true
            httpConfig:
              basicAuth:
                # reference to the secret containing the login credentials
                password:
                  key: apiKey
                  name: opsgenie
                username:
                  key: username
                  name: opsgenie
      route:
        groupBy:
        - job
        groupInterval: 10s
        groupWait: 0s
        repeatInterval: 10s
        matchers:
          - name: alertname
            value: Watchdog
          - name: namespace
            value: monitoring
        receiver: deadmansswitch
    
    ---
    
    apiVersion: v1
    kind: Secret
    metadata:
      namespace: monitoring
      name: opsgenie
    type: Opaque
    data:
      # API key encoded in base64
      apiKey: YOUR_PASSWORD
      # ':' in base 64 - fix suggested in https://github.com/prometheus-operator/prometheus-operator/issues/3970#issuecomment-888893008
      username: Og==
    

    After the manifests are applied and an alert matching the route is firing, Opsgenie receives the heartbeat ping.
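The Secret's data values must be base64-encoded, so it may help to see how they can be produced. The following is a minimal sketch rather than a required procedure: the variable names are illustrative, and the API key shown is a hypothetical placeholder, not a real key. It uses printf instead of echo because echo appends a newline, which would change the encoded username from Og== to Ogo=.

```shell
# Encode the username workaround (a single colon) without a trailing newline.
username_b64=$(printf '%s' ':' | base64)
echo "username: $username_b64"

# Hypothetical placeholder key; substitute your own Opsgenie API key.
api_key='00000000-0000-0000-0000-000000000000'
api_key_b64=$(printf '%s' "$api_key" | base64)
echo "apiKey: $api_key_b64"

# Sanity check: decoding should give back exactly the original key.
decoded=$(printf '%s' "$api_key_b64" | base64 --decode)
echo "decoded apiKey matches: $([ "$decoded" = "$api_key" ] && echo yes || echo no)"
```

The two encoded values can then be pasted into the username and apiKey fields of the Secret above.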