Tags: yaml, kubernetes-helm, slack, kube-prometheus-stack

Problems with Prometheus Alertmanager sending Slack notifications


I have Prometheus (et al) deployed on my K8s cluster via the kube-prometheus-stack Helm chart. I am trying to get the Prometheus Alertmanager to send notifications to Slack. Here is the alertmanager block from my chart values.yaml:

alertmanager:
  enabled: true
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - alerts.<hidden>.com
    paths:
      - /
    pathType: ImplementationSpecific
  config:
    global:
      slack_api_url: 'http://<hidden>.slack.com'
    route:
      receiver: "slack-default"
      group_by:
        - alertname
        - cluster
        - service
      group_wait: 30s
      group_interval: 1m # 5m
      repeat_interval: 1m # 3h
      routes:
        - receiver: "slack-warn-critical"
          matchers:
            - severity =~ "warning|critical"
          continue: true
    receivers:
      - name: "null"
      - name: "slack-default"
        slack_configs:
          - channel: "alerts-test"
            title: 'Title - Default'
            text: 'Sample text'
      - name: "slack-warn-critical"
        slack_configs:
          - channel: "alerts-test"
            title: 'Title - Warn/Critical'
            text: 'Sample text'

As far as I'm able to tell, this YAML is valid. When I watch the logs of the Prometheus Operator pod, no errors are thrown. I can also see my settings reflected on the Alertmanager status page:

global:
  resolve_timeout: 5m
  http_config:
    follow_redirects: true
  smtp_hello: localhost
  smtp_require_tls: true
  slack_api_url: <secret>
  pagerduty_url: https://events.pagerduty.com/v2/enqueue
  opsgenie_api_url: https://api.opsgenie.com/
  wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
  victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
  telegram_api_url: https://api.telegram.org
route:
  receiver: slack-default
  group_by:
  - alertname
  - cluster
  - service
  continue: false
  routes:
  - receiver: slack-warn-critical
    matchers:
    - severity=~"warning|critical"
    continue: true
  group_wait: 30s
  group_interval: 1m
  repeat_interval: 1m
inhibit_rules:
- source_matchers:
  - severity="critical"
  target_matchers:
  - severity=~"warning|info"
  equal:
  - namespace
  - alertname
- source_matchers:
  - severity="warning"
  target_matchers:
  - severity="info"
  equal:
  - namespace
  - alertname
- source_matchers:
  - alertname="InfoInhibitor"
  target_matchers:
  - severity="info"
  equal:
  - namespace
receivers:
- name: "null"
- name: slack-default
  slack_configs:
  - send_resolved: false
    http_config:
      follow_redirects: true
    api_url: <secret>
    channel: alerts-test
    username: '{{ template "slack.default.username" . }}'
    color: '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}'
    title: Title - Default
    title_link: '{{ template "slack.default.titlelink" . }}'
    pretext: '{{ template "slack.default.pretext" . }}'
    text: Sample text
    short_fields: false
    footer: '{{ template "slack.default.footer" . }}'
    fallback: '{{ template "slack.default.fallback" . }}'
    callback_id: '{{ template "slack.default.callbackid" . }}'
    icon_emoji: '{{ template "slack.default.iconemoji" . }}'
    icon_url: '{{ template "slack.default.iconurl" . }}'
    link_names: false
- name: slack-warn-critical
  slack_configs:
  - send_resolved: false
    http_config:
      follow_redirects: true
    api_url: <secret>
    channel: alerts-test
    username: '{{ template "slack.default.username" . }}'
    color: '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}'
    title: Title - Warn/Critical
    title_link: '{{ template "slack.default.titlelink" . }}'
    pretext: '{{ template "slack.default.pretext" . }}'
    text: Sample text
    short_fields: false
    footer: '{{ template "slack.default.footer" . }}'
    fallback: '{{ template "slack.default.fallback" . }}'
    callback_id: '{{ template "slack.default.callbackid" . }}'
    icon_emoji: '{{ template "slack.default.iconemoji" . }}'
    icon_url: '{{ template "slack.default.iconurl" . }}'
    link_names: false
templates:
- /etc/alertmanager/config/*.tmpl

From what I can see, there are a number of alerts firing on the cluster, but nothing shows up in Slack. With repeat_interval set to 1m, notifications should be going out every minute (set that low for testing).

Is there something I have configured that is preventing notifications from being sent? Are the inhibit rules preventing them? Am I missing something critical? Is there perhaps something that needs to be set up on the Slack end?
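In case it matters, here is a stripped-down variant of the config block I could fall back on to isolate the routing: everything goes to a single default Slack receiver with no sub-routes. This is only a sketch (untested), reusing the same placeholder URL and channel as above:

alertmanager:
  config:
    global:
      slack_api_url: 'http://<hidden>.slack.com'  # unchanged from above
    route:
      receiver: "slack-default"   # single receiver, no sub-routes, so nothing can be mis-routed
      group_wait: 10s
      group_interval: 1m
      repeat_interval: 1m         # short interval for testing only
    receivers:
      - name: "slack-default"
        slack_configs:
          - channel: "alerts-test"
            title: 'Routing test'
            text: 'If this arrives, routing is not the problem'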

Not sure if this is of any help, but here's the log from the Alertmanager pod:

ts=2022-11-17T18:30:04.471Z caller=main.go:231 level=info msg="Starting Alertmanager" version="(version=0.24.0, branch=HEAD, revision=f484b17fa3c583ed1b2c8bbcec20ba1db2aa5f11)"
ts=2022-11-17T18:30:04.471Z caller=main.go:232 level=info build_context="(go=go1.17.8, user=root@265f14f5c6fc, date=20220325-09:31:33)"
ts=2022-11-17T18:30:04.497Z caller=coordinator.go:113 level=info component=configuration msg="Loading configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
ts=2022-11-17T18:30:04.498Z caller=coordinator.go:126 level=info component=configuration msg="Completed loading of configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
ts=2022-11-17T18:30:04.500Z caller=main.go:431 level=info component=configuration msg="skipping creation of receiver not referenced by any route" receiver="null"
ts=2022-11-17T18:30:04.501Z caller=main.go:535 level=info msg=Listening address=:9093
ts=2022-11-17T18:30:04.501Z caller=tls_config.go:231 level=info msg="TLS is disabled." http2=false
ts=2022-11-17T18:30:08.816Z caller=coordinator.go:113 level=info component=configuration msg="Loading configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
ts=2022-11-17T18:30:08.816Z caller=coordinator.go:126 level=info component=configuration msg="Completed loading of configuration file" file=/etc/alertmanager/config_out/alertmanager.env.yaml
ts=2022-11-17T18:30:08.818Z caller=main.go:431 level=info component=configuration msg="skipping creation of receiver not referenced by any route" receiver="null"

UPDATE:

After looking at things a little more closely, I obtained and tried the Slack "webhook URL" as the slack_api_url. Alertmanager is now sending notifications and they are being received in Slack. I had mistakenly used the Slack workspace URL (e.g. http://example.slack.com).
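For anyone hitting the same problem: the value needs to be a Slack incoming-webhook URL (of the form https://hooks.slack.com/services/...), not the workspace URL. A minimal sketch of the corrected block, with the webhook path redacted:

alertmanager:
  config:
    global:
      # An incoming-webhook URL issued by Slack, not the workspace URL
      # (the actual /services/... path is redacted here)
      slack_api_url: 'https://hooks.slack.com/services/<hidden>'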


Solution

  • The issue was that I was using the Slack workspace URL instead of a Slack incoming-webhook URL. I initially didn't know what that was, but after asking our IT team I obtained one and used it instead. Notifications from Alertmanager are now being received on the designated Slack channel.