Tags: google-cloud-platform, alert, message, stackdriver, google-cloud-stackdriver

Can someone provide definitions of the keys in a Google Stackdriver JSON alert message (sent via webhook to alerta)?


We need a reliable way to get the GCP project ID of a monitored resource (not the Stackdriver project ID) out of a Stackdriver alert message (sent to the alerta webhook).

One possible key I found might be:

"rawData": {
    "incident": {
      "resource_name": "squad-spielwiese test-ezander"
    }
}

I want to be completely sure about the format of the resource_name key (it looks like the resource's project name, a blank, then the resource name) and whether there is a better option for my case, so I tried to find documentation for the keys in a Stackdriver alert message, without success.

I can't believe nobody has documented exactly what the individual keys in a Stackdriver webhook alert message stand for. Could someone please help me find more detailed information?

Here's a full example in alerta:

{
  "attributes": {
    "incidentId": "0.lbon2fb3ylfn",
    "ip": "66.249.87.155",
    "isOutOfHours": false,
    "moreInfo": "<a href=\"https://app.google.stackdriver.com/incidents/0.lbon2fb3ylfn?project=squad-spielwiese\" target=\"_blank\">Stackdriver Console</a>",
    "resourceId": "",
    "runBookUrl": "http://www.example.com/wiki/RunBook/GCE-VM-Instance---CPU-utilization-for-4427379606643423981"
  },
  "correlate": [],
  "createTime": "2019-08-29T12:25:54.000Z",
  "customer": null,
  "duplicateCount": 0,
  "environment": "Production",
  "event": "GCE VM Instance - CPU utilization for 4427379606643423981",
  "group": "Cloud",
  "history": [
    {
      "event": "GCE VM Instance - CPU utilization for 4427379606643423981",
      "href": "https://127.0.0.1:9090/alerta-dev/api/alert/c5ae3e1c-53ad-42a3-b053-3321b3e96123",
      "id": "c5ae3e1c-53ad-42a3-b053-3321b3e96123",
      "severity": "ok",
      "status": null,
      "text": "OK: CPU utilization for squad-spielwiese test-ezander with metric labels {instance_name=test-ezander} returned to normal with a value of 0.001.",
      "type": "severity",
      "updateTime": "2019-08-29T12:25:50.000Z",
      "value": "--"
    },
    {
      "event": "GCE VM Instance - CPU utilization for 4427379606643423981",
      "href": "https://127.0.0.1:9090/alerta-dev/api/alert/c5ae3e1c-53ad-42a3-b053-3321b3e96123",
      "id": "c5ae3e1c-53ad-42a3-b053-3321b3e96123",
      "severity": null,
      "status": "closed",
      "text": "new alert status change",
      "type": "status",
      "updateTime": "2019-08-29T12:25:50.000Z",
      "value": null
    },
    {
      "event": "GCE VM Instance - CPU utilization for 4427379606643423981",
      "href": "https://127.0.0.1:9090/alerta-dev/api/alert/312fa29e-7c76-4e24-85f7-f3ecd09cc2f6",
      "id": "312fa29e-7c76-4e24-85f7-f3ecd09cc2f6",
      "severity": "critical",
      "status": null,
      "text": "CRITICAL: CPU utilization for squad-spielwiese test-ezander with metric labels {instance_name=test-ezander} is above the threshold of 0.2 with a value of 1.000.",
      "type": "severity",
      "updateTime": "2019-08-29T12:25:54.000Z",
      "value": "--"
    },
    {
      "event": "GCE VM Instance - CPU utilization for 4427379606643423981",
      "href": "https://127.0.0.1:9090/alerta-dev/api/alert/312fa29e-7c76-4e24-85f7-f3ecd09cc2f6",
      "id": "312fa29e-7c76-4e24-85f7-f3ecd09cc2f6",
      "severity": null,
      "status": "open",
      "text": "correlated alert status change",
      "type": "status",
      "updateTime": "2019-08-29T12:25:54.000Z",
      "value": null
    }
  ],
  "href": "https://127.0.0.1:9090/alerta-dev/api/alert/c5ae3e1c-53ad-42a3-b053-3321b3e96123",
  "id": "c5ae3e1c-53ad-42a3-b053-3321b3e96123",
  "lastReceiveId": "312fa29e-7c76-4e24-85f7-f3ecd09cc2f6",
  "lastReceiveTime": "2019-08-29T12:29:50.947Z",
  "origin": "Stackdriver",
  "previousSeverity": "ok",
  "rawData": {
    "incident": {
      "condition_name": "GCE VM Instance - CPU utilization for 4427379606643423981",
      "documentation": {
        "content": "BLA ",
        "mime_type": "text/markdown"
      },
      "ended_at": null,
      "incident_id": "0.lbon2fb3ylfn",
      "policy_name": "ez-policy",
      "resource": {
        "labels": {
          "instance_id": "3269982608675192580",
          "zone": "europe-west3-c"
        },
        "type": "gce_instance"
      },
      "resource_id": "",
      "resource_name": "squad-spielwiese test-ezander",
      "started_at": 1567081554,
      "state": "open",
      "summary": "CPU utilization for squad-spielwiese test-ezander with metric labels {instance_name=test-ezander} is above the threshold of 0.2 with a value of 1.000.",
      "url": "https://app.google.stackdriver.com/incidents/0.lbon2fb3ylfn?project=squad-spielwiese"
    },
    "version": "1.2"
  },
  "receiveTime": "2019-08-29T12:29:50.947Z",
  "repeat": false,
  "resource": "squad-spielwiese test-ezander",
  "service": [
    "ez-policy"
  ],
  "severity": "critical",
  "status": "open",
  "tags": [],
  "text": "CRITICAL: CPU utilization for squad-spielwiese test-ezander with metric labels {instance_name=test-ezander} is above the threshold of 0.2 with a value of 1.000.",
  "timeout": 86400,
  "trendIndication": "moreSevere",
  "type": "stackdriverAlert",
  "value": "--"
}

Solution

  • I could not find official documentation describing each key in the Stackdriver webhook payload, but in your example you can identify both the VM instance and the project ID by reading the values carefully.

    For example, in "resource": "squad-spielwiese test-ezander", "squad-spielwiese" is your project ID and "test-ezander" is your VM instance name. You can also look at the url key, which carries the project ID in its project query parameter: "url": "https://app.google.stackdriver.com/incidents/0.lbon2fb3ylfn?project=squad-spielwiese"

    Stackdriver has no built-in way to retrieve a key value from the project ID alone, for "security" reasons, but you can work around this by calling this API method from your code.
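
The two lookups described in this answer (splitting resource_name on the blank, and reading the project query parameter from url) could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names are hypothetical, the "<project id> <resource name>" layout of resource_name is inferred from the single example above rather than from any documentation, and the source does not confirm which of the two values reflects the monitored resource's project rather than the Stackdriver project, so the sketch keeps them separate for comparison.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def project_from_url(incident: dict) -> Optional[str]:
    """Return the `project` query parameter of the incident URL, if present."""
    url = incident.get("url") or ""
    values = parse_qs(urlparse(url).query).get("project")
    return values[0] if values else None


def project_from_resource_name(incident: dict) -> Optional[str]:
    """Return the part of resource_name before the first blank.

    Assumption: resource_name looks like "<project id> <resource name>",
    as in the example payload. This layout is not documented.
    """
    resource_name = incident.get("resource_name") or ""
    parts = resource_name.split(" ", 1)
    if len(parts) == 2 and parts[0]:
        return parts[0]
    return None


if __name__ == "__main__":
    # Values copied from the rawData.incident object in the question.
    incident = {
        "resource_name": "squad-spielwiese test-ezander",
        "url": "https://app.google.stackdriver.com/incidents/"
               "0.lbon2fb3ylfn?project=squad-spielwiese",
    }
    print(project_from_url(incident))
    print(project_from_resource_name(incident))
```

Both functions return None when the key is missing or does not have the assumed shape, so the caller can decide how to handle a payload that deviates from the example.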