I want to build a dashboard that displays the monthly uptime percentage of an Elastic Beanstalk service at my company.
So I used boto3's get_metric_data to retrieve the EnvironmentHealth CloudWatch metric and calculate the percentage of time my service was not in the severe state:
    from datetime import datetime

    import boto3

    SEVERE = 25  # EnvironmentHealth value reported for the "Severe" status

    client = boto3.client('cloudwatch')

    metric_data_queries = [
        {
            'Id': 'healthStatus',
            'MetricStat': {
                'Metric': {
                    'Namespace': 'AWS/ElasticBeanstalk',
                    'MetricName': 'EnvironmentHealth',
                    'Dimensions': [
                        {
                            'Name': 'EnvironmentName',
                            'Value': 'ServiceA'
                        }
                    ]
                },
                'Period': 300,
                'Stat': 'Maximum'
            },
            'Label': 'EnvironmentHealth',
            'ReturnData': True
        }
    ]

    response = client.get_metric_data(
        MetricDataQueries=metric_data_queries,
        StartTime=datetime(2019, 9, 1),
        # EndTime is exclusive, so datetime(2019, 9, 30) would omit the last day.
        EndTime=datetime(2019, 10, 1),
        ScanBy='TimestampAscending'
    )

    # One value per 5-minute period that has data.
    health_data = response['MetricDataResults'][0]['Values']
    total_times = len(health_data)
    severe_times = health_data.count(SEVERE)

    print(f'total_times: {total_times}')
    print(f'severe_times: {severe_times}')
    print(f'healthy percent: {1 - (severe_times/total_times)}')
Now I'm wondering how to show this percentage on a CloudWatch dashboard. I mean I want to show something like the following:
Does anyone know how to upload the healthy percent I've calculated to a CloudWatch dashboard?
Or is there another tool that is more appropriate for displaying the uptime of my service?
You can do math with CloudWatch metrics: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math.html
You can create a metric math expression from the metrics you have in metric_data_queries and show the result on a graph. Metric math also works with the GetMetricData API, so you could move your calculation into the MetricDataQueries and get the number you need directly from CloudWatch.
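As a rough sketch of what moving the calculation into GetMetricData could look like, the queries below reuse the question's metric and fold the math into one expression. The function names `build_queries` and `fetch_severe_percent` are made up for this example, and I have not run the expression against a live account, so treat it as a starting point rather than a verified recipe.

```python
SEVERE = 25  # EnvironmentHealth value for "Severe", as in the question


def build_queries(environment_name, period=300):
    """Build MetricDataQueries that return the percentage of severe datapoints."""
    return [
        {
            # Raw metric; used only as input to the expression below.
            'Id': 'm1',
            'MetricStat': {
                'Metric': {
                    'Namespace': 'AWS/ElasticBeanstalk',
                    'MetricName': 'EnvironmentHealth',
                    'Dimensions': [
                        {'Name': 'EnvironmentName', 'Value': environment_name}
                    ],
                },
                'Period': period,
                'Stat': 'Maximum',
            },
            'ReturnData': False,
        },
        {
            # m1*0 turns the scalar result into a time series so it can be returned.
            'Id': 'e1',
            'Expression': f'(m1*0 + 1 - AVG(CEIL(ABS(m1-{SEVERE})/MAX(m1))))*100',
            'Label': 'Percentage of severe datapoints',
            'ReturnData': True,
        },
    ]


def fetch_severe_percent(environment_name, start, end):
    """Call GetMetricData and return the first value of e1, or None if empty."""
    import boto3  # imported lazily so build_queries works without boto3 installed

    client = boto3.client('cloudwatch')
    response = client.get_metric_data(
        MetricDataQueries=build_queries(environment_name),
        StartTime=start,
        EndTime=end,
    )
    result = next(r for r in response['MetricDataResults'] if r['Id'] == 'e1')
    return result['Values'][0] if result['Values'] else None
```

Calling `fetch_severe_percent('ServiceA', datetime(2019, 9, 1), datetime(2019, 10, 1))` would then replace the client-side counting, assuming the expression is accepted as written.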
It looks like you need a number that says what percentage of datapoints in the last month had a metric value equal to 25.
You can calculate it like this (this is the source of the graph; you can paste it on the Source tab in the CloudWatch console; make sure the region matches your region and the metric name matches your metric):
    {
        "metrics": [
            [
                "AWS/ElasticBeanstalk",
                "EnvironmentHealth",
                "EnvironmentName",
                "ServiceA",
                {
                    "label": "metric",
                    "id": "m1",
                    "visible": false,
                    "stat": "Maximum"
                }
            ],
            [
                {
                    "expression": "25",
                    "label": "Value for severe",
                    "id": "severe_c",
                    "visible": false
                }
            ],
            [
                {
                    "expression": "m1*0",
                    "label": "Constant 0 time series",
                    "id": "zero_ts",
                    "visible": false
                }
            ],
            [
                {
                    "expression": "1-AVG(CEIL(ABS(m1-severe_c)/MAX(m1)))",
                    "label": "Percentage of times value equals severe",
                    "id": "severe_pct",
                    "visible": false
                }
            ],
            [
                {
                    "expression": "(zero_ts+severe_pct)*100",
                    "label": "Service Uptime",
                    "id": "e1"
                }
            ]
        ],
        "view": "singleValue",
        "stacked": false,
        "region": "eu-west-1",
        "period": 300
    }
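If you would rather create the dashboard from code than paste the source into the console, the graph source can be wrapped in a metric widget and sent with boto3's put_dashboard. This is only a sketch: the dashboard name `service-uptime`, the widget size, and the helper names are assumptions for illustration, and `GRAPH_SOURCE` is a Python copy of the JSON above.

```python
import json

# Python copy of the graph source (note False instead of JSON's false).
GRAPH_SOURCE = {
    'metrics': [
        ['AWS/ElasticBeanstalk', 'EnvironmentHealth', 'EnvironmentName', 'ServiceA',
         {'label': 'metric', 'id': 'm1', 'visible': False, 'stat': 'Maximum'}],
        [{'expression': '25', 'label': 'Value for severe',
          'id': 'severe_c', 'visible': False}],
        [{'expression': 'm1*0', 'label': 'Constant 0 time series',
          'id': 'zero_ts', 'visible': False}],
        [{'expression': '1-AVG(CEIL(ABS(m1-severe_c)/MAX(m1)))',
          'label': 'Percentage of times value equals severe',
          'id': 'severe_pct', 'visible': False}],
        [{'expression': '(zero_ts+severe_pct)*100',
          'label': 'Service Uptime', 'id': 'e1'}],
    ],
    'view': 'singleValue',
    'stacked': False,
    'region': 'eu-west-1',
    'period': 300,
}


def build_dashboard_body(graph_source, width=6, height=3):
    """Wrap a graph source in a single metric widget and serialize it."""
    widget = {
        'type': 'metric',
        'x': 0,
        'y': 0,
        'width': width,    # assumed size, adjust to taste
        'height': height,
        'properties': graph_source,
    }
    return json.dumps({'widgets': [widget]})


def publish_dashboard(name, graph_source):
    """Create or overwrite the dashboard called `name`."""
    import boto3  # imported lazily so the body builder works without boto3

    client = boto3.client('cloudwatch')
    return client.put_dashboard(
        DashboardName=name,
        DashboardBody=build_dashboard_body(graph_source),
    )
```

For example, `publish_dashboard('service-uptime', GRAPH_SOURCE)` would create a dashboard with one single-value widget. Keep in mind that put_dashboard replaces the whole dashboard body if a dashboard with that name already exists.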
To explain what is going on there (the purpose of each element above, by id):
- m1 - the original metric, with the stat set to Maximum.
- severe_c - a constant holding the metric value that means SEVERE (25).
- zero_ts - a constant time series of zeros (m1*0); adding the scalar result to it in e1 produces a time series.
- severe_pct - the actual calculation, step by step:
  - m1-severe_c - sets the datapoints with value equal to SEVERE to 0.
  - ABS(m1-severe_c) - makes all values positive, keeps SEVERE datapoints at 0.
  - ABS(m1-severe_c)/MAX(m1) - dividing by the maximum value ensures that all values are now between 0 and 1.
  - CEIL(ABS(m1-severe_c)/MAX(m1)) - snaps all values that are different from 0 to 1, keeps SEVERE at 0.
  - AVG(CEIL(ABS(m1-severe_c)/MAX(m1))) - because the metric is now all 1s and 0s, with 0 meaning SEVERE, taking the average gives you the fraction of non-severe datapoints.
  - 1-AVG(CEIL(ABS(m1-severe_c)/MAX(m1))) - finally, you need the fraction of severe values, and since values are either severe or not severe, subtracting from 1 gives you the needed number.
- e1 - (zero_ts+severe_pct)*100. Note that this is the only result that you're returning; all other expressions have "visible": false. Because severe_pct is the share of severe datapoints, e1 shows the severe percentage; for an uptime figure, drop the "1-" from severe_pct.
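To check the arithmetic, the steps can be replayed in plain Python on made-up datapoints. This only mimics the expression locally; it does not call CloudWatch, and the sample values and function names are invented for the example.

```python
import math

SEVERE = 25  # EnvironmentHealth value for "Severe", as in the question


def severe_percent(values):
    """Mimic (1-AVG(CEIL(ABS(m1-severe_c)/MAX(m1))))*100 on a list of numbers."""
    peak = max(values)                             # MAX(m1)
    distance = [abs(v - SEVERE) for v in values]   # ABS(m1-severe_c)
    scaled = [d / peak for d in distance]          # ABS(m1-severe_c)/MAX(m1)
    flags = [math.ceil(s) for s in scaled]         # CEIL(...): 0 means severe
    non_severe = sum(flags) / len(flags)           # AVG(...)
    return (1 - non_severe) * 100                  # (1-AVG(...))*100


def severe_percent_by_counting(values):
    """The question's approach: count the datapoints equal to SEVERE."""
    return values.count(SEVERE) / len(values) * 100


sample = [0, 0, 15, 25, 25, 0, 20, 0]      # made-up datapoints, two of them severe
print(severe_percent(sample))              # 25.0
print(severe_percent_by_counting(sample))  # 25.0
print(100 - severe_percent(sample))        # 75.0, the non-severe share
```

Both approaches agree on this sample. In this simulation, a window with no severe datapoint (for example `[0, 15, 0, 15]`) returns -50.0 instead of 0, because dividing by MAX(m1) no longer keeps the values at or below 1. I have not checked how CloudWatch itself behaves in that case, but dividing by severe_c instead of MAX(m1) avoids the problem here.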