I have a simple setup: Fargate ECS cluster with ALB, running web API.
I want to monitor (and ring alarms) the number of requests timed out for my web app. The only metric close to that I found in CloudWatch is AWS/ApplicationELB
-> TargetResponseTime
But, it seems like requests that timed out from the ALB point of view are not recorded there at all.
How do you monitor ALB timeouts?
This answer is only from ALB time out requests point of view.
It is confusing because there is not a specific metric which is termed or contains timeout
.
ALB Timeout generates an HTTP 408 error code for which ALB internally increments the HTTPCode_ELB_4XX_Count
.
From the Docs
The load balancer sends the HTTP code to the client, saves the request to the access log, and increments the HTTPCode_ELB_4XX_Count or HTTPCode_ELB_5XX_Count metric.
In my view you can set up a CloudWatch alarm to monitor HTTPCode_ELB_4XX_Count
metric and initiate an action (such as sending a notification to an email address) if the metric goes outside what you consider an acceptable range.
More details about the HTTPCode_ELB_4XX_Count
-> https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-cloudwatch-metrics.html