amazon-web-services aws-application-load-balancer

Which metric should I use for an alarm: HTTPCode_Target_5XX_Count or HTTPCode_ELB_5XX_Count?


I have an ALB with a single target group (an istio-ingress gateway). I want to capture the scenario where any request routed to this target group returns a 5XX code.

Per docs:

HTTPCode_ELB_5XX_Count:

The number of HTTP 5XX server error codes that originate from the load balancer. This count does not include any response codes generated by the targets.

HTTPCode_Target_5XX_Count:

The number of HTTP response codes generated by the targets. This does not include any response codes generated by the load balancer.

I thought that, since there is only a single target group, the two metrics should be the same. Clearly they are not: for a particular time frame I see some data for HTTP 4XX but none for ELB 4XX. What's the difference? Which one should I use?


Solution

  • I think a diagram would be helpful to explain the difference. After a user sends a request to your backend, this is the path the response takes back to the user:

    Targets (e.g. EC2) -(1)-> ALB -(2)-> user
    
    1. HTTPCode_Target_5XX_Count measures the number of 5XX responses generated by the targets only

    2. HTTPCode_ELB_5XX_Count measures the number of 5XX responses originating from the load balancer only

    Note: HTTPCode_ELB_5XX_Count does not include any response codes generated by the targets, and HTTPCode_Target_5XX_Count does not include response codes originating from the load balancers. [source]

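    Note: A Target_5XX is not included in the ELB_5XX count, even though the ALB forwards the target's error to the client. An ELB_5XX is generated by the load balancer itself, for example a 503 when the target group has no registered targets or a 504 when a target does not respond before the timeout. You can find more details about an ELB_5XX which is not a Target_5XX here.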

    Thanks Omar Kacimi for the correction!
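
Since the two metrics do not overlap, one way to catch any 5XX a client might see is to put an alarm on each of them. The following is a minimal sketch of that idea using boto3, not a definitive setup. The helper names build_5xx_alarm and create_alarms, the alarm names, the threshold, the period, and the load balancer and target group dimension values are all hypothetical placeholders; replace them with your own.

```python
import json


def build_5xx_alarm(metric_name, load_balancer, target_group=None, threshold=1):
    """Build keyword arguments for CloudWatch's put_metric_alarm (hypothetical helper)."""
    # Scope the metric to one load balancer.
    dimensions = [{"Name": "LoadBalancer", "Value": load_balancer}]
    # Optionally narrow a target metric down to a single target group.
    if target_group is not None:
        dimensions.append({"Name": "TargetGroup", "Value": target_group})
    return {
        "AlarmName": f"{metric_name}-alarm",      # assumed naming scheme
        "Namespace": "AWS/ApplicationELB",
        "MetricName": metric_name,
        "Dimensions": dimensions,
        "Statistic": "Sum",                       # these metrics are counts
        "Period": 60,                             # assumed: 1-minute windows
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Assumption: a period with no datapoints means no errors occurred.
        "TreatMissingData": "notBreaching",
    }


def create_alarms(alarms):
    """Send the alarm definitions to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    cloudwatch = boto3.client("cloudwatch")
    for params in alarms:
        cloudwatch.put_metric_alarm(**params)


# Placeholder dimension values; use the ones shown for your own ALB.
LOAD_BALANCER = "app/my-load-balancer/1234567890abcdef"
TARGET_GROUP = "targetgroup/my-targets/1234567890abcdef"

alarms = [
    # 5XX responses produced by the targets behind the ALB.
    build_5xx_alarm("HTTPCode_Target_5XX_Count", LOAD_BALANCER, TARGET_GROUP),
    # 5XX responses produced by the ALB itself.
    build_5xx_alarm("HTTPCode_ELB_5XX_Count", LOAD_BALANCER),
]

if __name__ == "__main__":
    # Print the definitions only; call create_alarms(alarms) to apply them.
    print(json.dumps(alarms, indent=2))
```

Calling create_alarms(alarms) would create both alarms. If you only care about errors returned by the istio-ingress gateway itself, the HTTPCode_Target_5XX_Count alarm alone may be enough.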