Internal NLB won't route to instance X when curling the NLB DNS name from the same instance X


  • I have an internal Network Load Balancer (NLB) (resolving to private IPs)
  • An NLB listener on port 80 points to a target group. The only instance in the target group is 10.141.80.140.

Problem:
When I am on the instance 10.141.80.140 and curl the NLB DNS name, I get no response.
I expect the NLB to route the request to 10.141.80.140, but that doesn't happen.
The NLB only fails to route when I am on 10.141.80.140 itself - routing works from other instances in the same subnet.

Details:

  • The security group on the EC2 instance 10.141.80.140 is wide open, inbound and outbound
  • When I curl the NLB DNS name from another instance, 10.141.80.122, in the same subnet with the same security group and other settings - the NLB routes correctly to 10.141.80.140
  • When I curl the NLB DNS name from 10.141.80.140, the instance the NLB should route to - the NLB does NOT route to 10.141.80.140
  • When I curl the instance ip 10.141.80.140 from the instance 10.141.80.140 - I get a response
  • When I curl the instance ip 10.141.80.140 from the instance 10.141.80.122 - I get a response

Question:
Is there something that prevents the NLB from routing a request back to the instance that sent it, when that instance is in the NLB listener's target group?



Solution

  • This is a well-known behavior. Network Load Balancer introduced the source address preservation feature: the original IP addresses and source ports of incoming connections remain unmodified. When the target answers a request, the VPC internals capture the packet and forward it to the NLB, which forwards it to its destination.

    This behavior has a side effect: when the OS kernel detects that the destination address of an egress packet is one of its own local addresses, it delivers the packet directly to the local application.

    For example, given the following components:

    • We have an internal NLB and a backend instance. Both are deployed in the subnet 10.0.0.0/24.
    • The NLB has the IP 10.0.0.10 and a listener on port 80 that forwards requests to port 8080.
    • The backend instance has the address 10.0.0.55 and has a web server listening on port 8080. It has a security group that allows all the incoming local traffic.

    • If the instance tries to connect to the NLB, the communication flows as follows:

      • The instance wants to telnet to the NLB: it requests a TCP connection to the NLB DNS name on port 80.
        • As it is an outgoing connection, it starts from an ephemeral port; the instance sends a SYN packet (1):
          • Source: 10.0.0.55:40000
          • Destination: 10.0.0.10:80
        • The NLB receives the packet and forwards it to the backend instance (10.0.0.55:8080).
        • Due to the address preservation feature, the backend instance receives a SYN packet with the following information:
          • Source: 10.0.0.55:40000
          • Destination: 10.0.0.55:8080
        • The operating system routes the packet internally (as its destination is the machine itself), and this is where the issue happens:
          • The initiating socket is expecting the SYN_ACK from 10.0.0.10:80 (the NLB).
          • However, it receives the SYN_ACK from 10.0.0.55:8080 (the instance itself).
          • The OS retransmits the SYN several times until the connection attempt times out.
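The flow above can be sketched as a small toy model in Python. This is only an illustration under simplifying assumptions, not how the NLB or the kernel is actually implemented: the names `Endpoint`, `Packet`, `nlb_forward`, `syn_ack_seen_by_client` and `handshake_completes` are made up for this example, and the addresses are the ones from the example above (plus the hypothetical second client 10.0.0.122).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    ip: str
    port: int


@dataclass(frozen=True)
class Packet:
    src: Endpoint
    dst: Endpoint
    flags: str


def nlb_forward(packet, target, nlb_ip, preserve_client_ip):
    """Toy NLB: rewrite the destination to the target, keep or replace the source."""
    if preserve_client_ip:
        src = packet.src  # source address preservation: client IP and port untouched
    else:
        # Simplification: reuse the client's port number on the NLB side.
        src = Endpoint(nlb_ip, packet.src.port)
    return Packet(src=src, dst=target, flags=packet.flags)


def syn_ack_seen_by_client(client, listener, target, preserve_client_ip):
    """Return the SYN-ACK as the initiating socket would see it in this model."""
    syn = Packet(src=client, dst=listener, flags="SYN")
    forwarded = nlb_forward(syn, target, listener.ip, preserve_client_ip)
    reply = Packet(src=forwarded.dst, dst=forwarded.src, flags="SYN-ACK")
    if reply.dst.ip == target.ip:
        # The destination is a local address: the kernel delivers the reply
        # directly, so it never goes back through the NLB.
        return reply
    # Otherwise the reply leaves the instance and returns via the NLB,
    # which presents it to the client as coming from the listener.
    return Packet(src=listener, dst=client, flags="SYN-ACK")


def handshake_completes(client, listener, target, preserve_client_ip):
    reply = syn_ack_seen_by_client(client, listener, target, preserve_client_ip)
    return reply.src == listener and reply.dst == client


if __name__ == "__main__":
    listener = Endpoint("10.0.0.10", 80)
    target = Endpoint("10.0.0.55", 8080)
    for client_ip in ("10.0.0.122", "10.0.0.55"):
        client = Endpoint(client_ip, 40000)
        print(client_ip, handshake_completes(client, listener, target, True))
```

In this model only the case where the client is the target itself and the source address is preserved fails, because the SYN-ACK arrives from 10.0.0.55:8080 instead of 10.0.0.10:80.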

    This does not happen with a public NLB, because the instance's traffic has to be NATed in the VPC so that it reaches the NLB from the instance's public IP address. The kernel therefore does not deliver the packet internally.

    Finally, a possible workaround is to register the backends by their IP address rather than by their instance ID. With this method, the traffic forwarded by the NLB carries the NLB's internal IP as the source IP, which disables the "source address preservation" feature. Unfortunately, if you launch instances with an Auto Scaling group, it can only register the launched instances by their instance ID. In the case of ECS tasks, configuring the network mode as "awsvpc" forces the NLB to register each target by its IP.
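As a rough sketch of that workaround, the following Python builds the parameters for an IP-type target group and registers the backend by its private IP using boto3's `elbv2` client (`create_target_group` and `register_targets`). The helper names, the target group name and the VPC ID are hypothetical placeholders, and the snippet assumes AWS credentials and a region are already configured; treat it as a starting point rather than a complete setup (the listener still has to be pointed at the new target group).

```python
def ip_target_group_params(name, vpc_id, port):
    """Parameters for a TCP target group whose targets are registered by IP."""
    return {
        "Name": name,
        "Protocol": "TCP",
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "ip",  # register by IP address instead of instance ID
    }


def ip_targets(private_ips, port):
    """Target descriptions keyed by private IP address."""
    return [{"Id": ip, "Port": port} for ip in private_ips]


def register_by_ip(name, vpc_id, port, private_ips):
    """Create the target group and register the IPs; returns the target group ARN."""
    import boto3  # imported lazily so the helpers above work without the SDK

    elbv2 = boto3.client("elbv2")
    created = elbv2.create_target_group(**ip_target_group_params(name, vpc_id, port))
    arn = created["TargetGroups"][0]["TargetGroupArn"]
    elbv2.register_targets(TargetGroupArn=arn, Targets=ip_targets(private_ips, port))
    return arn


if __name__ == "__main__":
    # Hypothetical values: replace the name and VPC ID with your own.
    print(register_by_ip("backend-by-ip", "vpc-REPLACE_ME", 8080, ["10.0.0.55"]))
```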