Resolve user IP from downstream proxies

Until now we've been running a single nginx reverse proxy between the internet and our microservices, with this config:

proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

That passed requests through with headers like:

User        -> ALB [nginx]            -> App Servers
IP: 1.2.3.4    IP: 172.31.1.1            IP: n/a
               Forwarded-For: 1.2.3.4    Real-IP: 1.2.3.4
                                         Forwarded-For: 1.2.3.4, 172.31.1.1

Now that we need to scale out the ALBs behind an Elastic LB, we're finding the extra proxy layer problematic, e.g.:

User        -> ELB                    -> ALB [nginx]                        -> App Servers
IP: 1.2.3.4    IP: 172.31.1.2            IP: 172.31.1.1                        IP: n/a
               Forwarded-For: 1.2.3.4    Forwarded-For: 1.2.3.4, 172.31.1.2    Real-IP: 172.31.1.2
                                                                               Forwarded-For: 1.2.3.4, 172.31.1.2

As you can see, X-Real-IP: now just gets set to the ELB's IP.
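To make the hop-by-hop behaviour concrete, here is a rough Python simulation of the two-proxy chain. The helper name proxy_add_x_forwarded_for is only borrowed from the nginx variable for illustration, and the sketch assumes the ELB appends the connecting address to X-Forwarded-For the same way nginx does.

```python
from typing import Optional


def proxy_add_x_forwarded_for(incoming_xff: Optional[str], remote_addr: str) -> str:
    """Mimic nginx's $proxy_add_x_forwarded_for: append the peer address to the
    incoming header, or use the peer address alone if no header came in."""
    if incoming_xff:
        return f"{incoming_xff}, {remote_addr}"
    return remote_addr


# Hop 1: the ELB sees the user (1.2.3.4) as its peer and no incoming header.
# Assumption: the ELB appends the peer address just like nginx would.
xff_from_elb = proxy_add_x_forwarded_for(None, "1.2.3.4")

# Hop 2: nginx sees the ELB (172.31.1.2) as its peer, so $remote_addr is the ELB.
nginx_remote_addr = "172.31.1.2"
headers_to_app = {
    "X-Real-IP": nginx_remote_addr,
    "X-Forwarded-For": proxy_add_x_forwarded_for(xff_from_elb, nginx_remote_addr),
}

print(headers_to_app)
```

The point of the sketch is that $remote_addr is always the directly connected peer, so once a second proxy sits in front of nginx, X-Real-IP can only ever be that proxy's address.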

We need to be able to strip off the trusted proxies and send the proper User IP in the X-Real-IP header, as well as log the User IP rather than the ELB IP.

The GeoIP module has the geoip_proxy directives that define trusted proxies to ignore when determining the "true" IP, and I have to wonder if there's something similar in nginx or some other way to accomplish this?

TIA

Solution

Well the short answer is that there's not a simple config directive for this, and there's not a 100% bulletproof way to only trust certain proxies.

Let’s construct an example. My IP is 1.2.3.4, but I'm malicious and I want to pretend that I’m 5.6.7.8. I inject my own X-Forwarded-For: header via browser extension and the nginx box gets:

X-Forwarded-For: 5.6.7.8, 1.2.3.4, 172.31.1.1, 172.31.1.2

1. Simple, but flawed

map $proxy_add_x_forwarded_for $x_real_ip {
  "~^([^,]+).*" $1;
  default $remote_addr;
}

All this does is peel the first IP address off the X-Forwarded-For: header, which is all well and good if you don’t mind users spoofing other IP addresses via header injection. With the above example this method returns the spoofed 5.6.7.8.
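If you want to poke at this outside nginx, the same pattern can be tried in Python. This is only a sketch: the function name first_hop_ip is made up for illustration, and it assumes Python's re module treats this particular pattern the same way nginx's PCRE does.

```python
import re

# Same pattern as the map block: capture everything before the first comma.
FIRST_IP = re.compile(r"^([^,]+).*")


def first_hop_ip(forwarded_for: str, remote_addr: str) -> str:
    """Return the left-most entry of the header, or remote_addr when the
    pattern does not match (the map's 'default' branch)."""
    match = FIRST_IP.search(forwarded_for)
    return match.group(1) if match else remote_addr


# The spoofed header from the example above.
header = "5.6.7.8, 1.2.3.4, 172.31.1.1, 172.31.1.2"
print(first_hop_ip(header, "172.31.1.2"))  # prints 5.6.7.8
```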

2. A bit complex, but acceptable for most cases

Ideally we only want to strip off the two trusted proxies and use the first untrusted IP to their left. For this we’re going to have to get a little creative with the regular expressions:

map $proxy_add_x_forwarded_for $x_real_ip {
  "~(?:^|, )([^,]+), (?:10\.|172\.(?:1[6-9]|2[0-9]|3[01])\.).*" $1;
  default $remote_addr;
}

About half of that regex is just dealing with the fact that the 172.16.0.0/12 network doesn't split cleanly on an octet boundary, but it does the trick. For the above example it correctly returns 1.2.3.4.

However, if an outside attacker somehow knows that this kludgey config is in use and also knows what the trusted networks are they could set the following header to get around it:

X-Forwarded-For: 5.6.7.8, 172.16.0.1

Which ends up at the proxy as:

X-Forwarded-For: 5.6.7.8, 172.16.0.1, 1.2.3.4, 172.31.1.1, 172.31.1.2

Since that regex is essentially reading left-to-right and returning the IP to the left of the first trusted IP, in this specific case it will return the spoofed 5.6.7.8 IP. However, this is quite a corner case and is acceptable for my particular use, YMMV.
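Both outcomes can be reproduced with a quick Python sketch. The function name regex_real_ip and the variable names are made up for illustration, and it assumes Python's re module handles this pattern the same way nginx's PCRE does.

```python
import re

# Same pattern as the map block in option 2.
TRUSTED_NEIGHBOUR = re.compile(
    r"(?:^|, )([^,]+), (?:10\.|172\.(?:1[6-9]|2[0-9]|3[01])\.).*"
)


def regex_real_ip(forwarded_for: str, remote_addr: str) -> str:
    """Return the entry just left of the first trusted-looking address,
    or remote_addr when nothing matches (the map's 'default' branch)."""
    match = TRUSTED_NEIGHBOUR.search(forwarded_for)
    return match.group(1) if match else remote_addr


# Header from the first spoofing example.
simple_spoof = "5.6.7.8, 1.2.3.4, 172.31.1.1, 172.31.1.2"
# Header where the attacker also injects a fake "trusted" hop.
crafted_spoof = "5.6.7.8, 172.16.0.1, 1.2.3.4, 172.31.1.1, 172.31.1.2"

print(regex_real_ip(simple_spoof, "172.31.1.2"))   # prints 1.2.3.4
print(regex_real_ip(crafted_spoof, "172.31.1.2"))  # prints 5.6.7.8
```

The second result is the corner case described above: the search stops at the left-most address that looks trusted, and the attacker controls everything to the left of their own IP.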

Caveat: You may get an error saying “you should increase map_hash_bucket_size”, which means that you need to increase that value to accommodate that bigass regex. However, the docs on that are a bit fiddly and talk about “alignment” being important, so if you’ve not already set that value somewhere else I would suggest doubling the value referenced in the message. In my case I doubled it from 64 to 128.

3. An actual proper, bulletproof solution

IMHO this requires actually parsing the header and applying real logic, so it would either need to be patched into nginx or written into a module, essentially porting the same logic that the GeoIP module uses for geoip_proxy and geoip_proxy_recursive.

Alternatively, you could make your application proxy-aware and implement the logic there. If you know how to properly wrangle IPs and subnets it should be a cinch. Unfortunately I don't have that option available to me in this case.
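For the application-side route, a minimal Python sketch of that logic might look like the following. It is not the GeoIP module's code, just the same general idea: walk the header from the right, skip trusted proxies, and stop at the first untrusted address. The names TRUSTED_NETWORKS, is_trusted and client_ip are hypothetical, and the trusted ranges simply mirror the ones the regex above assumes.

```python
import ipaddress

# Assumption: the same private ranges the option 2 regex treats as trusted.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
]


def is_trusted(addr: str) -> bool:
    """True if addr parses as an IP inside one of the trusted networks."""
    try:
        ip = ipaddress.ip_address(addr.strip())
    except ValueError:
        return False  # unparseable entries are never trusted
    return any(ip in net for net in TRUSTED_NETWORKS)


def client_ip(forwarded_for: str, remote_addr: str) -> str:
    """Walk X-Forwarded-For right to left and return the first untrusted hop."""
    # Only believe the header at all if the direct peer is a trusted proxy.
    if not is_trusted(remote_addr):
        return remote_addr
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every hop was trusted, so fall back to the direct peer.
    return remote_addr


crafted_spoof = "5.6.7.8, 172.16.0.1, 1.2.3.4, 172.31.1.1, 172.31.1.2"
print(client_ip(crafted_spoof, "172.31.1.2"))  # prints 1.2.3.4
```

Reading from the right is what defeats the crafted header: anything the attacker injects sits to the left of the address that the first trusted proxy recorded for them.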

Thanks: If it weren't for membear on IRC reminding me that regex capture groups are valid inside map{} blocks I'd probably still be spinning my wheels.

Shameless Plug: I originally wrote a blog post on this before I remembered that I asked here. It's mostly the same, but also has a more detailed breakdown of that big regex.