
HAProxy ACL not matching user-agent in file


I'm trying to set up HAProxy to deny requests from a user-agent blacklist, but it's not working.

I'm following the approach explained in this answer: https://serverfault.com/a/616974/415429 (it is the same example given in the official HAProxy docs: http://cbonte.github.io/haproxy-dconv/1.8/configuration.html#7)

The example in the official docs is: acl valid-ua hdr(user-agent) -f exact-ua.lst -i -f generic-ua.lst test

I've created a demo to simulate the issue of the user agent not working in a condition:

defaults
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend frontend
    mode http
    bind :80

    acl  is_blacklisted_ua  hdr(User-Agent) -f /path/to/ua-blacklist.lst
    http-request deny deny_status 403 if is_blacklisted_ua

    http-request deny deny_status 404

Then, if I open localhost:8080 in the browser, it returns status 404 instead of 403 (HAProxy runs in a Docker container that forwards port 8080 to 80).

The file /path/to/ua-blacklist.lst is just:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36
localhost:8080

in which the 1st line is my user agent (and the 2nd line is to test with the Host header, as explained below). I can see it in the Chrome inspector, and if I capture the user-agent header in haproxy and log it, I can see it too (it's exactly the same).

But if I change (only for testing purposes):

acl  is_blacklisted_ua  hdr(User-Agent) -f /path/to/ua-blacklist.lst

To:

acl  is_blacklisted_ua  hdr(Host) -f /path/to/ua-blacklist.lst

to use the Host header, it returns a 403 status code (it works because it matches the 2nd line).

Then, if I change the 2nd line in the file from localhost:8080 to localhost:8081, it returns 404 (as expected).

So it seems the user-agent header is not retrieved correctly, or can't be matched against the provided values (I even tried capturing it to see if there was some difference, to no avail). The Host header works, though.

I also tried hdr(user-agent) (in lowercase), as well as some combinations like hdr_sub instead of hdr and the -i option for case insensitivity, but none of these attempts worked either. I'm sure the user-agent value in the file is correct.

Update (2021-03-12)

I was able to make it work defining a user agent string and running curl with that user agent:

$ curl -o /dev/null -s -w "%{http_code}\n" -H "user-agent: test-ua" http://localhost:8080
403

I also tried a user agent with spaces test-ua space and it worked too, but using the user agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36 didn't return 403.

I'll keep digging to see if I can solve it.

Any suggestions?


Solution

  • I dug into the problem using curl and got the desired 403 in the cases I tested.

    Then I tried the entire user agent and it didn't work, so I started with just the beginning of the user agent and added the other parts one at a time, until I included (KHTML, like Gecko) and it stopped working.

    So I ended up discovering that commas were the problem (confirmed with curl and a very simple test,comma user agent). It turns out that hdr treats commas in a header value as delimiters between separate values, so a list file line that contains a comma never matches, as per:

    https://discourse.haproxy.org/t/comma-in-acl-list-file/2333/2

    I solved it with the fix from the above answer: use req.fhdr(User-Agent) instead of hdr(User-Agent).

    According to the docs:

    [...] It differs from req.hdr() in that any commas present in the value are returned and are not used as delimiters. This is sometimes useful with headers such as User-Agent.
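To make the quoted behavior concrete, here is a rough shell simulation of the comma splitting. It is only a sketch: split_like_hdr is a hypothetical helper written for this illustration, not an HAProxy command, and it ignores details such as quoted strings. It shows that the browser user agent ends up as two separate values, neither of which equals the full line stored in the list file.

```shell
# Rough simulation only: split a header value on commas and trim the
# surrounding whitespace, approximating what hdr() passes to the ACL matcher.
# split_like_hdr is a hypothetical helper, not part of HAProxy.
split_like_hdr() {
    printf '%s\n' "$1" | tr ',' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

ua='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36'

# hdr(User-Agent): the value is cut at the comma inside "(KHTML, like Gecko)",
# so neither piece equals the full line in the list file.
split_like_hdr "$ua"

# req.fhdr(User-Agent): the whole value, comma included, is compared as one string.
printf '%s\n' "$ua"
```

This would also explain why test-ua and test-ua space matched in the question: they contain no comma, so there is nothing to split.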

    (It's odd that the official list-file example uses the user-agent header with hdr instead of req.fhdr; that can be misleading, and it was for me, at least.)
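Putting the fix together, the frontend from the question might look like the sketch below. Only the fetch in the acl line changes; the file path and the rest of the config are taken from the question as-is, and the comments are added here for explanation.

```haproxy
frontend frontend
    mode http
    bind :80

    # req.fhdr returns the full header value, commas included, so each line
    # of the list file is compared against the whole User-Agent string
    acl  is_blacklisted_ua  req.fhdr(User-Agent) -f /path/to/ua-blacklist.lst
    http-request deny deny_status 403 if is_blacklisted_ua

    http-request deny deny_status 404
```

With this in place, and with a line containing test,comma in the list file, repeating the curl check from the question with -H "user-agent: test,comma" should return 403 instead of 404.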