I'm trying to setup HAProxy to deny requests from a user-agent blacklist, but it's not working.
I'm doing as explained in the answer: https://serverfault.com/a/616974/415429 (that is the example given in the official HAProxy docs: http://cbonte.github.io/haproxy-dconv/1.8/configuration.html#7)
The example in the official docs is: acl valid-ua hdr(user-agent) -f exact-ua.lst -i -f generic-ua.lst test
I've created a demo to simulate the issue of the user agent not working in a condition:
defaults
timeout connect 5s
timeout client 30s
timeout server 30s
frontend frontend
mode http
bind :80
acl is_blacklisted_ua hdr(User-Agent) -f /path/to/ua-blacklist.lst
http-request deny deny_status 403 if is_blacklisted_ua
http-request deny deny_status 404
Then if I access the browser at localhost:8080
it returns the status 404
instead of 403
(haproxy is in a docker container that forwards port 8080
to 80
)
The file /path/to/ua-blacklist.lst
is just:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36
localhost:8080
in which the 1st line is my user agent (and the 2nd line is to test with the Host
header, as explained below). I can see it in the Chrome inspector, and if I capture the user-agent
header in haproxy and log it, I can see it too (it's exactly the same).
But if I change (only for testing purposes):
acl is_blacklisted_ua hdr(User-Agent) -f /path/to/ua-blacklist.lst
To:
acl is_blacklisted_ua hdr(Host) -f /path/to/ua-blacklist.lst
to use the Host
header. It then gives 403
status code (it works, because it matches with the 2nd line).
Then if I change the 2nd line localhost:8080
to localhost:8081
in the file it then gives 404
(as expected).
So, it seems the user agent header is not retrieved correctly, or can't match the provided values (I even tried to capture it to see if there's some difference, to no avail). The Host
header works, tough.
I also tried to use hdr(user-agent)
(in lowercase), as well as some combinations like hdr_sub
instead of hdr
, and the -i
option for case insensitivity, but all these attempts didn't work too. I'm sure the user-agent
value in the file is correct.
Update (2021-03-12)
I was able to make it work defining a user agent string and running curl with that user agent:
$ curl -o /dev/null -s -w "%{http_code}\n" -H "user-agent: test-ua" http://localhost:8080
403
I also tried a user agent with spaces test-ua space
and it worked too, but using the user agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36
didn't return 403
.
I'll try to dig it more to see if I can solve.
Any suggestions?
I tried to dig the problem using curl, and I saw that I achieved the desired effect of returning 403 in the cases I tested.
Then I tried to use the entire user agent and it didn't work, so I tried to use just the beginning of the user agent and included the other parts, until when I included (KHTML, like Gecko)
and it stopped working.
So I ended up discovering that commas weren't working (tested with curl and a very simple test,comma
user agent). Then I found that HAProxy handles commas in list files differently, as per:
https://discourse.haproxy.org/t/comma-in-acl-list-file/2333/2
I was able to solve it with the solution provided in the above answer, using req.fhdr: req.fhdr(User-Agent)
instead of hdr(User-Agent)
.
According to the docs:
[...] It differs from req.hdr() in that any commas present in the value are returned and are not used as delimiters. This is sometimes useful with headers such as User-Agent.
(Weird that the official example of list files uses the user-agent header with hdr instead of req.fhdr, which can be misleading, and it was, for me at least)