
failregex misses entries


I need to match ›page not found‹ (404) log entries like this one:

185.220.100.252 - - [13/May/2022:10:03:58 +0200] "GET /EXPLOIT.php HTTP/1.1" 404 14780 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"

This failregex basically works

^<HOST> -\s*- \[.*\] "GET .*" 404 \d+ "-" ".*"$

and finds 8900 out of 30k entries. I'm testing with

fail2ban-regex /var/log/apache2/scienceblog.at.access.log '^<HOST> -\s*- \[.*\] "GET .*" 404 \d+ "-" ".*"$'

And so does

^<HOST> -\s*- \[.*.*\] "GET .*" 404 \d+ "-" ".*"$

But when I try to be more specific between the square brackets, as in any of

^<HOST> -\s*- \[.*\d.*\] "GET .*" 404 \d+ "-" ".*"$
^<HOST> -\s*- \[.*\s.*\] "GET .*" 404 \d+ "-" ".*"$
^<HOST> -\s*- \[.* .*\] "GET .*" 404 \d+ "-" ".*"$
^<HOST> -\s*- \[\d.*\] "GET .*" 404 \d+ "-" ".*"$
^<HOST> -\s*- \[.*0200\] "GET .*" 404 \d+ "-" ".*"$
^<HOST> -\s*- \[.* .*\] "GET .*" 404 \d+ "-" ".*"$

or anything else (let alone a regex that matches the whole date string), the filter doesn't find a single log entry, and I can't figure out why. I've read what I could find on fail2ban-regex here and elsewhere, but to no avail.


Solution

  • fail2ban applies the failregex to the log entry after the date has been removed, so for your example

    185.220.100.252 - - [13/May/2022:10:03:58 +0200] "GET /EXPLOIT.php HTTP/1.1" 404 14780 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
    

    fail2ban has extracted the date on its own

    13/May/2022:10:03:58 +0200

    and removed it from the log entry, so it is actually matching your regex against

    185.220.100.252 - - [] "GET /EXPLOIT.php HTTP/1.1" 404 14780 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
    

    The regexes that worked for you do so because \[.*\] and \[.*.*\] both match the empty [], whereas the others each require at least one specific character (a digit, a space, 0200) between the brackets, and after the date is removed there is nothing there.

    In my opinion this is not at all intuitive, since the "Missed line(s)" output still shows the date:

    Lines: 1 lines, 0 ignored, 0 matched, 1 missed
    [processed in 0.01 sec]
    
    |- Missed line(s):
    |  185.220.100.252 - - [13/May/2022:10:03:58 +0200] "GET /EXPLOIT.php HTTP/1.1" 404 14780 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
    

    But you can verify that this is what happens, since the following failregex matches successfully:

    '^<HOST> -\s*- \[\] "GET .*" 404 \d+ "-" ".*"$'
    

    Further reading:

    https://dee.underscore.world/blog/fail2ban-filters/
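
The date-stripping behaviour described in this answer can be reproduced outside fail2ban with a short Python sketch. It is only an approximation: `DATE_RE` and `HOST_RE` are hypothetical, simplified stand-ins for fail2ban's own date detection and its `<HOST>` tag, and `strip_date` and `matches` are illustrative helper names, not fail2ban functions. The sketch removes the timestamp from the sample line and then tries a few of the patterns from the question.

```python
import re

# Sample access-log line from the question.
LINE = ('185.220.100.252 - - [13/May/2022:10:03:58 +0200] '
        '"GET /EXPLOIT.php HTTP/1.1" 404 14780 "-" '
        '"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 '
        '(KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"')

# Assumption: simplified stand-in for fail2ban's date detection,
# covering only the Apache timestamp layout seen in this log.
DATE_RE = re.compile(r'\d{2}/[A-Z][a-z]{2}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}')

# Assumption: simplified IPv4-only stand-in for the <HOST> tag.
HOST_RE = r'(?P<host>\d{1,3}(?:\.\d{1,3}){3})'


def strip_date(line):
    """Remove the first timestamp, leaving the surrounding brackets in place."""
    return DATE_RE.sub('', line, count=1)


def matches(failregex, line):
    """Expand <HOST> and test the pattern against the date-stripped line."""
    pattern = failregex.replace('<HOST>', HOST_RE)
    return re.search(pattern, strip_date(line)) is not None


CANDIDATES = [
    r'^<HOST> -\s*- \[.*\] "GET .*" 404 \d+ "-" ".*"$',
    r'^<HOST> -\s*- \[\d.*\] "GET .*" 404 \d+ "-" ".*"$',
    r'^<HOST> -\s*- \[.*0200\] "GET .*" 404 \d+ "-" ".*"$',
    r'^<HOST> -\s*- \[\] "GET .*" 404 \d+ "-" ".*"$',
]

if __name__ == '__main__':
    print(strip_date(LINE))
    for rx in CANDIDATES:
        print(matches(rx, LINE), rx)
```

On the sample line, only the first and last patterns should report a match, because they are the only ones that accept an empty pair of brackets.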