
Strange behaviour when handling the dot character with ngx.re.match in OpenResty


I am having trouble understanding how to escape "special" characters in Lua. I read that a dot can be escaped by putting % before it. The strange behaviour appears when I use ngx.re.match from OpenResty's Lua nginx module, which matches regular expressions:

local path = "/offers/xyz:req:test0030-10-title:co/test"

local regex = "^/offers/([0-9a-zA-Z_:%-%.]+)/test$"

local matches = ngx.re.match(path, regex) -- returns nil

I don't understand why this works correctly when I move %. before %-.

Can anyone help me understand that?


Solution

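  • ngx.re.match uses the PCRE regex library, whereas Lua's built-in string functions use Lua patterns, which are far more limited. In PCRE, % is an ordinary character, not an escape.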

    In your case, %-% creates a range from % to % (i.e. %-% matches only a literal %), which means the [0-9a-zA-Z_:%-%.]+ pattern does not match a hyphen at all.

    You need

    ^/offers/([0-9a-zA-Z_:.-]+)/test$
                          ^^
    

    At the end of a character class, - denotes a literal - char, which is also why your pattern works once you move %. before %-. You never need to escape a . inside a character class.

    You may also test your PCRE patterns at regex101.com.
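
The difference between the two pattern dialects can be sketched as below. This is a minimal illustration, not a definitive test: the variable names (lua_pattern, broken, fixed) are made up for the example, the "jo" options are an optional assumption, and the ngx.re.match part only runs where the ngx global exists (i.e. inside OpenResty).

```lua
local path = "/offers/xyz:req:test0030-10-title:co/test"

-- Lua patterns (string.match): % is the escape character,
-- so %- is a literal hyphen and %. is a literal dot.
local lua_pattern = "^/offers/([0-9a-zA-Z_:%-%.]+)/test$"
print(string.match(path, lua_pattern))

-- PCRE (ngx.re.match): % is an ordinary character, so %-% is a
-- one-character range and the class contains no hyphen at all.
-- This branch is skipped when run outside OpenResty.
if ngx then
    local broken = "^/offers/([0-9a-zA-Z_:%-%.]+)/test$"
    local fixed  = "^/offers/([0-9a-zA-Z_:.-]+)/test$"

    local m1 = ngx.re.match(path, broken, "jo")
    local m2 = ngx.re.match(path, fixed, "jo")

    print(m1 == nil)      -- expected true: the hyphens in the path cannot match
    print(m2 and m2[1])   -- expected to be the captured path segment
end
```

With plain Lua, only the string.match line runs, and it captures the segment between /offers/ and /test, because there the original %-%. escapes mean what the question assumed.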