nginx: Regex match the middle of a URL and also the ending?


I have an app on a domain that is set by a developer to proxy at certain URLs:

example.com/browser/123foo0/stuff.js

for example, where 123foo0 is some random key. The key may also change length in future.

That's all fine.

But I'd like to intercept specific requests and not proxy them. For example, I don't want to serve anything under the /welcome path that follows the key, i.e. none of these should be proxied:

example.com/browser/123foo0/welcome/welcome.html
example.com/browser/foo456b/welcome/welcome.css
example.com/browser/bar123f/welcome/welcome.js
example.com/browser/456foob/welcome/other.stuff
example.com/browser/foo789b/welcome/

So I tried simple stuff first like: location ^~ /browser/.*/welcome/welcome.html {... and location ~* .*/welcome/ {... but couldn't even get that working, before moving on to try capturing groups like css files and scripts and so on.

I also tried putting regex in quotes, but that didn't seem to work either.

What am I doing wrong?

Here's a truncated version of the conf, with the location blocks only:

    location ^~ "/browser/.*/welcome/welcome.html" {
        return 200 'Not proxied.\n';
        add_header Content-Type text/plain;
    }

    location ^~ /browser {
        proxy_pass http://127.0.0.1:1234;
        proxy_set_header Host $http_host;
    }

    # landing page
    location / {
      root /var/www/foobar;
      index index.html;
      try_files $uri $uri/ /index.html;
    }

Edit

I've reviewed the documentation on how nginx selects a location, but unfortunately I didn't find it particularly clear or helpful. What am I missing?

I thought this rule in question would match and take precedence over the latter /browser rule, because of this line in the documentation:

If the longest matching prefix location has the “^~” modifier then regular expressions are not checked.

i.e. because this rule in question comes first and it is longer than the latter /browser rule, a match would occur here and not later (because processing stops here)?

But this is also confusing because I also tried ~* [pattern] instead of the priority prefix ^~ [pattern] and that didn't work either...


Solution

  • You have:

    location ^~ "/browser/.*/welcome/welcome.html" { ... }
    location ^~ /browser { ... }
    

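    which is wrong in two ways:

    1. /browser/.*/welcome/welcome.html is a regular expression (because of the .*), so it can only appear in a location statement with the ~ or ~* modifier. With ^~ the string is compared literally as a prefix, so the .* is never interpreted as a wildcard.

    2. location ^~ /browser tells nginx to skip regular expression locations whenever /browser is the longest matching prefix. Only a longer matching prefix location could override it, and you cannot write one here because of the .*. This would also explain why your ~* attempt did not match.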


    A better solution would be:

    location ~ ^/browser/[^/]+/welcome/welcome\.html$ { ... }
    location /browser { ... }
    

    This uses a proper regular expression location and removes the ^~ modifier from the prefix location, so nginx still checks the regular expression even though /browser matches as a prefix.
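
    Note that the regular expression above only covers welcome.html. To block everything under /welcome/, as listed in the question, a broader pattern could look like the sketch below. It assumes a 404 is an acceptable response for the blocked paths (the question used return 200 with a text body instead) and reuses the proxy settings from the question. Treat it as a starting point rather than a tested config.

    ```nginx
    # Regex location: any key segment (one or more non-slash characters),
    # followed by /welcome/ and anything after it.
    location ~ ^/browser/[^/]+/welcome/ {
        return 404;  # assumption: swap in whatever response suits you
    }

    # Plain prefix location (no ^~), so nginx still checks the regex above.
    location /browser {
        proxy_pass http://127.0.0.1:1234;
        proxy_set_header Host $http_host;
    }
    ```

    Because [^/]+ matches any run of non-slash characters, the pattern should keep working if the key changes length.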