Why does the rule stop working when I expand a character class?


I have these test URLs:

http://host.com/a/123/321/123
http://host.com/A/12z3/3G21
http://host.com/a/123
http://host.com/A_B/12z3/3G21
http://host.com/A_B1/12z3/3G21
http://host.com/A-B1/12z3/3G21

And this rule to parse them:

/^[a-z]+:\/\/(?<host>.+)\/(?<uid>[a-z]+)(\/(?<var1>\w+))(\/(?<var2>\w+))?(\/(?<var3>\w+))?\/?$/i

With this rule, the first three URLs are parsed correctly.

But when I extend (?<uid>[a-z]+) with more characters, (?<uid>[a-z0-9-_]+), I lose the last two captured groups on the URLs with an underscore or a dash.

What's wrong with my rule?


Solution

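  • The problem is that the host group, (?<host>.+), is greedy and can match forward slashes, so it takes in the UID as part of its own match. With (?<uid>[a-z]+) this went unnoticed, because a segment containing digits, such as 12z3, could not be a uid, and the engine was forced to backtrack until the host ended at host.com. Once uid also accepts digits, 12z3 qualifies as a uid, so the match succeeds with host.com/A_B as the host and the last two groups are left empty. So what you can do is have the host stop when it reaches the first forward slash, and not let it go any further than that.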

    That will allow (?<uid>[a-z0-9-_]+) to work as you want it to.

    For the host, use this: (?<host>.[^\/]+)

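To see the difference between the two host groups side by side, here is a minimal sketch. The question does not say which language the rule runs in, so Python is an assumption here: its `re` module writes named groups as `(?P<name>...)` instead of `(?<name>...)`, and the `/i` flag becomes `re.IGNORECASE`. The names `TAIL`, `GREEDY`, `FIXED` and `parse` are made up for the illustration.

```python
import re

# Shared tail of the rule, using the extended uid class from the question.
TAIL = r"/(?P<uid>[a-z0-9-_]+)(/(?P<var1>\w+))(/(?P<var2>\w+))?(/(?P<var3>\w+))?/?$"

# Original host group: .+ is greedy and is allowed to cross slashes.
GREEDY = re.compile(r"^[a-z]+://(?P<host>.+)" + TAIL, re.IGNORECASE)

# Host group from this answer: it stops at the first forward slash.
FIXED = re.compile(r"^[a-z]+://(?P<host>.[^/]+)" + TAIL, re.IGNORECASE)


def parse(pattern, url):
    """Return the named groups as a dict, or None if the URL does not match."""
    match = pattern.match(url)
    return match.groupdict() if match else None


url = "http://host.com/A_B/12z3/3G21"

# The greedy host swallows the real uid; var2 and var3 stay empty.
print(parse(GREEDY, url))
# host='host.com/A_B', uid='12z3', var1='3G21', var2=None, var3=None

# With the bounded host, every segment lands in the intended group.
print(parse(FIXED, url))
# host='host.com', uid='A_B', var1='12z3', var2='3G21', var3=None
```

The leading `.` in the host pattern is kept as written above; `[^/]+` on its own would also stop at the first slash.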