Tags: javascript, regex, regex-greedy, capturing-group

Non-greedy capturing parenthesis


I have the string mysql://user:pw@host/db?reconnect=true and the following (incorrect) regex: /^mysql:\/\/(.+):(.+)@(.+)\/(.+)\??.*$/

These are the matches I get:

["user", "pw", "host", "db?reconnect=true"]

The only problematic match is "db?reconnect=true", which I intend to be "db"

I have tried non-greedy quantifiers, both on the "?" after "db" and after the last capturing group, with no success. It seems like the last capturing group is greedy no matter what. Is there even a solution for this?

Cheers!


Solution

  • All of your quantifiers are greedy; appending ? to a quantifier (as in .+?) makes it non-greedy. Be careful in this case, though: unless the pattern forces the query string to be matched as a separate unit, a non-greedy group will stop after its first character and drop the b in db too. There are two decent options here:

    1. Explicitly non-greedy: /^mysql:\/\/(.+):(.+)@(.+)\/(.+?)(?:\?.*)?$/ (The \? has to be grouped with the rest of the query string. If the \? is optional on its own, the non-greedy group stops after one character, the optional \? matches nothing, and the trailing greedy .* swallows everything else.)
    2. Greedy, but with ? excluded from what the group is willing to match: /^mysql:\/\/(.+):(.+)@(.+)\/([^?]+)(?:\?.*)?$/ The first ? in a URL marks the start of the query string, so swapping .+ for [^?]+ captures everything up to that ?
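
To see the difference side by side, here is a minimal sketch that runs the question's regex, a naive non-greedy variant, and the two fixes above against the sample string. The helper name groups and the variable names are made up for illustration; the naiveLazy pattern is my assumption of what "just adding ?" would look like.

```javascript
// Sample input taken from the question.
const url = "mysql://user:pw@host/db?reconnect=true";

// The regex from the question: the last group is greedy.
const original = /^mysql:\/\/(.+):(.+)@(.+)\/(.+)\??.*$/;
// Assumed naive attempt: lazy group, but \? is still optional on its own.
const naiveLazy = /^mysql:\/\/(.+):(.+)@(.+)\/(.+?)\??.*$/;
// Option 1: lazy group, with \? grouped together with the rest of the query.
const lazyGrouped = /^mysql:\/\/(.+):(.+)@(.+)\/(.+?)(?:\?.*)?$/;
// Option 2: greedy group that refuses to match "?".
const negatedClass = /^mysql:\/\/(.+):(.+)@(.+)\/([^?]+)(?:\?.*)?$/;

// Illustrative helper: return only the capture groups, or null on no match.
function groups(re, s) {
  const m = s.match(re);
  return m ? m.slice(1) : null;
}

console.log(groups(original, url));     // last group: "db?reconnect=true"
console.log(groups(naiveLazy, url));    // last group: "d"
console.log(groups(lazyGrouped, url));  // last group: "db"
console.log(groups(negatedClass, url)); // last group: "db"
```

Both fixed patterns also match when the query string is missing entirely, for example mysql://user:pw@host/db, because the (?:\?.*)? group is optional as a whole.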