Search code examples
regexgroovycapture-group

What is the purpose of non-capture groups


I was reading the Groovy tutorial and they talk about how you can create non-capturing groups by leading the group off with ?:. This way the group will not come up on the matcher.

What I don't understand is why you would want to explicitly say do not match this group. Wouldn't it be simpler to just not put it into a group?


Solution

  • ?: is used to group but when you do not want to capture them. This is useful for brevity of code and sometimes out rightly necessary. This helps in not storing something that we don't need subsequently after matching thus saving space.

    They are also used mostly in conjunction with | operator.

    The alternation operator has the lowest precedence of all regex operators. That is, it tells the regex engine to match either everything to the left of the vertical bar, or everything to the right of the vertical bar. If you want to limit the reach of the alternation, you need to use parentheses for grouping. (http://www.regular-expressions.info/alternation.html).

    In this case, you cannot just leave them without putting them in a group. You will need the alternation operator in many usual regexes such as email, url etc. Hope that helps.

    /(?:http|ftp):\/\/([^\/\r\n]+)(\/[^\r\n]*)?/g is a sample URL regex in JavaScript which needs the alternation operator and needs grouping. Without grouping the match would be just http for all http urls.