
Non-capturing group is matched by str_extract


I am trying to use non-capturing groups with the str_extract function from the stringr package. Here is an example:

library(stringr)
txt <- "foo"
str_extract(txt,"(?:f)(o+)")

This returns

"foo"

while I expect it to return only

"oo"

like in this post: https://stackoverflow.com/a/14244553/3750030

How do I use non-capturing groups in R so that the group's content is used for matching but left out of the returned value?


Solution

  • The regex (?:f)(o+) does not capture the f, but it still matches it, so the f is part of the overall match that str_extract returns.

    Capturing means storing the matched text in memory for back-referencing, so that it can be reused later in the same pattern or in a replacement string.

    like in this post: https://stackoverflow.com/a/14244553/3750030

    You misunderstood that answer. A non-capturing group is not a non-matching group. The parenthesized group that follows is captured in $1 (group 1) because no capturing group precedes it.
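The difference between the overall match and a captured group can be sketched as follows. This is a minimal illustration, assuming stringr is installed; the variable names (m, full_match, group_1, replaced) are made up for the example and are not from the original post.

```r
library(stringr)

txt <- "foo"

# str_match returns a character matrix: column 1 holds the overall match,
# the remaining columns hold the capture groups in order.
m <- str_match(txt, "(?:f)(o+)")

full_match <- m[1, 1]  # the f is matched, so it appears in the overall match
group_1    <- m[1, 2]  # (?:f) is not numbered, so (o+) is group 1

# A captured group can be back-referenced in a replacement string.
replaced <- str_replace(txt, "(?:f)(o+)", "<\\1>")

print(full_match)
print(group_1)
print(replaced)
```

str_extract only ever returns what sits in column 1 here, which is why the f shows up in its result.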

    If you want to match only B when it is preceded by A, without including A in the match, use a positive lookbehind like this.

    Regex: (?<=f)(o+)

    Explanation:

    • (?<=f) asserts that f appears immediately before the following token, but does not include it in the match.

    • (o+) matches one or more o characters and captures them as a group (here $1), provided the lookbehind condition holds.
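Applied to the example from the question, the lookbehind approach might look like the sketch below. It assumes stringr is installed; the base R variant and the variable names (with_stringr, with_base) are additions for illustration, not part of the original answer.

```r
library(stringr)

txt <- "foo"

# The lookbehind requires an f before the match but does not consume it,
# so only the o characters are returned.
with_stringr <- str_extract(txt, "(?<=f)(o+)")

# Base R equivalent; perl = TRUE is needed for lookbehind support.
with_base <- regmatches(txt, regexpr("(?<=f)(o+)", txt, perl = TRUE))

print(with_stringr)
print(with_base)
```

If the capture group is all that is needed, taking the second column of str_match(txt, "(?:f)(o+)") is an alternative that keeps the original pattern unchanged.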
