regex pcre

Regex global capture of groups preceded by keyword


There are lots of examples of capturing a word if it is NOT preceded by some keyword. I am trying to capture all the groups of digits within parentheses, but only if they are preceded by the words "not allocated":

Job is not allocated to your organization (83) vs (1098), please contact support

This is the subject line. I want to capture both '83' and '1098', but only if the words "not allocated" appear before any of the capture groups.

I wanted to use a lookbehind, but the ? quantifier can't be used between the lookbehind and the capture group:

(?<=not allocated)?\((\d+)\)

Any assistance is greatly appreciated. The expression needs to be PCRE (PHP) compatible.


Solution

  • At the beginning of the pattern, either match not allocated or match the end of the previous match with \G. You can also make the full match be the digit substring itself by using \K and a lookahead for ), which makes the capturing group unnecessary:

    (?:(?!^)\G|not allocated).*?\(\K\d+(?=\))
    

    https://regex101.com/r/DboPrU/2

    The negative lookahead for ^ is needed to ensure that \G doesn't also match at the beginning of the string: on the first attempt there is no previous match, so \G is true at the start of the subject.
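
To see the \G chaining at work outside a regex tester, here is a minimal sketch in Java, chosen only because java.util.regex also implements \G. It is not the PHP code the question asks for; in PHP the pattern above would normally be passed to preg_match_all unchanged. Java does not support \K, so this port assumes a capturing group in place of \K and the lookahead. The class name NotAllocatedIds and the helper extractIds are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NotAllocatedIds {
    // Same idea as the PCRE pattern, but with a capturing group instead of
    // \K plus lookahead, because java.util.regex does not support \K.
    private static final Pattern IDS =
            Pattern.compile("(?:(?!^)\\G|not allocated).*?\\((\\d+)\\)");

    // Hypothetical helper: collects every parenthesized digit run that is
    // reachable from a preceding "not allocated"; empty list if there is none.
    static List<String> extractIds(String subject) {
        List<String> ids = new ArrayList<>();
        Matcher m = IDS.matcher(subject);
        // Each find() either starts a new chain at "not allocated" or
        // continues from the end of the previous match via \G.
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        String subject = "Job is not allocated to your organization (83) vs (1098), please contact support";
        System.out.println(extractIds(subject)); // prints [83, 1098]
        // Without the keyword, (?!^) stops \G from anchoring at position 0.
        System.out.println(extractIds("Job (83) vs (1098) is fine")); // prints []
    }
}
```

The first match has to begin at "not allocated"; every later match can only begin where the previous one ended, which is why the digits are never picked up when the keyword is missing.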