regex pcre

regex negative lookahead in capture group


I'm trying to capture a string of upper-case characters, but I want to leave out any upper-case letters that are followed by a lower-case letter.

For example, for the string ABC ABC ABC ABc it should capture only ABC ABC ABC, because there's a lower-case 'c' after the last AB.

I tried ([A-Z ]+), which captures ABC ABC ABC AB.

How do I add a negative lookahead in this context?

https://regex101.com/r/j8Arzu/12


Solution

  • Try putting [A-Z]\b at the end to ensure that the match ends on a capital letter followed by a word boundary. That stops the match from ending in the middle of a word such as ABc, and it keeps trailing spaces out of the match as well. Note that there's no need for a capturing group here; you can leave it out entirely.

    [A-Z ]+[A-Z]\b
    

    https://regex101.com/r/j8Arzu/13

    If the match could otherwise start with a space, use the same technique at the beginning of the pattern and lead with \b[A-Z]:

    \b[A-Z][A-Z ]*[A-Z]\b
    

    If a single capital letter on its own should also match, put the second and third character sets, [A-Z ]*[A-Z], into an optional group:

    \b[A-Z](?:[A-Z ]*[A-Z])?\b
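
To compare the three patterns above side by side, here is a minimal sketch using Python's re module. That choice is an assumption made for illustration: the question is tagged pcre, but nothing in these patterns is engine-specific. The variable names and the first_match helper are hypothetical, and the inputs other than ABC ABC ABC ABc are made up for the demonstration.

```python
import re

text = "ABC ABC ABC ABc"

# Attempt from the question: runs on until the first character
# that is neither a capital letter nor a space.
attempt = re.compile(r"[A-Z ]+")
# First suggestion: the match must end on a capital letter at a word boundary.
ends_on_boundary = re.compile(r"[A-Z ]+[A-Z]\b")
# Second suggestion: also start on a capital letter at a word boundary.
both_ends = re.compile(r"\b[A-Z][A-Z ]*[A-Z]\b")
# Third suggestion: the tail is optional, so a lone capital letter matches too.
single_ok = re.compile(r"\b[A-Z](?:[A-Z ]*[A-Z])?\b")


def first_match(pattern, s):
    """Return the text of the first match of pattern in s, or None."""
    m = pattern.search(s)
    return m.group(0) if m else None


print(repr(first_match(attempt, text)))               # 'ABC ABC ABC AB'
print(repr(first_match(ends_on_boundary, text)))      # 'ABC ABC ABC'
print(repr(first_match(ends_on_boundary, " ABC DEF")))  # ' ABC DEF'
print(repr(first_match(both_ends, " ABC DEF")))       # 'ABC DEF'
print(repr(first_match(both_ends, "A bc")))           # None
print(repr(first_match(single_ok, "A bc")))           # 'A'
```

The word boundary does the work the question expected from a negative lookahead: once the engine backtracks off the trailing AB, the first position where a capital letter is followed by a boundary is the C before the last space.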