PCRE Regex: Exclude last portion of word


I am trying to write a PCRE regular expression that captures the first part of a word and excludes the second part. The first part needs to accommodate different values depending on where the transaction is initiated. Here is an example:

Raw Text:

.controller.CustomerDemographicsController

Regex Pattern Attempted:

\.controller\.(?P<Controller>\w+)

Result I am trying to achieve (only the CustomerDemographics portion should be saved in the named capture group):

.controller.CustomerDemographicsController

NOTE: I've attempted to exclude it using ^, lookbehind, and lookahead.

Any help is greatly appreciated.


Solution

  • You can match word chars in the Controller group up to the last uppercase letter:

    \.controller\.(?P<Controller>\w+)(?=\p{Lu})
    

    See the regex demo. Details:

    • \.controller\. - a literal .controller. string
    • (?P<Controller>\w+) - Named capturing group "Controller": one or more word chars as many as possible
    • (?=\p{Lu}) - the next char must be an uppercase letter.

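    Note that (?=\p{Lu}) makes the \w+ stop before the last uppercase letter because \w+ is greedy: it first consumes the whole word, then backtracks one character at a time until the lookahead succeeds, which first happens right before the last uppercase letter.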
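
    To try the idea outside a PCRE engine, here is a minimal Python sketch. Two things in it are assumptions rather than part of the answer above: Python's built-in re module does not support \p{Lu}, so [A-Z] stands in for it (fine only for ASCII class names), and controller_prefix is a hypothetical helper name used for illustration.

```python
import re

# Same pattern as above, except [A-Z] replaces \p{Lu} because the
# standard-library re module has no Unicode property classes (assumption:
# the class names contain only ASCII letters).
PATTERN = re.compile(r"\.controller\.(?P<Controller>\w+)(?=[A-Z])")


def controller_prefix(text):
    """Return the text captured by the Controller group, or None if no match.

    controller_prefix is a hypothetical helper, not part of any library.
    """
    match = PATTERN.search(text)
    return match.group("Controller") if match else None


# Greedy \w+ grabs the whole word, then backtracks to the last uppercase letter.
print(controller_prefix(".controller.CustomerDemographicsController"))
# prints: CustomerDemographics
```

    If the word after .controller. contains no uppercase letter past its first character, the lookahead can never succeed and the sketch returns None instead of a partial match.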