
Regex capture part of text or all text depending on conditions


Hi guys, I am working on a regex to extract the FirstName and the first letter of the LastName from an account field, but only if the account is a human account. If it is a service account, I want to capture the full text in the same capturing group where I'd otherwise capture the FirstName.

Sounds simple enough, but I am working with a naming convention that is not as well defined as I'd like, which makes things hard. Here is the regex I have put together so far:

^((-?)(admin|top)-|ADMIN: )?(?<FirstName>[^\n]+?)(?(?= \w* ) (?<MiddleName>\w*)|(?= ?))(?(?= ) (?<LastName>[A-Z])|$)

Below are some examples of account names. I've highlighted with bold the parts that need to be captured by the regex.

  • -admin-JohnS (here I want to grab JohnS in the FirstName capturing group)
  • admin-JohnS
  • ADMIN: John Smith (here the S needs to be captured by the LastName capturing group)
  • -top-JohnS
  • top-JohnS
  • John Smith
  • John Peter Smith
  • -service-something
  • service-something
  • -svc-something
  • svc-something
  • svc.something
  • SERVICE: Something
  • someServiceAccount

The regex works for almost all cases except "SERVICE: Something", where "SERVICE:" is captured by the FirstName group and "S" is captured by the LastName group. What can I do to fix this and capture all the text? I have tried a few things with negative lookaheads, but they didn't get me anywhere.

Thanks!


Solution

  • You might use:

    ^(?:-?(?i:admin(?:: |-)|top-))?(?<FirstName>[^\s:]+(?:: .*)?)(?: (?<MiddleName>\w+(?= \w)))?(?: (?<LastName>[A-Z]))?
    

    Explanation

    • ^ Start of string
    • (?: Non capture group
      • -?(?i:admin(?:: |-)|top-) Match an optional - followed by admin: (with a trailing space), admin- or top-, case insensitive
    • )? Close the non capture group and make it optional
    • (?<FirstName>[^\s:]+(?:: .*)?) Match the first name: one or more chars that are neither whitespace nor :, then optionally a colon, a space and the rest of the line (this is what keeps "SERVICE: Something" in one group)
    • (?: (?<MiddleName>\w+(?= \w)))? Optionally match a space and the middle name, but only if it is followed by a space and a word char
    • (?: (?<LastName>[A-Z]))? Optionally match a space and the first char of the last name
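
To check the pattern against the account names from the question, here is a minimal sketch in Python. The question does not say which regex engine is in use, so Python is an assumption here. Python's re module spells named groups as (?P<name>...) rather than (?<name>...), so the pattern is adapted accordingly, and the scoped (?i:...) flag needs Python 3.6 or later. The helper name parse_account is hypothetical.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
PATTERN = re.compile(
    r"^(?:-?(?i:admin(?:: |-)|top-))?"     # optional admin/top prefix
    r"(?P<FirstName>[^\s:]+(?:: .*)?)"     # first name, or the whole "XXX: rest" text
    r"(?: (?P<MiddleName>\w+(?= \w)))?"    # middle name only if another word follows
    r"(?: (?P<LastName>[A-Z]))?"           # first letter of the last name
)


def parse_account(name):
    """Return a dict of the named groups, or None if nothing matches."""
    m = PATTERN.match(name)
    return m.groupdict() if m else None


if __name__ == "__main__":
    samples = [
        "-admin-JohnS", "ADMIN: John Smith", "top-JohnS",
        "John Smith", "John Peter Smith",
        "-service-something", "svc.something",
        "SERVICE: Something", "someServiceAccount",
    ]
    for s in samples:
        print(f"{s!r:25} -> {parse_account(s)}")
```

With this adaptation, "SERVICE: Something" ends up entirely in FirstName, while "ADMIN: John Smith" yields John in FirstName and S in LastName. In engines that accept (?<name>...) and (?i:...) directly, such as PCRE or .NET, the pattern should work as written in the answer.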
