javascript, regex, regex-lookarounds, regex-group

How to exclude matched group after a non-matched group - regex


I wrote a regex that splits a string into three named capture groups:

^(?<serialCode>[a-zA-Z0-9]{0,3})(?<serialMarket>[a-zA-Z]{0,2})(?<serialSuffix>[a-zA-Z0-9]*)$

Basically it says:

  • 1st group should be 3 characters long and contain only alphanumeric characters
  • 2nd group should be 2 characters long and contain only letters
  • last group can be any length and contain alphanumeric characters

This translates to:

Match 1

Full match 0-8 abcfobar

Group serialCode 0-3 abc

Group serialMarket 3-5 fo

Group serialSuffix 5-8 bar


The above is the expected result.

For a string like abc33bar, the second group should not match, because the 4th and 5th characters are digits instead of letters; this is correct. The issue is that the characters that were meant for the second group fall through to the next group (serialSuffix), which results in:

Match 1

Full match 0-8 abc33bar

Group serialCode 0-3 abc

Group serialMarket 3-3

Group serialSuffix 3-8 33bar
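Both cases can be reproduced in JavaScript with a short sketch. The pattern is the one from the question; the helper name `parseSerial` is hypothetical and used only for illustration.

```javascript
// Pattern from the question, unchanged.
const serialPattern =
  /^(?<serialCode>[a-zA-Z0-9]{0,3})(?<serialMarket>[a-zA-Z]{0,2})(?<serialSuffix>[a-zA-Z0-9]*)$/;

// Hypothetical helper: returns the named groups as a plain object,
// or null when the pattern does not match at all.
function parseSerial(input) {
  const match = serialPattern.exec(input);
  return match ? { ...match.groups } : null;
}

console.log(parseSerial("abcfobar"));
// serialCode: "abc", serialMarket: "fo", serialSuffix: "bar"

console.log(parseSerial("abc33bar"));
// serialCode: "abc", serialMarket: "", serialSuffix: "33bar"
```

Because `{0,2}` allows zero repetitions, serialMarket succeeds with an empty match at position 3 instead of failing, and serialSuffix then consumes the rest of the string.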


How do I prevent any groups from being captured once a group fails to match, including the failed group itself and every group after it?


Solution

  • You may try this regex, which uses a lookbehind in the last optional capture group:

    ^(?<serialCode>[a-zA-Z0-9]{3})(?:(?<serialMarket>[a-zA-Z]{1,2})(?<serialSuffix>(?<=^.{5})[a-zA-Z0-9]*)?)?
    


    RegEx Details:

    • ^: Start of the string
    • (?<serialCode>[a-zA-Z0-9]{3}): Match and capture 3 alphanumerics in serialCode capture group
    • (?:: Start non-capture group
      • (?<serialMarket>[a-zA-Z]{1,2}): Match and capture 1 or 2 letters in serialMarket capture group
      • (?<serialSuffix>(?<=^.{5})[a-zA-Z0-9]*)?: Match and capture 0 or more alphanumerics in the optional serialSuffix capture group. The lookbehind assertion (?<=^.{5}) lets this group start only when exactly 5 characters precede it, that is, only after a 3-character serialCode and a full 2-letter serialMarket
    • )?: End non-capture group (optional)
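As a rough check of the pattern above in JavaScript (named groups and lookbehind need an ES2018-capable engine), the sketch below runs it against a few inputs. The helper name `parseSerialStrict` and the input "abcf" are illustrative assumptions, not part of the original answer.

```javascript
// Pattern from the answer, unchanged. Note that it has no trailing `$`,
// so it can match just a prefix of the input.
const strictSerialPattern =
  /^(?<serialCode>[a-zA-Z0-9]{3})(?:(?<serialMarket>[a-zA-Z]{1,2})(?<serialSuffix>(?<=^.{5})[a-zA-Z0-9]*)?)?/;

// Hypothetical helper: returns the matched text plus the named groups,
// or null when not even serialCode can be matched.
function parseSerialStrict(input) {
  const match = strictSerialPattern.exec(input);
  return match ? { matched: match[0], ...match.groups } : null;
}

console.log(parseSerialStrict("abcfobar"));
// matched: "abcfobar", serialCode: "abc", serialMarket: "fo", serialSuffix: "bar"

console.log(parseSerialStrict("abc33bar"));
// matched: "abc", serialCode: "abc", serialMarket: undefined, serialSuffix: undefined

console.log(parseSerialStrict("abcf"));
// matched: "abcf", serialCode: "abc", serialMarket: "f", serialSuffix: undefined
```

Groups that did not participate in the match come back as `undefined` rather than as empty strings, so a caller can tell "not matched" apart from "matched nothing". If the whole input must be valid, one option is to compare `matched` against the input, since the pattern itself is not anchored at the end.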