Tags: regex, match, conditional-statements, regex-group, named

Regex conditional group name


I can't seem to find any information about this, so I'm unsure if this is possible or not, but here goes:

Is there a way to have multiple options for the name of a matching group? I extract parameters from a code string and use the regex group names to refer to them afterwards. However, my codes come in several very similar formats, and the order of the parameters changes. Hence my question: can a group have a different name if another group doesn't match?

Example: (?'type'A|B|C)-(?'length_or_diameter'\d+)(?:x(?'length'\d+))?

Code formats: (type)-(length) OR (type)-(diameter)x(length)


There are ways for me to work around this in code, but I think it would be much more elegant to handle it in the regex itself. So, is there a way for group 2 (length_or_diameter) to be named "length" if group 3 has no match, or "diameter" if group 3 does match, rather than being named length_or_diameter and requiring more logic in code?


Solution

  • You can only use one name for a named capturing group and cannot change it dynamically after the pattern is created.

    You may use identically named groups if your regex engine supports them, e.g. Onigmo in Ruby, the .NET regex library, or PCRE with the J option on:

    (?'type'A|B|C)-(?:(?'diameter'\d+)x(?'length'\d+)|(?'length'\d+))
    

    See the regex101 PCRE demo. Here is a variation with a branch reset group, (?|...|...):

    (?'type'A|B|C)-(?|(?'diameter'\d+)x(?'length'\d+)|()(?'length'\d+))
    

    See the regex demo (won't work in .NET though).
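Neither of the two patterns above runs in Python's standard `re` module, which rejects duplicate group names and has no branch reset group. As a rough sketch of the same alternation idea for such an engine, the groups can be given distinct names and merged in code. The names `length_with_diameter`, `length_only` and `parse_code` are made up for this illustration and are not part of the original answer.

```python
import re

# Named groups use Python's (?P<name>...) syntax instead of (?'name'...).
# The two length groups need distinct (hypothetical) names because the
# standard re module does not allow a group name to be defined twice.
PATTERN = re.compile(
    r"(?P<type>A|B|C)-"
    r"(?:(?P<diameter>\d+)x(?P<length_with_diameter>\d+)"
    r"|(?P<length_only>\d+))"
)


def parse_code(code):
    """Return a dict with type, diameter and length, or None if there is no match."""
    m = PATTERN.fullmatch(code)
    if m is None:
        return None
    # Exactly one of the two length groups participates in a match.
    length = m.group("length_with_diameter") or m.group("length_only")
    return {
        "type": m.group("type"),
        "diameter": m.group("diameter"),  # None for the (type)-(length) format
        "length": length,
    }


print(parse_code("A-10"))
print(parse_code("B-5x20"))
```

This still needs one line of merging logic in code, which is what the question hoped to avoid, so it is only a fallback for engines without duplicate group names.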

    Another workaround is to use optional groups:

    (?'type'A|B|C)-(?:(?'diameter'\d+)x)?(?'length'\d+)?
    

    See another regex demo. This one matches:

    • (?'type'A|B|C) - A, B or C in Group "type"
    • - - a literal - char
    • (?:(?'diameter'\d+)x)? - an optional non-capturing group matching
      • (?'diameter'\d+) - 1 or more digits in Group "diameter"
      • x - an x char
    • (?'length'\d+)? - an optional capturing group "length", 1+ digits.
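
The breakdown above can be tried out with a short script. This is a minimal sketch assuming Python's standard `re` module, which writes named groups as `(?P<name>...)` rather than `(?'name'...)`. The helper name `named_groups` is hypothetical.

```python
import re

# Same structure as the optional-group pattern above, in Python's syntax.
OPTIONAL_GROUPS = re.compile(
    r"(?P<type>A|B|C)-(?:(?P<diameter>\d+)x)?(?P<length>\d+)?"
)


def named_groups(code):
    """Return the named groups for a full match of code, or None if there is no match."""
    m = OPTIONAL_GROUPS.fullmatch(code)
    # Groups that did not participate in the match come back as None.
    return m.groupdict() if m else None


for sample in ("A-10", "C-5x20", "A-"):
    print(sample, named_groups(sample))
```

Because the "length" group is itself optional, an input such as `A-` also matches, with both "diameter" and "length" unset. If a length is always required, the caller may want to check for that, or drop the trailing `?` from the pattern.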