regex, capture, freetext

Optional regexp group in freetext


I have two kinds of text messages. Both start with a uniform message code, but there may be a second match enclosed in ' characters that I need to extract if it is present.

M0123 Example 'extratext' with two expected matches.

M0321 Example without two matches

This matches #1 and captures both groups but does not match #2:

^(?<code>M\d+).*(?<extra>'.*').*

This matches #1 and #2, but the extra group is never captured:

^(?<code>M\d+).*(?<extra>'.*')?.*

Solution

  • A negated character class should help you out here, like

    ^(?<code>M\d+)[^']*(?:(?<extra>'.*').*)?
    

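    Transforming the first .* into [^']* will make it match up to the first quote for your first sample and the whole string for your second. In your second attempt, the greedy .* consumes the rest of the string first, and because the extra group is optional, the engine accepts the match with that group left empty instead of backtracking to find the quote.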

    Notes:

    • in a multiline context, you might want to use [^'\r\n]* instead so the match cannot run across line breaks
    • you can also use (?<extra>'[^']*') if quotes always come in pairs, which stops the capture at the first closing quote
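
As a quick check, here is a minimal Python sketch that runs the pattern above against the two sample messages. Python is an assumption, since the question does not name a regex flavour. Python's re module spells named groups as (?P<name>...), so the pattern is adapted accordingly. The helper name extract and the combined multiline variant are illustrative, not part of the original answer.

```python
import re

# Python's re module writes named groups as (?P<name>...) instead of (?<name>...).
PATTERN = re.compile(r"^(?P<code>M\d+)[^']*(?:(?P<extra>'.*').*)?")

# Illustrative variant combining both notes: never crosses a line break
# and stops the capture at the first closing quote.
MULTILINE_PATTERN = re.compile(
    r"^(?P<code>M\d+)[^'\r\n]*(?:(?P<extra>'[^'\r\n]*').*)?",
    re.MULTILINE,
)


def extract(message):
    """Return (code, extra) for one message; extra is None if nothing is quoted."""
    match = PATTERN.match(message)
    if match is None:
        return None
    return match.group("code"), match.group("extra")


samples = [
    "M0123 Example 'extratext' with two expected matches.",
    "M0321 Example without two matches",
]

for sample in samples:
    print(extract(sample))
# ('M0123', "'extratext'")
# ('M0321', None)

# With several messages in one string, the multiline variant yields one match per line.
text = "\n".join(samples)
pairs = [(m.group("code"), m.group("extra")) for m in MULTILINE_PATTERN.finditer(text)]
print(pairs)
# [('M0123', "'extratext'"), ('M0321', None)]
```

Note that the captured extra group includes the surrounding quotes. If only the inner text is wanted, the quotes could be moved outside the group, for example '(?P<extra>[^']*)'.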