c++ regex pattern-matching string-literals

Extracting ONLY specific parts of a regex 'expression'


I have a list of expressions that I would like to validate and then extract specific parts from.
These expressions are allowed to have any combination of:

  • String literals, delimited by single quotes, which may (but need not)
    contain escaped quotes such as \'
  • Any number of characters which are NOT part of a string literal and
    are NOT the end-of-expression character, which is a semicolon

A valid expression starts after a colon and ends with a semicolon.
An example of a valid expression would be:

: This is an *expression* 'with' and 'without \'escaped\' string literals', 
which ends with a semicolon!;

And out of that expression, I would like to extract:

  • This is an *expression*
  • 'with'
  • and
  • 'without \'escaped\' string literals'
  • , which ends with a semicolon!

Is this possible?


Solution

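  • After speaking with RectangleEquals, the answer is the pattern '(?:\\.|[^'])*'|[^']+ (a quoted literal that may contain backslash escapes, or a run of characters containing no quotes). In an ordinary C++ string literal, "\\." reaches the regex engine as \. and matches only a literal dot, so the pattern is safer written as a raw string literal: std::regex re_(R"('(?:\\.|[^'])*'|[^']+)");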
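
To show how that pattern can be used, here is a minimal sketch that walks over every match with std::sregex_iterator. The helper name split_expression is hypothetical and not part of the original answer. It assumes the leading colon and trailing semicolon have already been stripped from the input. This variant also excludes the backslash from the character class ([^'\\] instead of [^']), so that an escaped quote can only be consumed as an escape; that tightening is an assumption of the sketch, not something the answer states.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits the body of an expression (the text between the
// colon and the semicolon) into quoted literals and the text between them.
std::vector<std::string> split_expression(const std::string& body) {
    // Raw string literal, so the regex engine sees \\. (backslash + any char).
    // First alternative: a single-quoted literal, possibly with escapes.
    // Second alternative: a run of characters containing no single quote.
    static const std::regex token(R"('(?:\\.|[^'\\])*'|[^']+)");

    std::vector<std::string> parts;
    for (std::sregex_iterator it(body.begin(), body.end(), token), end;
         it != end; ++it) {
        parts.push_back(it->str());  // whole match, whitespace left untouched
    }
    return parts;
}
```

For the example in the question, the text between the colon and the semicolon should come back as five pieces in the order listed there. The unquoted pieces keep their surrounding whitespace (and the line break), so trimming would be a separate step.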