c++ regex string replace semantics

Replacing the contents of an expression in C++ [std::regex]


Let's assume we have the string ONE|(TWO|(THREE|FOUR))...
Knowing that std::regex does not support recursion, how can we break this string down into a std::vector of strings that contains, in order:

  • THREE|FOUR
  • TWO|{0}
  • ONE|{1}

The purpose of transforming this in the preceding manner is to create a traversable expression list, which should semantically represent a nested if/then statement. How can this be achieved?


Solution

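  • Since you want the innermost content first, you can either use lazy (non-greedy) matching to match everything up to the first ) with (.*?)\), or match a run of characters that contains no round brackets with \([^\)\(]+. The second form is the more reliable one for nesting: a lazy match still starts at the leftmost possible position, whereas a bracket-free match can only be an innermost group.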

    Pseudocode:

    while ( regex_search(string, regex) ) {   // regex_match would require the whole string to match
        add the match to the vector
        replace the match in the string with its vector index in curly brackets
    }
    

    Example RegEx: ((?:\(|^)[^\)\(]+(?:\)|$))
    RegEx demo here: http://regex101.com/r/pJ4pO7
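
    The pseudocode above could be sketched in C++ roughly as follows. This is a minimal sketch rather than a definitive implementation: the function name flatten_expression is hypothetical, and it swaps in the simpler pattern \(([^()]+)\) in place of the example RegEx. The reason is that the ^ and $ alternatives in the example RegEx also match the bracket-free remainder, so a plain loop would keep matching the final placeholder. The sketch therefore loops only over bracketed groups and appends whatever is left as the last element. It assumes balanced, non-empty brackets.

    ```cpp
    #include <cstddef>
    #include <regex>
    #include <string>
    #include <vector>

    // Hypothetical helper (not from the original answer): repeatedly pulls out
    // the leftmost innermost "(...)" group, stores its contents, and replaces
    // the group in the string with "{index}".
    std::vector<std::string> flatten_expression(std::string expr) {
        // "(" + one or more non-bracket characters + ")"; group 1 is the contents.
        static const std::regex innermost(R"re(\(([^()]+)\))re");

        std::vector<std::string> parts;
        std::smatch m;
        while (std::regex_search(expr, m, innermost)) {
            // Copy everything we need out of the match before modifying expr,
            // because the match refers to the string's current contents.
            const std::string contents = m[1].str();
            const auto pos = static_cast<std::size_t>(m.position(0));
            const auto len = static_cast<std::size_t>(m.length(0));
            const std::string placeholder = "{" + std::to_string(parts.size()) + "}";

            parts.push_back(contents);
            expr.replace(pos, len, placeholder);
        }

        // No bracketed groups remain; the remainder is the outermost expression.
        parts.push_back(expr);
        return parts;
    }
    ```

    For the string from the question, ONE|(TWO|(THREE|FOUR)), this returns THREE|FOUR, TWO|{0} and ONE|{1}, in that order. Unbalanced or empty brackets are never matched, so they are left untouched in the last element.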