Boost.Regex: how to translate this PHP unescape function to C++?


When I had to create a CMS in PHP, I wrote a simple function to unescape HTML that looked like this:

function unescape($s) {
    $s= preg_replace('/%u(....)/', '&#x$1;', $s);
    $s= preg_replace('/%(..)/', '&#x$1;', $s);
    return $s;
}

How can I translate it into C++ using Boost.Regex?


Solution

  • I'd guess it would look a bit like this:

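    #include <boost/regex.hpp>
    #include <string>

    std::string unescape(const std::string& s)
    {
      // regex_replace needs a boost::regex object, not a string literal
      static const boost::regex wide("%u(....)"), narrow("%(..)");
      std::string temp = boost::regex_replace(s, wide, "&#x$1;", boost::match_default);
      temp = boost::regex_replace(temp, narrow, "&#x$1;", boost::match_default);
      return temp;
    }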
    

    But I assume the . (dot) should only match hexadecimal digits, in which case I'd go for something like this instead:

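    std::string unescape(const std::string& s)
    {
      // Group 1: four hex digits after "%u"; group 2: two hex digits after "%".
      // Whichever group did not take part in the match expands to nothing.
      static const boost::regex escape("%(?:u([0-9a-fA-F]{4})|([0-9a-fA-F]{2}))");
      return boost::regex_replace(s, escape, "&#x$1$2;", boost::match_default);
    }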
    

    (note that I did not test this!)
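
For a quick check without a Boost dependency, here is a minimal sketch of the same idea using std::regex from the C++11 standard library instead of Boost.Regex (a swapped-in library, not what the question asked for). It assumes the escapes are always %uXXXX or %XX with hexadecimal digits only; the pattern name `escape` is just an illustrative choice.

```cpp
#include <regex>
#include <string>

// Sketch using std::regex (C++11) rather than Boost.Regex.
// Turns %uXXXX and %XX escapes into HTML numeric character references.
std::string unescape(const std::string& s)
{
    // Group 1 captures four hex digits after "%u"; group 2 captures two
    // hex digits after a bare "%". Only one of them participates per match.
    static const std::regex escape("%(?:u([0-9a-fA-F]{4})|([0-9a-fA-F]{2}))");

    // In the default (ECMAScript) format rules, a group that did not
    // participate in the match expands to an empty string.
    return std::regex_replace(s, escape, "&#x$1$2;");
}
```

With this sketch, `unescape("%u0041%20B")` should give `&#x0041;&#x20;B`, and a stray % that is not followed by hex digits is left untouched.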