c++ boost boost-regex

boost::regex_constants::error_type to string


I have a function that checks whether an expression matches a regex and returns a boost::regex_constants::error_type. If an exception is thrown, it logs the error:

boost::regex_constants::error_type RegexMatch(const std::string& p_expression, const std::string& p_pattern)
{
    boost::regex_constants::error_type returnValue = boost::regex_constants::error_unknown;

    try
    {
        if (boost::regex_match(p_expression, boost::regex(p_pattern)))
        {
            returnValue = boost::regex_constants::error_ok;
        }
        else
        {
            returnValue = boost::regex_constants::error_no_match;
        }
    }
    catch (const boost::regex_error& e)
    {
        returnValue = e.code();
        LOG_ERROR("Error checking if [%s] expression matches pattern [%s]: boost::regex_error [%s]",
                  p_expression.c_str(),
                  p_pattern.c_str(),
                  e.what());
    }

    return returnValue;
}

On the client side, however, the caller only gets a boost::regex_constants::error_type as the result, and depending on the context, the client may want to display a human-readable error.

My question: is there a native Boost function that does this? I couldn't find one, so I wrote my own:

std::string BoostRegexErrorTypeToString(const boost::regex_constants::error_type p_boostRegexErrorType)
{
    return boost::regex_error(p_boostRegexErrorType).what();
}

Note that I return a std::string rather than the const char* returned by what(), because when returning a const char*, for some error types, for example error_ok, "et*)" is returned instead of "Success".

Finally, to test this code you can use the following loop:

for (int intErrorType = boost::regex_constants::error_ok; // error_ok is the first
     intErrorType <= boost::regex_constants::error_unknown; // error_unknown is the last
     ++intErrorType)
{
    const boost::regex_constants::error_type errorType = static_cast<boost::regex_constants::error_type>(intErrorType);
    LOG_DEBUG("Regex error [%d] text is [%s]",
              errorType,
              BoostRegexErrorTypeToString(errorType).c_str());
}

Thanks


Solution

  • First off, the reason for this:

    because when returning a const char*, for some error types, for example error_ok, "et*)" is returned instead of "Success".

    is that you were returning a dangling pointer. what() points to member data inside the temporary regex_error (a std::runtime_error), and that temporary is destroyed before your function returns. Returning a std::string works because the text is copied while the temporary is still alive.
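
The lifetime issue can be illustrated without Boost. The following is only a minimal sketch, assuming a hypothetical DemoError type that stands in for boost::regex_error (both derive from std::runtime_error). The names DemoError and MessageAsString are invented for this example and are not part of any library.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for boost::regex_error: like it, this type derives
// from std::runtime_error, so what() points into storage owned by the object.
struct DemoError : std::runtime_error
{
    explicit DemoError(const char* message) : std::runtime_error(message) {}
};

// UNSAFE variant, left commented out on purpose: the temporary DemoError is
// destroyed at the end of the return statement, so the caller would receive
// a pointer to storage that no longer exists (undefined behavior if read).
// const char* MessageAsPointer() { return DemoError("Success").what(); }

// Safe variant: the returned std::string is initialized from what() while
// the temporary is still alive, so the caller owns an independent copy.
std::string MessageAsString()
{
    return DemoError("Success").what();
}
```

Here MessageAsString() hands back a string the caller owns, which mirrors why the std::string version of BoostRegexErrorTypeToString behaves correctly. Reading through the pointer from the commented-out variant is undefined behavior, which is consistent with garbage such as "et*)" showing up.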


    The function Boost uses to obtain the error string is not part of its public API:

    regex_error::regex_error(regex_constants::error_type err) 
       : std::runtime_error(::boost::BOOST_REGEX_DETAIL_NS::get_default_error_string(err))
       , m_error_code(err)
       , m_position(0) 
    {
    }
    

    You can see that ::boost::BOOST_REGEX_DETAIL_NS::get_default_error_string(err) is in a detail namespace.

    If you want, you can read the source anyway:

    BOOST_REGEX_DECL const char* BOOST_REGEX_CALL get_default_error_string(regex_constants::error_type n)
    {
       static const char* const s_default_error_messages[] = {
          "Success",                                                            /* REG_NOERROR 0 error_ok */
          "No match",                                                           /* REG_NOMATCH 1 error_no_match */
          "Invalid regular expression.",                                        /* REG_BADPAT 2 error_bad_pattern */
          "Invalid collation character.",                                       /* REG_ECOLLATE 3 error_collate */
          "Invalid character class name, collating name, or character range.",  /* REG_ECTYPE 4 error_ctype */
          "Invalid or unterminated escape sequence.",                           /* REG_EESCAPE 5 error_escape */
          "Invalid back reference: specified capturing group does not exist.",  /* REG_ESUBREG 6 error_backref */
          "Unmatched [ or [^ in character class declaration.",                  /* REG_EBRACK 7 error_brack */
          "Unmatched marking parenthesis ( or \\(.",                            /* REG_EPAREN 8 error_paren */
          "Unmatched quantified repeat operator { or \\{.",                     /* REG_EBRACE 9 error_brace */
          "Invalid content of repeat range.",                                   /* REG_BADBR 10 error_badbrace */
          "Invalid range end in character class",                               /* REG_ERANGE 11 error_range */
          "Out of memory.",                                                     /* REG_ESPACE 12 error_space NOT USED */
          "Invalid preceding regular expression prior to repetition operator.", /* REG_BADRPT 13 error_badrepeat */
          "Premature end of regular expression",                                /* REG_EEND 14 error_end NOT USED */
          "Regular expression is too large.",                                   /* REG_ESIZE 15 error_size NOT USED */
          "Unmatched ) or \\)",                                                 /* REG_ERPAREN 16 error_right_paren NOT USED */
          "Empty regular expression.",                                          /* REG_EMPTY 17 error_empty */
          "The complexity of matching the regular expression exceeded predefined bounds.  "
          "Try refactoring the regular expression to make each choice made by the state machine unambiguous.  "
          "This exception is thrown to prevent \"eternal\" matches that take an "
          "indefinite period time to locate.",                                  /* REG_ECOMPLEXITY 18 error_complexity */
          "Ran out of stack space trying to match the regular expression.",     /* REG_ESTACK 19 error_stack */
          "Invalid or unterminated Perl (?...) sequence.",                      /* REG_E_PERL 20 error_perl */
          "Unknown error.",                                                     /* REG_E_UNKNOWN 21 error_unknown */
       };
    
       return (n > ::boost::regex_constants::error_unknown) ? s_default_error_messages[ ::boost::regex_constants::error_unknown] : s_default_error_messages[n];