How to (un)escape strings in C/C++?


Given a counted string (either an array of characters, or a wrapper like std::string), is there a "proper" way to escape and/or unescape it in C or C++, such that "special" characters (like the null character) become C-style-escaped and "normal" characters stay the way they are?

Or do I have to do it by hand?


Solution

  • This function converts a single character into its C source-code representation, written to buffer as a null-terminated string:

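    #include <ctype.h>      /* isprint */
    #include <stdio.h>      /* sprintf */
    #include <string.h>     /* strcpy, strlen */
    
    /*
    ** Does not generate hex character constants.
    ** Always generates triple-digit octal constants.
    ** Always generates simple escapes (\n, \t, ...) in preference to octal.
    ** Escapes the question mark so that repeated use cannot produce trigraphs.
    ** Handling of 0x80..0xFF is locale-dependent (might be octal, might be literal).
    ** Requires buflen >= 1; an empty string in buffer means there was not enough room.
    */
    
    void chr_cstrlit(unsigned char u, char *buffer, size_t buflen)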
    {
        if (buflen < 2)
            *buffer = '\0';
        else if (isprint(u) && u != '\'' && u != '\"' && u != '\\' && u != '\?')
            sprintf(buffer, "%c", u);
        else if (buflen < 3)
            *buffer = '\0';
        else
        {
            switch (u)
            {
            case '\a':  strcpy(buffer, "\\a"); break;
            case '\b':  strcpy(buffer, "\\b"); break;
            case '\f':  strcpy(buffer, "\\f"); break;
            case '\n':  strcpy(buffer, "\\n"); break;
            case '\r':  strcpy(buffer, "\\r"); break;
            case '\t':  strcpy(buffer, "\\t"); break;
            case '\v':  strcpy(buffer, "\\v"); break;
            case '\\':  strcpy(buffer, "\\\\"); break;
            case '\'':  strcpy(buffer, "\\'"); break;
            case '\"':  strcpy(buffer, "\\\""); break;
            case '\?':  strcpy(buffer, "\\\?"); break;
            default:
                if (buflen < 5)
                    *buffer = '\0';
                else
                    sprintf(buffer, "\\%03o", u);
                break;
            }
        }
    }
    

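    And this is the code to handle a null-terminated string (using the function above). Because it stops at the first null byte, it does not cover counted strings with embedded nulls; if the output buffer is too small, the result is silently truncated but still null-terminated: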

    void str_cstrlit(const char *str, char *buffer, size_t buflen)
    {
        unsigned char u;
        size_t len;
    
        while ((u = (unsigned char)*str++) != '\0')
        {
            chr_cstrlit(u, buffer, buflen);
            if ((len = strlen(buffer)) == 0)
                return;     /* out of space: output stays truncated but terminated */
            buffer += len;
            buflen -= len;
        }
        *buffer = '\0';
    }
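
The question also asks about counted strings and about unescaping, neither of which the null-terminated helper above handles. A minimal sketch of both directions might look like the following, assuming 8-bit bytes and the same escape set as above. The names `buf_cstrlit`, `buf_uncstrlit`, and `CSTRLIT_ERROR` are hypothetical and not part of the original answer, and hex escapes (`\x..`) are deliberately not decoded.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define CSTRLIT_ERROR ((size_t)-1)   /* hypothetical error marker */

/* Parallel tables: esc_chars[i] is written as backslash + esc_letters[i]. */
static const char esc_chars[]   = "\a\b\f\n\r\t\v\\\'\"\?";
static const char esc_letters[] = "abfnrtv\\\'\"\?";

/* Escape exactly srclen bytes, so an embedded '\0' becomes \000.
** Returns the escaped length, or CSTRLIT_ERROR if buffer is too small. */
size_t buf_cstrlit(const char *src, size_t srclen, char *buffer, size_t buflen)
{
    size_t used = 0;

    if (buflen == 0)
        return CSTRLIT_ERROR;
    for (size_t i = 0; i < srclen; i++)
    {
        unsigned char u = (unsigned char)src[i];
        const char *p = (u != '\0') ? strchr(esc_chars, u) : NULL;
        char piece[5];      /* longest piece is "\ooo" plus terminator */
        size_t len;

        if (p != NULL)
            snprintf(piece, sizeof(piece), "\\%c", esc_letters[p - esc_chars]);
        else if (isprint(u))
            snprintf(piece, sizeof(piece), "%c", u);
        else
            snprintf(piece, sizeof(piece), "\\%03o", (unsigned)u);
        len = strlen(piece);
        if (len >= buflen - used)       /* keep one byte for the terminator */
        {
            buffer[used] = '\0';
            return CSTRLIT_ERROR;
        }
        memcpy(buffer + used, piece, len);
        used += len;
    }
    buffer[used] = '\0';
    return used;
}

/* Decode the escapes produced above into a counted buffer (not
** null-terminated, since the data may contain '\0').
** Returns the decoded length, or CSTRLIT_ERROR on bad input or overflow. */
size_t buf_uncstrlit(const char *src, char *out, size_t outlen)
{
    size_t n = 0;

    while (*src != '\0')
    {
        unsigned char c = (unsigned char)*src++;

        if (c == '\\')
        {
            const char *p = (*src != '\0') ? strchr(esc_letters, *src) : NULL;

            if (p != NULL)
            {
                c = (unsigned char)esc_chars[p - esc_letters];
                src++;
            }
            else if (*src >= '0' && *src <= '7')
            {
                unsigned v = 0;
                /* Up to three octal digits, as in a C literal. */
                for (int i = 0; i < 3 && *src >= '0' && *src <= '7'; i++)
                    v = v * 8 + (unsigned)(*src++ - '0');
                if (v > 0xFF)
                    return CSTRLIT_ERROR;
                c = (unsigned char)v;
            }
            else
                return CSTRLIT_ERROR;   /* unknown escape or trailing backslash */
        }
        if (n >= outlen)
            return CSTRLIT_ERROR;
        out[n++] = (char)c;
    }
    return n;
}
```

For a `std::string s`, the escaping side could be called as `buf_cstrlit(s.data(), s.size(), out, sizeof out)`. Since every input byte expands to at most four output characters, an output buffer of `4 * s.size() + 1` bytes is always large enough. Always emitting three octal digits is what keeps the round trip unambiguous when a non-printable byte is followed by a digit.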