
Replace lit with different string in boost spirit


I'm trying to parse a quoted string containing escape sequences using Boost.Spirit. I'm looking for a way to replace the escape sequence \" with the corresponding character (" in this case). So far I've come up with this:

c_string %= lit('"') >> *(lit("\\\"")[push_back(_val, '"')] | (char_ - '"')) >> lit('"')

with the replacement being done with

lit("\\\"")[push_back(_val, '"')]

However, this seems quite clumsy and unreadable to me. Is there a better way to accomplish this?


Solution

  • Iterating: you can replace "\\\"" with '\\' >> lit('"'), reformatting a bit:

    c_string
        %= lit('"')
        >> *(
               '\\' >> lit('"')[push_back(_val, '"')]
             | (char_ - '"')
        )
        >> lit('"')
        ;
    

    Now, you can do away with some of the lit() calls because they're implicit when invoking proto expressions in the Qi domain:

    c_string
        %= '"'
        >> *(
               '\\' >> lit('"')[push_back(_val, '"')]
             | (char_ - '"')
        )
        >> '"'
        ;
    

    Next up, lit(ch)[push_back(_val, ch)] is just a clumsy way to say char_(ch):

    c_string = '"'
        >> *( '\\' >> char_('"') | (char_ - '"') )
        >> '"';
    

    Note that now we don't have the kludge of %= either (see Boost Spirit: "Semantic actions are evil"?), and you can leave out the phoenix.hpp include(s).

    Finally, you can have a more optimized char_ - char_(xyz) by saying ~char_(xyz):

    c_string = '"' >> *('\\' >> char_('"') | ~char_('"')) >> '"';
    

    Now, you're not actually parsing C-style strings here: \" is the only escape you handle. So why not simplify and accept any escaped character:

    c_string = '"' >> *('\\' >> char_ | ~char_('"')) >> '"';
    

    Note that now you actually parse backslash escapes, which the previous rule did not: it only recognized \" and kept every other backslash as a literal character (so "a\\b" would come out as a\\b instead of a\b).
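
    To make the behaviour of that last rule concrete, here is a rough hand-written equivalent in plain C++, without Spirit. The function name unquote is made up for this sketch, and like a plain qi::parse call it does not require the whole input to be consumed, so anything after the closing quote is ignored:

    ```cpp
    #include <string>

    // Hypothetical helper (not part of Boost.Spirit). Mirrors the rule
    //     '"' >> *('\\' >> char_ | ~char_('"')) >> '"'
    // Returns true and fills `out` when `input` starts with a quoted string.
    bool unquote(const std::string& input, std::string& out) {
        out.clear();
        if (input.empty() || input[0] != '"') return false;  // opening quote

        std::size_t i = 1;
        while (i < input.size() && input[i] != '"') {
            if (input[i] == '\\' && i + 1 < input.size()) {
                out += input[i + 1];  // '\\' >> char_: keep whatever follows the backslash
                i += 2;
            } else {
                out += input[i];      // ~char_('"'): any other character except a quote
                ++i;
            }
        }
        return i < input.size();      // true only if we stopped on the closing quote
    }
    ```

    Calling unquote on the characters "a\\b" (with the quotes) yields a\b, and an unterminated input such as "abc is rejected.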

    If you want to be more precise, consider handling the individual escapes explicitly, as in e.g. "Handling utf-8 in Boost.Spirit with utf-32 parser".
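
    As a minimal sketch of what "more precise" could look like, the rule below maps a handful of named escapes through qi::symbols and rejects unknown ones. The function name parse_c_string and the chosen set of escapes are assumptions for illustration, not something taken from the question:

    ```cpp
    #include <boost/spirit/include/qi.hpp>
    #include <string>

    namespace qi = boost::spirit::qi;

    // Hypothetical wrapper: parses a complete quoted string and translates
    // a small, assumed set of escapes (\n \t \r \\ \").
    bool parse_c_string(const std::string& input, std::string& output) {
        using It = std::string::const_iterator;
        output.clear();

        // Maps the character after the backslash to the character it stands for.
        qi::symbols<char, char> escape;
        escape.add("n", '\n')("t", '\t')("r", '\r')("\\", '\\')("\"", '"');

        // A backslash must be followed by a known escape; everything else
        // except a quote or a backslash is taken literally.
        qi::rule<It, std::string()> c_string
            = '"' >> *('\\' >> escape | ~qi::char_("\"\\")) >> '"';

        It first = input.begin(), last = input.end();
        return qi::parse(first, last, c_string, output) && first == last;
    }
    ```

    Unlike the simplified rule, this one fails on an unknown escape such as \q instead of silently dropping the backslash. In real code you would probably build the rule once rather than on every call.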

    Live Demo

    Live On Coliru

    #include <boost/spirit/include/qi.hpp>
    #include <iostream>
    #include <string>
    namespace qi = boost::spirit::qi;
    
    int main() {
        const qi::rule<std::string::const_iterator, std::string()> c_string
            = '"' >> *('\\' >> qi::char_ | ~qi::char_('"')) >> '"';
    
        for (std::string const input: {
                R"("")"               , // ""
                R"("\"")"             , // "\\\""
                R"("Hello \"world\"")", // "Hello \\\"world\\\""
            })
        {
            std::string output;
            if (parse(input.begin(), input.end(), c_string, output)) {
                std::cout << input << " -> " << output << "\n";
            } else {
                std::cout << "Failed: " << input << "\n";
            }
        }
    }
    

    Prints

    "" -> 
    "\"" -> "
    "Hello \"world\"" -> Hello "world"