
How to capture character without consuming it in boost::spirit::qi


I'm using boost::spirit::qi to parse a "template" format that looks something like this:

/path/to/:somewhere:/nifty.json

where :somewhere: represents any string identified by the name somewhere (the name can be any series of characters between two : characters). I have a working parser for this, but I want to make one additional improvement.

I would like to know what character follows the :somewhere: placeholder (in this case a /). But the rest of my parser still needs to know about this / and consume it as part of the next section.

How can I "read" the / after :somewhere: without actually consuming it, so that the rest of the parser still sees and consumes it?


Solution

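  • As sehe mentioned, this can be done with the lookahead (and-predicate) parser operator &, which matches its operand without consuming any input. If you also want to emit the peeked character as part of the attribute, you additionally need Boost.Phoenix, qi::locals and qi::attr.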

    For example:

    #include <boost/fusion/include/std_pair.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <boost/spirit/include/qi.hpp>
    
    #include <iostream>
    #include <string>
    
    namespace qi = boost::spirit::qi;
    
    int main()
    {
        std::string input("foo:/bar");
        std::pair<char, std::string> output;
    
        std::string::const_iterator begin = input.begin(),
                                    end = input.end();
    
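        // _a is a rule-local char; the rule's attribute is the pair (peeked character, remaining text).
        qi::rule<std::string::const_iterator, qi::locals<char>, std::pair<char, std::string>()> duplicate =
              "foo"
           // Look ahead past ':' and store the following character in _a; & consumes no input.
           >> qi::omit[
                 &(":" >> qi::char_[qi::_a = qi::_1])
              ]
           // Emit the stored character as the first element of the pair.
           >> qi::attr(qi::_a)
           // Now actually consume ':'; *char_ then collects the rest, including the '/'.
           >> ":"
           >> *qi::char_;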
    
        bool r = qi::parse(begin,
                           end,
                           duplicate,
                           output);
    
        std::cout << std::boolalpha
                  << r << " "
                  << (begin == end) << " '"
                  << output.first << "' \""
                  << output.second << "\""
                  << std::endl;
    
        return 0;
    }
    

    This outputs:

    true true '/' "/bar"