Extremely simple parser with boost::spirit fails


I always struggle with Spirit parsers. This time I am trying to achieve an extremely simple goal: parse three words into a tuple of three strings. I can't even get this working and feel very stupid now. It's about this little bit of code:

#include <boost/fusion/include/std_tuple.hpp>
#include <boost/spirit/home/qi/parse.hpp>
#include <boost/spirit/include/qi.hpp>

#include <iostream>
#include <string_view>
#include <tuple>

namespace {
    using namespace boost::spirit;
}

int main () {

    const std::string_view content {"some second last"};
    std::tuple<std::string, std::string, std::string> words;

    auto atInput = content.begin ();
    const auto endOfInput = content.end ();

    bool result = qi::phrase_parse (atInput,
                                    endOfInput,
                                    +qi::char_ >> +qi::char_ >> +qi::char_,
                                    qi::ascii::space,
                                    words);

    if (result) {
        const auto& [first, second, last] = words;
        std::cout << "Parsing succeeded: " << first << " - " << second << " - " << last
                  << std::endl;
    }
    else {
        std::cout << "Error in parsing.\n";
    }
}

I get "Error in parsing." and I have no clue why. What is wrong with this simple example?


Solution

  • Your skipper removes the spaces, so the parser effectively sees "somesecondlast". The first +char_ consumes all of it. The second +char_ then finds nothing left to match because end of input (EOI) has been reached, and the entire expression fails to parse.

    You can make this explicitly visible by using rule debugging:

    Live On Coliru

    #define BOOST_SPIRIT_DEBUG 1
    #include <boost/fusion/include/std_tuple.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <iomanip>
    #include <iostream>
    #include <string>
    #include <string_view>
    #include <tuple>
    
    namespace qi = boost::spirit::qi;
    using Tuple  = std::tuple<std::string, std::string, std::string>;
    using It     = std::string_view::const_iterator;
    
    int main() {
        qi::rule<It, std::string(), qi::space_type> word_ = +qi::char_;
        qi::rule<It, Tuple(), qi::space_type>       rule  = word_ >> word_ >> word_;
        BOOST_SPIRIT_DEBUG_NODES((word_)(rule));
    
        for (std::string_view content : {"some second last"}) {
            Tuple words;
            auto  f = content.begin(), l = content.end();
            if (qi::phrase_parse(f, l, rule, qi::space, words)) {
                auto const& [w1, w2, w3] = words;
                std::cout << "Parsing succeeded: " << quoted(w1) << " - " << quoted(w2) << " - " << quoted(w3) << std::endl;
            } else {
                std::cout << "Error in parsing.\n";
            }
        }
    }
    

    Printing

    <rule>
      <try>some second last</try>
      <word_>
        <try>some second last</try>
        <success></success>
        <attributes>[[s, o, m, e, s, e, c, o, n, d, l, a, s, t]]</attributes>
      </word_>
      <word_>
        <try></try>
        <fail/>
      </word_>
      <fail/>
    </rule>
    Error in parsing.
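
The trace can also be reasoned about without Spirit. The following is a minimal sketch in plain C++ that models what `+qi::char_` does under a space skipper: whitespace is discarded before every single character match. It is not Spirit's implementation, and the function name `match_plus_char_skipping` is hypothetical, chosen only for this illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical model (not Spirit code) of `+qi::char_` with a space skipper.
// Before each character match the skipper discards whitespace, then char_
// accepts whatever character comes next. `pos` is advanced past everything
// consumed. An empty result models a failed match, because `+` needs at
// least one character.
std::string match_plus_char_skipping(std::string_view input, std::size_t& pos) {
    std::string attribute;
    for (;;) {
        // Pre-skip: the skipper runs before every char_ attempt.
        while (pos < input.size() &&
               std::isspace(static_cast<unsigned char>(input[pos]))) {
            ++pos;
        }
        if (pos == input.size()) {
            break; // end of input: char_ has nothing to match
        }
        attribute += input[pos++]; // char_ matches any character
    }
    return attribute;
}
```

Calling it twice on "some second last" with a shared `pos` returns "somesecondlast" for the first call and an empty string for the second, which corresponds to the successful first `word_` and the failing second `word_` in the trace.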
    

    Lexemes

    You can get around this in many ways. I describe how lexemes, skippers and non-terminal parsers (like qi::rule) interact here: Boost spirit skipper issues

    In my debug example, the simplest fix would be to drop the skipper from the word_ rule declaration, which makes the rule an implicit lexeme:

    qi::rule<It, std::string() /*, qi::space_type*/> word_ = +qi::graph;
    

    Note that qi::graph must now be used instead of qi::char_: without a skipper, qi::char_ would match the spaces as well.

    Printing Live On Coliru:

    Parsing succeeded: "some" - "second" - "last"
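
To see why the lexeme version splits the words correctly, here is a rough plain C++ model of the same idea, assuming whitespace is skipped only between words and never inside one. The names `match_lexeme_word` and `parse_three_words` are hypothetical and exist only for this sketch; it does not use or reproduce Spirit itself.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <tuple>

using Words = std::tuple<std::string, std::string, std::string>;

// Hypothetical model of one lexeme `+graph`: whitespace is skipped once up
// front, then only contiguous visible (graph) characters are collected.
// An empty result models a failed match.
std::string match_lexeme_word(std::string_view input, std::size_t& pos) {
    while (pos < input.size() &&
           std::isspace(static_cast<unsigned char>(input[pos]))) {
        ++pos; // pre-skip happens before the word, not inside it
    }
    std::string word;
    while (pos < input.size() &&
           std::isgraph(static_cast<unsigned char>(input[pos]))) {
        word += input[pos++];
    }
    return word;
}

// Models `word_ >> word_ >> word_`: all three words must match in order.
std::optional<Words> parse_three_words(std::string_view input) {
    std::size_t pos = 0;
    std::string first  = match_lexeme_word(input, pos);
    std::string second = match_lexeme_word(input, pos);
    std::string third  = match_lexeme_word(input, pos);
    if (first.empty() || second.empty() || third.empty()) {
        return std::nullopt;
    }
    return Words{first, second, third};
}
```

With the input "some second last" this model yields the tuple ("some", "second", "last"), and with an input of only two words it yields no result, because the third word has nothing to match.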
    

    Other approaches, perhaps without the rules, may be:

    if (qi::phrase_parse(                                                               //
            f, l,                                                                       //
            qi::lexeme[+qi::graph] >> qi::lexeme[+qi::graph] >> qi::lexeme[+qi::graph], //
            qi::space,                                                                  //
            words))
    

    Or not using a skipper in the first place:

    if (qi::parse(                                                                                  //
            f, l,                                                                                   //
            +qi::graph >> qi::omit[*qi::space] >> +qi::graph >> qi::omit[*qi::space] >> +qi::graph, //
            words))
    

    I'd personally consider simplifying down to something like: https://coliru.stacked-crooked.com/a/b0dce7bb8be31f9f