Search code examples
c++boost-spiritboost-spirit-qipeg

How can I stop my simple addition grammar terminating early in Boost Spirit?


I'm having difficulty making a toy grammar to parse addition work as desired in Boost Spirit.

Here is my grammar and code:

Syntax.h:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;

void test();

template <typename Iterator>
struct ExpressionGrammar : qi::grammar<Iterator, double(), ascii::space_type>
{
    qi::rule<Iterator, double(), ascii::space_type> expression;
    qi::rule<Iterator, double(), ascii::space_type> addsub;

    ExpressionGrammar()
    : ExpressionGrammar::base_type(expression)
    {
        using qi::lit;
        using qi::_val;
        using qi::_1;
        using qi::_2;

        addsub = (expression >> '+' >> expression)[_val = _1 + _2];
        expression = (qi::double_ | addsub);
    }
};

Syntax.cpp:

#include "Syntax.h"

namespace qi = boost::spirit::qi;

void test()
{
    ExpressionGrammar<const char*> grammar;
    std::string s = "3 + 5";
    const char* c = s.c_str();
    double result = -42;
    bool r = qi::phrase_parse(c, c+strlen(c), grammar, ascii::space, result);
    if (r)
        std::cout << "Success. result: "<<result<<". Still to parse: "<<c<<std::endl;
    else
        std::cout << "Fail. parsing failed at: "<< c <<std::endl;
}

Output:

Success. result: 3. Still to parse: + 5

It appears that double_ consumes the 3 and then there is no rule that can parse just + 5. However if I change my expression rule to

expression = (addsub | qi::double_);

then my program enters an infinite recursion.

What's the solution to this? I am aware that in examples it's more common to use a Kleene star to deal with an arbitrary list of binary combinations (along the lines of expression *('+' >> expression)). Is this a necessity of using Parsing Expression Grammar? If so, please explain why.


Solution

  • Boost Spirit is a recursive descent parser which is not capable of parsing grammars containing left-recursions. Take a look at this Wikipedia article on how to rewrite left-recursive productions. Possible solutions in spirit include using Kleene Star expressions or the list operator (%).