Tags: c++, parsing, boost, boost-spirit, boost-spirit-qi

C++ Boost Spirit: multiple evaluations of the same _val?


I tried to modify the Boolean expression parser from this question: Boolean expression (grammar) parser in c++

When a variable is created, I call a new function, processval. It is just the identity function, plus some output so I can see when it is called. The problem is that this function is called more than 10 times even though my input has only 2 variables. Does anybody understand why?

Here is the code:

#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <iostream>
#include <list>
#include <string>
#include <boost/phoenix/function/adapt_function.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/variant/recursive_wrapper.hpp>

namespace qi    = boost::spirit::qi;
namespace phx   = boost::phoenix;

struct op_or  {};
struct op_and {};
struct op_xor {};
struct op_not {};

typedef int var;
template <typename tag> struct binop;
template <typename tag> struct unop;

typedef boost::variant<var, 
        boost::recursive_wrapper<unop <op_not> >, 
        boost::recursive_wrapper<binop<op_and> >,
        boost::recursive_wrapper<binop<op_xor> >,
        boost::recursive_wrapper<binop<op_or> >
        > expr;

template <typename tag> struct binop 
{ 
    explicit binop(const expr& l, const expr& r) : oper1(l), oper2(r) { }
    expr oper1, oper2; 
};

template <typename tag> struct unop  
{ 
    explicit unop(const expr& o) : oper1(o) { }
    expr oper1; 
};

struct printer : boost::static_visitor<void>
{
    printer(std::ostream& os) : _os(os) {}
    std::ostream& _os;

    // A variable is a leaf: print its integer value.
    void operator()(const var& v) const { _os << v; }

    void operator()(const binop<op_and>& b) const { print(" & ", b.oper1, b.oper2); }
    void operator()(const binop<op_or >& b) const { print(" | ", b.oper1, b.oper2); }
    void operator()(const binop<op_xor>& b) const { print(" ^ ", b.oper1, b.oper2); }

    void print(const std::string& op, const expr& l, const expr& r) const
    {
        _os << "(";
            boost::apply_visitor(*this, l);
            _os << op;
            boost::apply_visitor(*this, r);
        _os << ")";
    }

    void operator()(const unop<op_not>& u) const
    {
        _os << "(";
            _os << "!";
            boost::apply_visitor(*this, u.oper1);
        _os << ")";
    }
};

std::ostream& operator<<(std::ostream& os, const expr& e)
{ boost::apply_visitor(printer(os), e); return os; }

int processval(int s)
{
    std::cout << "processing val : " << s << std::endl;
    return s;
}

BOOST_PHOENIX_ADAPT_FUNCTION(int, process_, processval, 1)

template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, expr(), Skipper>
{
    parser() : parser::base_type(expr_)
    {
        using namespace qi;

        expr_  = or_.alias();

        or_  = (xor_ >> "or"  >> or_ ) [ _val = phx::construct<binop<op_or >>(_1, _2) ] | xor_   [ _val = _1 ];
        xor_ = (and_ >> "xor" >> xor_) [ _val = phx::construct<binop<op_xor>>(_1, _2) ] | and_   [ _val = _1 ];
        and_ = (not_ >> "and" >> and_) [ _val = phx::construct<binop<op_and>>(_1, _2) ] | not_   [ _val = _1 ];
        not_ = ("not" > simple       ) [ _val = phx::construct<unop <op_not>>(_1)     ] | simple [ _val = _1 ];

        simple = (('(' > expr_ > ')') | var_);
        var_ = int_ [ _val = process_(_1) ];

    }

  private:
    qi::rule<It, var() , Skipper> var_;
    qi::rule<It, expr(), Skipper> not_, and_, xor_, or_, simple, expr_;
};

int main()
{
    for (auto& input : std::list<std::string> {
        "1 and 2"
            })
    {
        auto f(std::begin(input)), l(std::end(input));
        parser<decltype(f)> p;

        try
        {
            expr result;
            bool ok = qi::phrase_parse(f,l,p,qi::space,result);

            if (!ok)
                std::cerr << "invalid input\n";
            else
                std::cout << "result: " << result << "\n";

        } catch (const qi::expectation_failure<decltype(f)>& e)
        {
            std::cerr << "expectation_failure at '" << std::string(e.first, e.last) << "'\n";
        }

        if (f!=l) std::cerr << "unparsed: '" << std::string(f,l) << "'\n";
    }

    return 0;
}

EDIT:

The best solution is to change the grammar. See Bill's comment.


Solution

  • This follows from how a backtracking parser like Spirit works. To see where the extra calls to processval() come from, let's use a simpler grammar:

    A = (B >> C) | (B >> B >> C)
    B = "B" [ process_(_1) ]
    C = "C"
    

    Given the input "BBC", we will actually parse B three times! This means that any semantic action attached to B, such as process_, is also executed three times.

    Let's walk the parsing to show why this happens:

    1. We start at A.
    2. The first alternative of A that we are supposed to try is B >> C.
    3. The first B in that alternative parses successfully, so we execute the semantic action (first call).
    4. That B must then be followed by a C. This parse fails because the next character in our input is B.
    5. We backtrack up the parse tree to the alternative in rule A and rewind the input to where A started. The semantic action that already ran is not undone.
    6. The remaining alternative is B >> B >> C. Let's try this.
    7. The first B parses successfully, so we call our semantic action for the second time.
    8. The second B parses successfully, so we call our semantic action for the third time.
    9. The C parses successfully.
    10. That means A has parsed successfully.