How to write binary operator with two post operands syntax with Boost Spirit x3?

I am following this example: https://github.com/boostorg/spirit/blob/develop/example/x3/calc/calc9/expression_def.hpp

What I am trying to accomplish is to write a rule that parses and generates like min{x}{y}. Mostly the code is using expression grammar like x + y, but now I want to place and parse both operands to the rhs of the operator.

I added the following code in expression_def.hpp file:

    ...
    x3::symbols<ast::optoken> additive_op;
    x3::symbols<ast::optoken> multiplicative_op;
    x3::symbols<ast::optoken> binarypost_op;
    x3::symbols<ast::optoken> unary_op;
    x3::symbols<> keywords;
    ...

    binarypost_op.add
        ("min", ast::op_divide) // Dummy operation usage for now
        ;
    ...
    struct binarypost_expr_class;
    struct unary_expr_class; 
    ...
    typedef x3::rule<binarypost_expr_class, ast::expression> 
    binarypost_expr_type;
    ...
    binarypost_expr_type const binarypost_expr = "binarypost_expr";
    ... 

    auto const multiplicative_expr_def =
    binarypost_expr
    >> *(multiplicative_op > binarypost_expr)
    ;
    auto const binarypost_expr_def =           // See the chaining operation
    ('{' > unary_expr > '}')
    >> *(binarypost_op > ('{' > unary_expr > '}'))
    ;
    auto const unary_expr_def =
        primary_expr
    |   (unary_op > primary_expr)
    ;

This works fine. But it can only parse something like , {x} min {y}. I want to be able to parse min {x} {y}. I tried the many combinations such as :

binarypost_op >> ('{' > unary_expr > '}') > ('{' > unary_expr > '}') etc. But I cant seem to figure it out as to what is the right way to write this? Any suggestions / comments ?

Solution

Ok, here's the changes. The hard part is actually code-generating the builtin function.

Parsing

Step 1: extend AST

Always start with the AST. We want operands that can be function calls:

In ast.hpp:

struct function_call;  // ADDED LINE

// ...

struct operand :
    x3::variant<
        nil
      , unsigned int
      , variable
      , x3::forward_ast<unary>
      , x3::forward_ast<expression>
      , x3::forward_ast<function_call> // ADDED LINE
    >
{
    using base_type::base_type;
    using base_type::operator=;
};

// ...

enum funtoken
{
    fun_min,
    fun_max,
};

// ...

struct function_call : x3::position_tagged
{
    funtoken fun;
    std::list<operand> args;
};

In ast_adapted.hpp:

BOOST_FUSION_ADAPT_STRUCT(client::ast::function_call,
    fun, args
)

Step 2: extend grammar

(This is all in expression_def.hpp)

Let's be generic, so parse function name tokens using a symbol table:

x3::symbols<ast::funtoken> functions;

Which we have to initialize in add_keywords:

functions.add
    ("min", ast::fun_min)
    ("max", ast::fun_max)
    ;

Now declare a rule for function calls:

struct function_call_class;
typedef x3::rule<function_call_class, ast::function_call>    function_call_type;
function_call_type const function_call = "function_call";

That's all red-tape. The "interesting thing" is the rule definition:

auto const function_call_def =
        functions
    >>  '(' >> expression % ',' >> ')'
    ;

Well. That's underwhelming. Let's integrate into our primary expression rule:

auto const primary_expr_def =
        uint_
    |   bool_
    |   function_call
    |   (!keywords >> identifier)
    |   ('(' > expression > ')')
    ;

Note the ordering. If you want to be able to add function names that collide with a keyword, you'll need to add precautions.

Also, lets make AST annotation work for our node:

struct function_call_class : x3::annotate_on_success {};

Code generation

It's easy to find where to add support for the new AST node:

In compiler.hpp:

 bool operator()(ast::function_call const& x) const;

Now comes the hard part.

What's really required for general n-ary is an accumulator. Since we don't have registers, this would need to be a temporary (local). However, since the VM implementation doesn't have these, I've limited the implementation to a fixed binary function call only.

Note that the VM already has support for function calls. Functions can have locals. So, if you code-gen a variable-argument built-in function you can implement a left-fold recursive solution.

In compiler.cpp:

bool compiler::operator()(ast::function_call const& x) const
{
    auto choice = [&](int opcode) {
        BOOST_ASSERT(x.args.size() == 2); // TODO FIXME hardcoded binary builtin
        auto it = x.args.begin();

        auto& a = *it++;
        if (!boost::apply_visitor(*this, a))
            return false;

        auto& b = *it++;
        if (!boost::apply_visitor(*this, b))
            return false;
        program.op(opcode); // the binary fold operation

        program.op(op_jump_if, 0);
        size_t const branch = program.size()-1;

        if (!boost::apply_visitor(*this, a))
            return false;
        program.op(op_jump, 0);
        std::size_t continue_ = program.size()-1;

        program[branch] = int(program.size()-branch);
        if (!boost::apply_visitor(*this, b))
            return false;

        program[continue_] = int(program.size()-continue_);
        return true;
    };

    switch (x.fun) {
        case ast::fun_min: return choice(op_lt);
        case ast::fun_max: return choice(op_gt);
        default: BOOST_ASSERT(0); return false;
    }
    return true;
}

I've just taken inspiration from the surrounding code on how to generate the jump labels.

Trying It Out

A simplistic example would be: var x = min(1,3);

Assembler----------------

local       x, @0
start:
      op_stk_adj  1
      op_int      1
      op_int      3
      op_lt
      op_jump_if  13
      op_int      1
      op_jump     15
13:
      op_int      3
15:
      op_store    x
end:
-------------------------
Results------------------

    x: 1
-------------------------

Running it with some random contrived input:

./test <<< "var a=$(($RANDOM % 100)); var

b=$(($RANDOM % 100)); var contrived=min(max(27,2*a), 100+b);"

Prints e.g.:

Assembler----------------

local       a, @0
local       b, @1
local       contrived, @2
start:
      op_stk_adj  3
      op_int      31
      op_store    a
      op_int      71
      op_store    b
      op_int      27
      op_int      2
      op_load     a
      op_mul
      op_gt
      op_jump_if  24
      op_int      27
      op_jump     29
24:
      op_int      2
      op_load     a
      op_mul
29:
      op_int      100
      op_load     b
      op_add
      op_lt
      op_jump_if  58
      op_int      27
      op_int      2
      op_load     a
      op_mul
      op_gt
      op_jump_if  51
      op_int      27
      op_jump     56
51:
      op_int      2
      op_load     a
      op_mul
56:
      op_jump     63
58:
      op_int      100
      op_load     b
      op_add
63:
      op_store    contrived
end:
-------------------------
Results------------------

    a: 31
    b: 71
    contrived: 62
-------------------------