I am following this example: https://github.com/boostorg/spirit/blob/develop/example/x3/calc/calc9/expression_def.hpp
What I am trying to accomplish is to write a rule that parses and generates like min{x}{y}. Mostly the code is using expression grammar like x + y, but now I want to place and parse both operands to the rhs of the operator.
I added the following code in expression_def.hpp file:
...
x3::symbols<ast::optoken> additive_op;
x3::symbols<ast::optoken> multiplicative_op;
x3::symbols<ast::optoken> binarypost_op;
x3::symbols<ast::optoken> unary_op;
x3::symbols<> keywords;
...
binarypost_op.add
("min", ast::op_divide) // Dummy operation usage for now
;
...
struct binarypost_expr_class;
struct unary_expr_class;
...
typedef x3::rule<binarypost_expr_class, ast::expression>
binarypost_expr_type;
...
binarypost_expr_type const binarypost_expr = "binarypost_expr";
...
auto const multiplicative_expr_def =
binarypost_expr
>> *(multiplicative_op > binarypost_expr)
;
auto const binarypost_expr_def = // See the chaining operation
('{' > unary_expr > '}')
>> *(binarypost_op > ('{' > unary_expr > '}'))
;
auto const unary_expr_def =
primary_expr
| (unary_op > primary_expr)
;
This works fine. But it can only parse something like , {x} min {y}. I want to be able to parse min {x} {y}. I tried the many combinations such as :
binarypost_op >> ('{' > unary_expr > '}') > ('{' > unary_expr > '}') etc. But I cant seem to figure it out as to what is the right way to write this? Any suggestions / comments ?
Ok, here's the changes. The hard part is actually code-generating the builtin function.
Always start with the AST. We want operands that can be function calls:
In ast.hpp:
struct function_call; // ADDED LINE
// ...
struct operand :
x3::variant<
nil
, unsigned int
, variable
, x3::forward_ast<unary>
, x3::forward_ast<expression>
, x3::forward_ast<function_call> // ADDED LINE
>
{
using base_type::base_type;
using base_type::operator=;
};
// ...
enum funtoken
{
fun_min,
fun_max,
};
// ...
struct function_call : x3::position_tagged
{
funtoken fun;
std::list<operand> args;
};
In ast_adapted.hpp:
BOOST_FUSION_ADAPT_STRUCT(client::ast::function_call,
fun, args
)
(This is all in expression_def.hpp
)
Let's be generic, so parse function name tokens using a symbol table:
x3::symbols<ast::funtoken> functions;
Which we have to initialize in add_keywords
:
functions.add
("min", ast::fun_min)
("max", ast::fun_max)
;
Now declare a rule for function calls:
struct function_call_class;
typedef x3::rule<function_call_class, ast::function_call> function_call_type;
function_call_type const function_call = "function_call";
That's all red-tape. The "interesting thing" is the rule definition:
auto const function_call_def =
functions
>> '(' >> expression % ',' >> ')'
;
Well. That's underwhelming. Let's integrate into our primary expression rule:
auto const primary_expr_def =
uint_
| bool_
| function_call
| (!keywords >> identifier)
| ('(' > expression > ')')
;
Note the ordering. If you want to be able to add function names that collide with a keyword, you'll need to add precautions.
Also, lets make AST annotation work for our node:
struct function_call_class : x3::annotate_on_success {};
It's easy to find where to add support for the new AST node:
In compiler.hpp:
bool operator()(ast::function_call const& x) const;
Now comes the hard part.
What's really required for general n-ary is an accumulator. Since we don't have registers, this would need to be a temporary (local). However, since the VM implementation doesn't have these, I've limited the implementation to a fixed binary function call only.
Note that the VM already has support for function calls. Functions can have locals. So, if you code-gen a variable-argument built-in function you can implement a left-fold recursive solution.
In compiler.cpp:
bool compiler::operator()(ast::function_call const& x) const
{
auto choice = [&](int opcode) {
BOOST_ASSERT(x.args.size() == 2); // TODO FIXME hardcoded binary builtin
auto it = x.args.begin();
auto& a = *it++;
if (!boost::apply_visitor(*this, a))
return false;
auto& b = *it++;
if (!boost::apply_visitor(*this, b))
return false;
program.op(opcode); // the binary fold operation
program.op(op_jump_if, 0);
size_t const branch = program.size()-1;
if (!boost::apply_visitor(*this, a))
return false;
program.op(op_jump, 0);
std::size_t continue_ = program.size()-1;
program[branch] = int(program.size()-branch);
if (!boost::apply_visitor(*this, b))
return false;
program[continue_] = int(program.size()-continue_);
return true;
};
switch (x.fun) {
case ast::fun_min: return choice(op_lt);
case ast::fun_max: return choice(op_gt);
default: BOOST_ASSERT(0); return false;
}
return true;
}
I've just taken inspiration from the surrounding code on how to generate the jump labels.
A simplistic example would be: var x = min(1,3);
Assembler----------------
local x, @0
start:
op_stk_adj 1
op_int 1
op_int 3
op_lt
op_jump_if 13
op_int 1
op_jump 15
13:
op_int 3
15:
op_store x
end:
-------------------------
Results------------------
x: 1
-------------------------
Running it with some random contrived input:
./test <<< "var a=$(($RANDOM % 100)); var
b=$(($RANDOM % 100)); var contrived=min(max(27,2*a), 100+b);"
Prints e.g.:
Assembler----------------
local a, @0
local b, @1
local contrived, @2
start:
op_stk_adj 3
op_int 31
op_store a
op_int 71
op_store b
op_int 27
op_int 2
op_load a
op_mul
op_gt
op_jump_if 24
op_int 27
op_jump 29
24:
op_int 2
op_load a
op_mul
29:
op_int 100
op_load b
op_add
op_lt
op_jump_if 58
op_int 27
op_int 2
op_load a
op_mul
op_gt
op_jump_if 51
op_int 27
op_jump 56
51:
op_int 2
op_load a
op_mul
56:
op_jump 63
58:
op_int 100
op_load b
op_add
63:
op_store contrived
end:
-------------------------
Results------------------
a: 31
b: 71
contrived: 62
-------------------------