I'm trying to parse a lisp-like language which has some syntactic sugar for common functions. For example, the plus function can be written as (+ 1 2) or as 1 + 2. I'm thinking that eliminating syntactic sugar before trying to interpret the language would significantly facilitate the interpretation process, because then, the only evaluation rule that needs to be implemented is going to be that of a function call, and all infix operators would have to have corresponding built-in functions. I was thinking I could create a parser that will iterate over the tokens received from the lexer, and reorder them to put the expression into prefix notation. But this would require the output of the parser to also be a list of tokens. Is that possible in Spirit.Qi? As far as I understand, Spirit.Qi builds a hierarchical structure not a linear list of tokens.
Everything is, of course, possible.
That said, why don't you operate on the AST, and
Consider an AST representation like this (inspired on a kind of simplified s-expression):
typedef boost::make_recursive_variant<
std::string,
std::vector< boost::recursive_variant_ >
>::type expr_t;
You'd be able to define rules to accept either s_expr
or binobj
and parse into the same AST
qi::rule<It, std::vector<expr_t>(), Skipper> binop;
qi::rule<It, expr_t(), Skipper> expr, s_expr, name;
expr = binop | s_expr;
binop = (s_expr >> (string("+") | string("-") | string("*") | string("/")) >> expr)
[
phx::push_back(_val, _2),
phx::push_back(_val, _1),
phx::push_back(_val, _3)
];
name = as_string [ lexeme [ +(graph - char_("()") ) ] ];
s_expr = ('(' > +expr > ')') | name;
I have a full demo of this idea here: https://gist.github.com/3047788
For fun, I threw in
(1 + 3)
is parsed into the same syntax tree as (+ 1 3)
A demo main of test.cpp:
int main()
{
for (const auto input : std::list<std::string> {
"",
"(+ 1 3)",
"(1 + 3)",
"(1 + 4 / 2)",
"(1 + (* 4 5 6) / 2)",
"(avg 1 + (* 4 5 6) / 2)",
"(trace 1 + (* 4 5 6) / 2 1 1 1 1 999)",
"(avg 1 + (* 4 5 6) / 2 1 1 1 1 999)",
"(avg 1 + (* 4 (unknown-function 5) 6) / 2 a b c)",
})
{
std::cout << "----------------------\n";
expr_t ast = doParse(input, qi::space);
try
{
std::cout << "eval<double>: " << eval<double>(ast) << std::endl;
std::cout << "eval<int> : " << eval<int> (ast) << std::endl;
} catch(std::exception const& e)
{
std::cerr << e.what() << '\n';
}
}
}
Results in the following output:
----------------------
parse failed: ''
warning: undefined '<no ast>'
eval<double>: 0
warning: undefined '<no ast>'
eval<int> : 0
----------------------
parse success
input : (+ 1 3)
ast parsed : (+ 1 3)
eval<double>: 4
eval<int> : 4
----------------------
parse success
input : (1 + 3)
ast parsed : ((+ 1 3))
eval<double>: 4
eval<int> : 4
----------------------
parse success
input : (1 + 4 / 2)
ast parsed : ((+ 1 (/ 4 2)))
eval<double>: 3
eval<int> : 3
----------------------
parse success
input : (1 + (* 4 5 6) / 2)
ast parsed : ((+ 1 (/ (* 4 5 6) 2)))
eval<double>: 61
eval<int> : 61
----------------------
parse success
input : (avg 1 + (* 4 5 6) / 2)
ast parsed : (avg (+ 1 (/ (* 4 5 6) 2)))
eval<double>: 0
eval<int> : 0
----------------------
parse success
input : (trace 1 + (* 4 5 6) / 2 1 1 1 1 999)
ast parsed : (trace (+ 1 (/ (* 4 5 6) 2)) 1 1 1 1 999)
trace( 61 1 1 1 1 999 )
eval<double>: 0
trace( 61 1 1 1 1 999 )
eval<int> : 0
----------------------
parse success
input : (avg 1 + (* 4 5 6) / 2 1 1 1 1 999)
ast parsed : (avg (+ 1 (/ (* 4 5 6) 2)) 1 1 1 1 999)
eval<double>: 167.167
eval<int> : 167
----------------------
parse success
input : (avg 1 + (* 4 (unknown-function 5) 6) / 2 a b c)
ast parsed : (avg (+ 1 (/ (* 4 (unknown-function 5) 6) 2)) a b c)
error: can't apply 'unknown-function' to 1 args (std::exception)
1 bear in mind, there is no type system, which is why you have to specify the hardcoded datatype to interpret all values as (e.g. double result = eval<double>(ast)
). There is also no symbol table, so only builtin functions +-/*
trace
and avg
exist