I'm trying to define some boost::spirit::qi parsers for multiple subsets of a language with minimal code duplication. To do this, I created a few basic rule-building functions. The original parser works fine, but once I started using the composing functions, my parsers stopped working.
The general language is of the form:
A B: C
There are subsets of the language where A, B, or C must be specific types, such as A being an int while B and C are floats. Here is the parser I used for that sublanguage:
using entry = boost::tuple<int, float, float>;

template <typename Iterator>
struct sublang : grammar<Iterator, entry(), ascii::space_type>
{
    sublang() : sublang::base_type(start)
    {
        start = int_ >> float_ >> ':' >> float_;
    }
    rule<Iterator, entry(), ascii::space_type> start;
};
But since there are many subsets, I tried to create a function to build my parser rules:
template<typename AttrName, typename Value>
auto attribute(AttrName attrName, Value value)
{
    return attrName >> ':' >> value;
}
So that I could build parsers for each subset more easily without duplicate information:
// in sublang
start = int_ >> attribute(float_, float_);
This fails, however, and I'm not sure why. With clang, parsing simply fails; with g++, the program seems to crash.
Here's the full example code: http://coliru.stacked-crooked.com/a/8636f19b2e9bff8d
What is wrong with the current code and what would be the correct approach for this problem? I would like to avoid specifying the grammar of attributes and other elements in each sublanguage parser.
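Quite simply: using auto with Spirit (or any EDSL based on Boost Proto and Boost Phoenix) is most likely Undefined Behaviour¹. The expression templates hold their operands by reference, so the expression returned from attribute refers to temporaries and function parameters that no longer exist once the function returns.
Now, you can usually fix this using boost::proto::deep_copy, which returns a copy of the expression that holds its operands by value. In this case,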
template<typename AttrName, typename Value>
auto attribute(AttrName attrName, Value value) {
    return boost::proto::deep_copy(attrName >> ':' >> value);
}
fixes it: Live On Coliru
Alternatively, you could use qi::lazy[] with inherited attributes. I do very similar things in the prop_key rule in "Reading JSON file with C++ and BOOST".
You could also have a look at the Keyword List Operator from the Spirit Repository. It's designed to allow easier construction of grammars like:
no_constraint_person_rule %=
      kwd("name")['=' > parse_string]
    / kwd("age") ['=' > int_]
    / kwd("size")['=' > double_ > 'm']
    ;
You could potentially combine this with the Nabialek Trick. I'd search the answers on SO for examples (one is "Grammar balancing issue").
¹ Except for entirely stateless actors (Eric Niebler on this) and expression placeholders. See e.g.
Some examples