Boost Qi Composing rules using Functions


I'm trying to define some boost::spirit::qi parsers for multiple subsets of a language with minimal code duplication. To do this, I created a few basic rule-building functions. The original parser works fine, but once I started using the composing functions, my parsers stopped working.

The general language is of the form:

A B: C

There are subsets of the language where A, B, or C must be specific types, such as A being an int while B and C are floats. Here is the parser I used for that sublanguage:

using entry = boost::tuple<int, float, float>;

template <typename Iterator>
struct sublang : grammar<Iterator, entry(), ascii::space_type>
{
   sublang() : sublang::base_type(start)
   {
       start = int_ >> float_ >> ':' >> float_;
   }
   rule<Iterator, entry(), ascii::space_type> start;
};

But since there are many subsets, I tried to create a function to build my parser rules:

template<typename AttrName, typename Value>
auto attribute(AttrName attrName, Value value)
{
    return attrName >> ':' >> value;
}

So that I could build parsers for each subset more easily without duplicate information:

// in sublang
start = int_ >> attribute(float_, float_);

This fails, however, and I'm not sure why. With clang, parsing simply fails; with g++, the program appears to crash.

Here's the full example code: http://coliru.stacked-crooked.com/a/8636f19b2e9bff8d

What is wrong with the current code and what would be the correct approach for this problem? I would like to avoid specifying the grammar of attributes and other elements in each sublanguage parser.


Solution

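  • Quite simply: using auto with Spirit (or any EDSL based on Boost Proto and Boost Phoenix) is most likely Undefined Behaviour¹. The expression template stores references to its operands, so the expression returned from attribute still refers to the parameters attrName and value (and to the temporary for ':') after they have been destroyed.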
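To make the lifetime problem concrete, here is a library-free sketch. It is not Spirit's or Proto's actual implementation; Term, RefSeq, ValSeq, compose_by_ref, compose_by_value and describe are hypothetical names used only to contrast a node that refers to its operands with one that owns copies of them.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a parser terminal such as float_.
struct Term {
    std::string name;
};

// Node that only REFERS to its operands, roughly how an expression
// template built with operator>> holds its children.
template <typename L, typename R>
struct RefSeq {
    const L& lhs;
    const R& rhs;
};

// Node that OWNS copies of its operands, roughly what a deep copy yields.
template <typename L, typename R>
struct ValSeq {
    L lhs;
    R rhs;
};

// Unsafe shape, shown for comparison only and never called here:
// the returned node refers to the parameters a and b, which are
// destroyed when the function returns, so any later use dangles.
template <typename L, typename R>
RefSeq<L, R> compose_by_ref(L a, R b) {
    return RefSeq<L, R>{a, b};
}

// Safe shape: the returned node carries its own copies.
template <typename L, typename R>
ValSeq<L, R> compose_by_value(L a, R b) {
    return ValSeq<L, R>{std::move(a), std::move(b)};
}

// Reads both operands of an owning node.
std::string describe(const ValSeq<Term, Term>& seq) {
    return seq.lhs.name + " : " + seq.rhs.name;
}
```

The question's attribute function has the shape of compose_by_ref: it returns a node that points at its own by-value parameters.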

    Now, you can usually fix this using

    • BOOST_SPIRIT_AUTO
    • boost::proto::deep_copy
    • the new facility that's coming in the most recent version of Boost (TODO add link)

    In this case,

    template<typename AttrName, typename Value>
    auto attribute(AttrName attrName, Value value) {
        return boost::proto::deep_copy(attrName >> ':' >> value);
    }
    

    fixes it.

    Alternatively

    1. you could use qi::lazy[] with inherited attributes.

      I do very similar things in the prop_key rule in Reading JSON file with C++ and BOOST.

    2. you could have a look at the Keyword List Operator from the Spirit Repository. It's designed to allow easier construction of grammars like:

      no_constraint_person_rule %=
          kwd("name")['=' > parse_string]
        / kwd("age") ['=' > int_]
        / kwd("size")['=' > double_ > 'm']
        ;
      
    3. You could potentially combine this with the Nabialek Trick. I'd search the answers on SO for examples (one is "Grammar balancing issue").


    ¹ Except for entirely stateless actors (Eric Niebler on this) and expression placeholders.