
Recursive X3 parser that passes results around


(1) Say we want to parse a simple recursive block surrounded by {}.

{
    Some text.
    {
        {
            Some more text.
        }
        Some Text again.
        {}
    }
}

This recursive parser is quite simple.

x3::rule<struct idBlock1> const ruleBlock1{"Block1"};
auto const ruleBlock1_def =
    x3::lit('{') >>
    *(
        ruleBlock1 |
        (x3::char_ - x3::lit('}'))
    ) >>
    x3::lit('}');

BOOST_SPIRIT_DEFINE(ruleBlock1)

(2) Then the block becomes more complex. It could also be surrounded by [].

{
    Some text.
    [
        {
            Some more text.
        }
        Some Text again.
        []
    ]
}

We need somewhere to store which kind of opening bracket we saw. Since X3 does not have locals, we can use the rule's attribute (x3::_val) instead.

x3::rule<struct idBlock2, char> const ruleBlock2{"Block2"};
auto const ruleBlock2_def = x3::rule<struct _, char>{} =
    (
        x3::lit('{')[([](auto& ctx){x3::_val(ctx)='}';})] |
        x3::lit('[')[([](auto& ctx){x3::_val(ctx)=']';})]
    ) >>
    *(
        ruleBlock2 |
        (
            x3::char_ - 
            (
                x3::eps[([](auto& ctx){x3::_pass(ctx)='}'==x3::_val(ctx);})] >> x3::lit('}') |
                x3::eps[([](auto& ctx){x3::_pass(ctx)=']'==x3::_val(ctx);})] >> x3::lit(']')
            )
        )
    ) >>
    (
        x3::eps[([](auto& ctx){x3::_pass(ctx)='}'==x3::_val(ctx);})] >> x3::lit('}') |
        x3::eps[([](auto& ctx){x3::_pass(ctx)=']'==x3::_val(ctx);})] >> x3::lit(']')
    );

BOOST_SPIRIT_DEFINE(ruleBlock2)

(3) The block content (the part between the brackets), which we call the argument, may be much more complicated than in this example, so we decide to create a separate rule for it. The attribute solution no longer works in this case, because inside the new rule x3::_val refers to that rule's own attribute rather than the block's. Luckily we still have the x3::with directive: we can keep the expected closing bracket on a stack, held by reference in the context, and pass it down to the next level.

struct SBlockEndTag {};
x3::rule<struct idBlockEnd> const ruleBlockEnd{"BlockEnd"};
x3::rule<struct idArg> const ruleArg{"Arg"};
x3::rule<struct idBlock3> const ruleBlock3{"Block3"};
auto const ruleBlockEnd_def =
    x3::eps[([](auto& ctx){
        assert(!x3::get<SBlockEndTag>(ctx).get().empty());
        x3::_pass(ctx)='}'==x3::get<SBlockEndTag>(ctx).get().top();
    })] >> 
    x3::lit('}') 
    |
    x3::eps[([](auto& ctx){
        assert(!x3::get<SBlockEndTag>(ctx).get().empty());
        x3::_pass(ctx)=']'==x3::get<SBlockEndTag>(ctx).get().top();
    })] >>
    x3::lit(']');
auto const ruleArg_def =
    *(
        ruleBlock3 |
        (x3::char_ - ruleBlockEnd)
    );
auto const ruleBlock3_def =
    (
        x3::lit('{')[([](auto& ctx){x3::get<SBlockEndTag>(ctx).get().push('}');})] |
        x3::lit('[')[([](auto& ctx){x3::get<SBlockEndTag>(ctx).get().push(']');})]
    ) >>
    ruleArg >>
    ruleBlockEnd[([](auto& ctx){
        assert(!x3::get<SBlockEndTag>(ctx).get().empty());
        x3::get<SBlockEndTag>(ctx).get().pop();
    })];

BOOST_SPIRIT_DEFINE(ruleBlockEnd, ruleArg, ruleBlock3)

The code is on Coliru.

Question: is this how we should write a recursive X3 parser for this kind of problem? With Spirit Qi's locals and inherited attributes, the solution seems much simpler. Thanks.


Solution

  • You can use x3::with<>.

    However, I'd just write this:

    auto const block_def =
        '{' >> *( block  | (char_ - '}')) >> '}'
      | '[' >> *( block  | (char_ - ']')) >> ']';
    

    Demo

    Live On Coliru

    #include <boost/spirit/home/x3.hpp>
    #include <iostream>
    
    namespace Parser {
        using namespace boost::spirit::x3;
    
        rule<struct idBlock1> const block {"Block"};
        auto const block_def =
            '{' >> *( block  | (char_ - '}')) >> '}'
          | '[' >> *( block  | (char_ - ']')) >> ']';
    
        BOOST_SPIRIT_DEFINE(block)
    }
    
    int main() {
        std::string const input = R"({
        Some text.
        [
            {
                Some more text.
            }
            Some Text again.
            []
        ]
    })";
    
        std::cout << "Parsed: " << std::boolalpha << parse(input.begin(), input.end(), Parser::block) << "\n";
    }
    

    Prints:

    Parsed: true
    

    BUT - Code Duplication!

    If you insist on generalizing:

    auto dyna_block = [](auto open, auto close) {
        return open >> *(block | (char_ - close)) >> close;
    };
    
    auto const block_def =
        dyna_block('{', '}')
      | dyna_block('[', ']');