Statefulness of Spirit V2 and X3


What is the intent behind Spirit X3 being so 'stateless'?

Bad aspects of 'states' in Spirit V2

Looking back at Spirit V2, the "grammar" was conceptually stateful in many ways, because the grammar was a class instance.

There are many downsides to making your grammar, or even a single rule, stateful:

  • It might make your grammar non-re-entrant;
  • It might add thread-unsafety to your grammar instance;
  • A self-managed 'flag' is a disaster.

Theoretically speaking, adding external state makes your grammar non-trivial.

Do we really need no state?

In contrast, you can say any parser is stateful (because it parses the current context and context is the state). Below is a good case of additional 'context' added by a programmer:

quoted_string_ = as_string [omit [char_("'\"") [_a = _1]] >> *(char_ - lit(_a)) >> lit(_a)]

qi::locals was a good example of non-external state.

There were also 'external states' which a programmer could add to his grammar, and in most cases they were a sign of doing something wrong:

func_call_ = func_name_ >> lit('(') >> eps [ref(is_inside_function_call) = true] >> ...

But still, there were some corner cases where external states were useful.

macro_type_1_ =
    lit("{{{") [PUSH_STATE(macro_ctx, Macro::Type1)] >> (
        ((any_expr_ - end_of_macro_ctx_) >> lit("}}}") >> eps [POP_STATE(macro_ctx)]) |
        (eps [POP_STATE(macro_ctx)] >> eps [_pass = false])
    )
;
macro_type_2_ =
    lit("[[[") [PUSH_STATE(macro_ctx, Macro::Type2)] >> (
        ((any_expr_ - end_of_macro_ctx_) >> lit("]]]") >> eps [POP_STATE(macro_ctx)]) |
        (eps [POP_STATE(macro_ctx)] >> eps [_pass = false])
    )
;

Above is an example of some arbitrary context-sensitive language. Here I am adding a 'context stack' by emulating a 'destructor' for the sub rule. This might be a good case for a special variant of the Nabialek Trick, where end_of_macro_ctx_ is a qi::symbols instance.

(See Boost.Spirit.Qi: dynamically create "difference" parser at parse time for possible implementation detail)

You can't use qi::locals here, because there is no guarantee about the lifetime of qi::locals. So you have to use a global variable (i.e. a member variable of your grammar class instance).

Inherited attributes? Maybe. If you are willing to pass the same variable to every single rule.

External states for grammar itself

Speaking of external states, there is even more fundamental stuff which a programmer might want to add to his grammar.

on_error<fail>(root_, phx::bind(&my_logger, &MyLogger::error, _1, _2, _3, _4));

You can't do this anymore in X3.

Statelessness of X3

X3 expects the user to define every single rule at namespace scope, as an auto const instance.

Okay, now let's take a look at the implementation of BOOST_SPIRIT_DEFINE. It is basically doing only one thing:

#define BOOST_SPIRIT_DEFINE(your_rule, <unspecified>) template <unspecified> <unspecified> parse_rule(decltype(your_rule), <unspecified>...) { <unspecified> }

The first parameter of parse_rule() has the type decltype(your_rule), i.e. the given rule's unique type.

This means two things:

  1. X3 relies entirely on an ADL call to parse_rule().
  2. parse_rule() must be defined in namespace scope.
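The dispatch described in these two points can be sketched without X3 at all. Everything below (mini::rule, mini::parse, the grammar namespace and its tags) is a made-up stand-in, not X3's real code; it only shows how an overload keyed on a rule's unique type gets picked up through ADL, and why such an overload has nowhere to hold per-instance state.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

namespace mini {
    // Every rule has its own type because of the Tag parameter,
    // much like x3::rule<ID, Attribute>.
    template <typename Tag>
    struct rule {
        char const* name;
    };

    // Generic entry point. The unqualified call below is resolved by ADL
    // at instantiation time, using the namespaces associated with Rule.
    template <typename Rule>
    bool parse(Rule const& r, std::string const& input) {
        return parse_rule(r, input);
    }
}

namespace grammar {
    struct digits_tag;
    struct word_tag;

    auto const digits = mini::rule<digits_tag>{"digits"};
    auto const word   = mini::rule<word_tag>{"word"};

    // One free function per rule type, selected purely by the type of the
    // first parameter. There is no slot here for per-instance state.
    inline bool parse_rule(decltype(digits), std::string const& in) {
        return !in.empty() && std::all_of(in.begin(), in.end(),
            [](unsigned char c) { return std::isdigit(c) != 0; });
    }

    inline bool parse_rule(decltype(word), std::string const& in) {
        return !in.empty() && std::all_of(in.begin(), in.end(),
            [](unsigned char c) { return std::isalpha(c) != 0; });
    }
}
```

A call such as mini::parse(grammar::digits, "12345") finds grammar::parse_rule only because grammar::digits_tag makes grammar an associated namespace of the rule type.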

You can't specialize a template function for an instance. There's no way of telling X3 to use any of my instance variables.

I lied. You can do this if you want:

static inline MyLogger& use_my_logger_please() {
    static MyLogger instance; return instance;
}

or

#define MY_BOOST_SPIRIT_DEFINE(my_rule, <unspecified>, my_logger_f) <unspecified>
MY_BOOST_SPIRIT_DEFINE(rule_1_, ..., std::bind([] (MyLogger& l, std::string const& msg) { l << msg; }, this->logger_instance_, std::placeholders::_1))

Really?


Solution

  • You make a number of unsubstantiated claims in your "question" article.

    I recognize much of the sentiment that shines through your rant, but I find it hard to constructively respond when there is so much debatable in it.

    New Possibilities

    X3 expects the user to define every single rule at namespace scope, as an auto const instance.

    This is simply not true. X3 doesn't do that. It could be said that X3 promotes that pattern to enable key features like

    • recursive grammars
    • separation of parsers across translation units

    On the flip side, there's not always a need for any of that.

    The very value-orientedness of X3 enables new patterns to achieve things. I'm quite fond of being able to do things like:

    Stateful Parser Factories

    auto make_parser(char delim) {
         return lexeme [ delim >> *('\\' >> char_ | ~char_(delim)) >> delim ];
    }
    

    Indeed, you might "need" x3::rule to achieve attribute coercion (like qi::transform_attribute):

    auto make_parser(char delim) {
         return rule<struct _, std::string> {} = lexeme [ delim >> *('\\' >> char_ | ~char_(delim)) >> delim ];
    }
    

    In fact, I've used this pattern to make a quick-and-dirty as<T>[] directive: Understanding the List Operator (%) in Boost.Spirit.

    auto make_parser(char delim) {
         return as<std::string> [ lexeme [ delim >> *('\\' >> char_ | ~char_(delim)) >> delim ] ];
    }
    

    Nothing prevents you from using a dynamic parser factory like that to use context from surrounding state.

    Stateful Semantic Actions

    Semantic actions are copied by value, but they can freely refer to external state. When using factory functions, they can, again, use surrounding state.

    Stateful directives

    The only way for directives to create state on the fly is to extend the actual context object. The x3::with<> directive supports this; see e.g. Boost Spirit X3 cannot compile repeat directive with variable factor.

    This can be used to pigeon-hole unlimited amounts of state, e.g. by just side-channel passing a (smart) pointer/reference to your parser state.

    Custom Parsers

    Custom parsers are a surprisingly simple way to get a lot of power in X3. See for an example:

    Spirit-Qi: How can I write a nonterminal parser?

    I personally think custom parsers are more elegant than anything like the BOOST_SPIRIT_DECLARE/_DEFINE/_INSTANTIATE dance. I admit I've never created anything requiring multi-TU parsers in pure X3 yet (I tend to use X3 for small, independent parsers), but I intuitively prefer building my own TU-separation logic building from x3::parser_base over the "blessed" macros mentioned above. See also this discussion: Design/structure X3 parser more like Qi parser

    Error/success handling

    The compiler tutorials show how to trigger handlers for specific rules using a marker base class for the rule tag type. I once figured out the mechanics, but sadly I don't remember all the details, and LiveCoding.tv seems to have lost my live stream on the topic.

    I encourage you to look at the compiler samples (they're in the source tree only).

    Summarizing

    I can see how you notice negative differences. It's important to realize that X3 is less mature and aims to be more lightweight, so some things are simply not implemented. Also note that X3 enables many things in more elegant ways than previously possible. The fact that most things interact more naturally with C++14 core language features is a big boon.

    If you want to read more about what disappoints me about X3, see the introductory discussion in that linked answer and some discussions in chat (like this one).

    I hope my counter-rant helps you on your journey learning X3. I tried to substantiate as many things as I could, though I freely admit I sometimes still prefer Qi.