c++ boost boost-spirit boost-spirit-x3

Skippers in Boost.Spirit.X3


I'm trying to write a parser for a language with a slightly weird syntax, and I stumbled upon a problem with skippers that makes me think I do not fully understand how they work in Boost.Spirit.X3.

The problem is that for some rules EOLs are meaningful (i.e. I have to match the end of the line to be sure the statement is correct), while for others they are not (so they can be skipped).

As a result, I decided to use the following definition of the skipper for my root rule:

namespace x3 = boost::spirit::x3;
namespace ch = x3::standard;

using ch::blank;
using x3::eol;

auto const skipper = comment | blank;

where comment just skips comments obviously. In other words, I preserve EOLs in the input stream.

Now, for another rule, I'd like to use a definition like this:

auto const writable_property_declaration_def =
    skip(skipper | eol)
    [
        lit("#")
        > property_type
        > property_id
    ];

The rule itself is part of yet another rule, which is instantiated as follows:

BOOST_SPIRIT_INSTANTIATE(property_declaration_type, iterator_type, context_type);

where

using skipper_type = decltype(skipper);

using iterator_type = std::string::const_iterator;
using phrase_context_type = x3::phrase_parse_context<skipper_type>::type;
using error_handler_type = x3::error_handler<iterator_type>;
using context_type = x3::context<x3::error_handler_tag, std::reference_wrapper<error_handler_type>, phrase_context_type>;

And that does not seem to work: the EOLs are not skipped.

Now, my questions are the following:

  • What's the connection between boost::spirit::x3::phrase_parse_context and the particular skipper I use?
  • And how does skip(p)[a] actually work?
  • Is it possible to somehow define the underlying rule in such a way that it uses another skipper, so that X3 handles all the EOLs on its own and I don't need to do it manually?

Looking forward to your reply(-ies)! :)


Solution

  • You didn't actually show all the declarations, so it's not completely clear what your setup looks like. So let me mock up something quick:

    Live On Wandbox

    #define BOOST_SPIRIT_X3_DEBUG
    #include <iomanip>
    #include <iostream>
    #include <string>
    #include <boost/spirit/home/x3.hpp>
    
    namespace x3 = boost::spirit::x3;
    namespace P {
        using namespace x3;
        static auto const comment = lexeme [ 
                "/*" >> *(char_ - "*/") >> "*/"
              | "//" >> *~char_("\r\n") >> eol
            ];
    
        static auto const skipper = comment | blank;
    
        static auto const property_type = lexeme["type"];
        static auto const property_id = lexeme["id"];
    
        auto const demo =
            skip(skipper | eol) [
                lit("#")
                > property_type
                > property_id
            ];
    }
    
    int main() {
        for (std::string const input : {
                "#type id",
                "#type\nid",
            })
        {
            std::cout << "==== " << std::quoted(input) << " ====" << std::endl;
            auto f = begin(input), l = end(input);
            if (parse(f, l, P::demo)) {
                std::cout << "Parsed successfully" << std::endl;
            } else {
                std::cout << "Failed" << std::endl;
            }
    
            if (f!=l) {
                std::cout << "Remaining input unparsed: " << std::quoted(std::string(f,l)) << std::endl;
            }
        }
    }
    

    As you can see, there is no problem as long as no rule declarations are involved:

    ==== "#type id" ====
    Parsed successfully
    ==== "#type
    id" ====
    Parsed successfully
    

    Let's zoom in from here

    static auto const demo_def =
        skip(skipper | eol) [
            lit("#")
            > property_type
            > property_id
        ];
    
    static auto const demo = x3::rule<struct demo_> {"demo"} = demo_def;
    

    Still OK: Live On Wandbox

    <demo>
      <try>#type id</try>
      <success></success>
    </demo>
    <demo>
      <try>#type\nid</try>
      <success></success>
    </demo>
    Parsed successfully
    ==== "#type
    id" ====
    Parsed successfully
    

    So, we know that x3::rule<> itself is not the issue. It's going to be about the static dispatch based on the tag type (a.k.a. the rule ID, I think; in this case struct demo_).

    Doing the straightforward thing:

    static auto const demo_def =
        skip(skipper | eol) [
            lit("#")
            > property_type
            > property_id
        ];
    
    static auto const demo = x3::rule<struct demo_> {"demo"};
    
    BOOST_SPIRIT_DEFINE(demo)
    

    Still OK: Live On Wandbox

    Hmm, what else could be wrong? Maybe there are conflicting skipper contexts? Replacing

        if (parse(f, l, P::demo)) {
    

    with

        if (phrase_parse(f, l, P::demo, P::skipper)) {
    

    Still OK: Live On Wandbox

    So, that's not it either. Ok, let's try the separate instantiation:

    Separate Compilation

    Live On Wandbox

    • rule.h

      #pragma once
      #define BOOST_SPIRIT_X3_DEBUG
      #include <boost/spirit/home/x3.hpp>
      #include <boost/spirit/home/x3/support/utility/error_reporting.hpp>
      
      namespace x3 = boost::spirit::x3;
      namespace P {
          using namespace x3;
          static auto const comment = lexeme [ 
                  "/*" >> *(char_ - "*/") >> "*/"
                | "//" >> *~char_("\r\n") >> eol
              ];
      
          static auto const skipper = comment | blank;
      
          using demo_type = x3::rule<struct demo_>;
          extern demo_type const demo;
      
          BOOST_SPIRIT_DECLARE(demo_type)
      }
      
    • rule.cpp

      #include "rule.h"
      #include <iostream>
      #include <iomanip>
      
      namespace P {
          using namespace x3;
      
          static auto const property_type = lexeme["type"];
          static auto const property_id = lexeme["id"];
      
          static auto const demo_def =
              skip(skipper | eol) [
                  lit("#")
                  > property_type
                  > property_id
              ];
      
          struct demo_ {
              template<typename It, typename Ctx>
                  x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const&) const {
                      std::string s(f,l);
                      auto pos = std::distance(f, ef.where());
      
                      std::cout << "Expecting " << ef.which() << " at "
                          << "\n\t" << s
                          << "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
      
                      return error_handler_result::fail;
                  }
          };
      
          demo_type const demo {"demo"};
          BOOST_SPIRIT_DEFINE(demo)
      
          // for non-skipper invocation (x3::parse)
          using iterator_type = std::string::const_iterator;
          BOOST_SPIRIT_INSTANTIATE(demo_type, iterator_type, x3::unused_type)
      
          // for skipper invocation (x3::phrase_parse)
          using skipper_type = decltype(skipper);
          using phrase_context_type = x3::phrase_parse_context<skipper_type>::type;
          BOOST_SPIRIT_INSTANTIATE(demo_type, iterator_type, phrase_context_type)
      }
      
    • test.cpp

      #include "rule.h"
      #include <iostream>
      #include <iomanip>
      
      int main() {
          std::cout << std::boolalpha;
          for (std::string const input : {
                  "#type id",
                  "#type\nid",
              })
          {
              std::cout << "\n==== " << std::quoted(input) << " ====" << std::endl;
      
              {
                  auto f = begin(input), l = end(input);
                  std::cout << "With top-level skipper: " << phrase_parse(f, l, P::demo, P::skipper) << std::endl;
      
                  if (f!=l) {
                      std::cout << "Remaining unparsed: " << std::quoted(std::string(f,l)) << std::endl;
                  }
              }
              {
                  auto f = begin(input), l = end(input);
                  std::cout << "Without top-level skipper: " << parse(f, l, P::demo) << std::endl;
      
                  if (f!=l) {
                      std::cout << "Remaining unparsed: " << std::quoted(std::string(f,l)) << std::endl;
                  }
              }
          }
      }
      

    Prints the expected:

    ==== "#type id" ====
    With top-level skipper: <demo>
      <try>#type id</try>
      <success></success>
    </demo>
    true
    Without top-level skipper: <demo>
      <try>#type id</try>
      <success></success>
    </demo>
    true
    
    ==== "#type
    id" ====
    With top-level skipper: <demo>
      <try>#type\nid</try>
      <success></success>
    </demo>
    true
    Without top-level skipper: <demo>
      <try>#type\nid</try>
      <success></success>
    </demo>
    true
    

    Or, without debug enabled:

    ==== "#type id" ====
    With top-level skipper: true
    Without top-level skipper: true
    
    ==== "#type
    id" ====
    With top-level skipper: true
    Without top-level skipper: true
    

    FINAL THOUGHTS

    Sadly, perhaps, I cannot reproduce the symptom you describe. However, I hope some of the steps above clarify how separate linkage of rule definitions actually works with respect to the skipper/contexts.

    If your situation is actually more complicated, I can think of only one other respect in which X3 may behave differently from Qi. In Qi, a rule statically declared its skipper. In X3, the skipper comes strictly from the context (and the only way a rule can limit the set of supported skippers is by separating instantiation and hiding the definition in a separate TU).

    This means that it is easy to accidentally inherit an overridden skipper. This can be counter-intuitive in e.g. nested rules. I'd suggest not relying on inherited skipper contexts at all if you have different skippers.