c++ boost boost-spirit boost-spirit-qi boost-phoenix

Boost.Spirit.Qi: dynamically create "difference" parser at parse time


A "difference" parser can be created with the binary - (minus) operator:

rule = qi::char_ - qi::lit("}}")

or even compound differences:

rule = qi::char_ - qi::lit("}}") - qi::lit("]]")

But how can I build the whole difference parser dynamically, at parse time?
I'm guessing it might look something like this:

phoenix::function<difference_parser_impl> difference_parser;
rule = qi::lazy(difference_parser(qi::char_, {"}}", "]]"}));

Here, the {..., ..., ...} part would actually be an STL container, but that is not the point; I can handle that part.

I have found the template qi::difference<Left, Right>, but I couldn't figure out how to use it.


Solution

  • It seems to me you're not so much looking for a dynamic "difference" expression as for a dynamic "variadic alternative (a|b|c...)" expression:

    expr - a - b - c is equivalent to expr - (a|b|c)

    You could then easily achieve the difference using either:

    expr - orCombine(alternatives)
    

    or

    !orCombine(alternatives) >> expr
    

    Now, getting this done has many rough edges, which I'll explain first. Luckily, there is a simpler way, using qi::symbols, which I'll demonstrate right after that.

    The tricky stuff

    If you want, you can "generate" alternative parser expressions on-demand, with a fair bit of wizardry. I showed how to do this in this answer:

    But

    1. it is fraught with pitfalls (as proto expressions don't lend themselves well to copying) [1]
    2. it conveniently used variadics in order to avoid intermediate storage (note the deepcopy_ to ward off Undefined Behaviour):

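      template<typename ...Expr>
      void parse_one_of(Expr& ...expressions)
      {
          // Fold all expressions into one alternative, starting from a
          // parser that never matches; deepcopy_ is the helper from that answer.
          auto parser = boost::fusion::fold(
                      boost::tie(expressions...),
                      qi::eps(false),
                      deepcopy_(arg2 | arg1)
                  );
          // ... (the rest of the function is omitted here)
      }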
      

      Seeing how you have a need for truly dynamic composition of the alternative parser, I don't see how this could be adapted to your needs without an explosion of complexity and opportunity for subtle error (believe me, I already tried).

    So, instead I recommend a tried & true approach that "abuses" an existing "dynamic" parser:

    Simplify using qi::symbols

    This idea borrows loosely from the well-known "Nabialek Trick". It uses qi::symbols, and consequently has excellent runtime performance characteristics [2].

    Without further ado, this is an example of how you could use it, starting from a vector of string literals:

    template <typename It, typename Skipper = qi::space_type>
        struct parser : qi::grammar<It, std::string(), Skipper>
    {
        parser() : parser::base_type(start)
        {
            static const std::vector<std::string> not_accepted { "}}", "]]" };
    
            using namespace qi;
            exclude = exclusions(not_accepted);
            start = *(char_ - exclude);
    
            BOOST_SPIRIT_DEBUG_NODE(start);
        }
    
      private:
        qi::rule<It, std::string(), Skipper> start;
    
        typedef qi::symbols<char, qi::unused_type> Exclude;
        Exclude exclude;
    
        template<typename Elements>
        Exclude exclusions(Elements const& elements) {
            Exclude result;
    
            for(auto& el : elements)
                result.add(el);
    
            return result;
        }
    };
    

    A full working sample of this is here: http://coliru.stacked-crooked.com/view?id=ddbb2549674bfed90e3c8df33b048574-7616891f9fd25da6391c2728423de797 and it prints

    parse success
    data: 123
    trailing unparsed: ']] 4'
    

    Full code

    For future reference:

    #include <boost/spirit/include/qi.hpp>
    #include <iostream>
    #include <string>
    #include <vector>
    
    namespace qi    = boost::spirit::qi;
    
    template <typename It, typename Skipper = qi::space_type>
        struct parser : qi::grammar<It, std::string(), Skipper>
    {
        parser() : parser::base_type(start)
        {
            static const std::vector<std::string> not_accepted { "}}", "]]" };
    
            using namespace qi;
            exclude = exclusions(not_accepted);
            start = *(char_ - exclude);
    
            BOOST_SPIRIT_DEBUG_NODE(start);
        }
    
      private:
        qi::rule<It, std::string(), Skipper> start;
    
        typedef qi::symbols<char, qi::unused_type> Exclude;
        Exclude exclude;
    
        template<typename Elements>
        Exclude exclusions(Elements const& elements) {
            Exclude result;
    
            for(auto& el : elements)
                result.add(el);
    
            return result;
        }
    };
    
    int main()
    {
        const std::string input = "1 2 3]] 4";
        typedef std::string::const_iterator It;
        It f(begin(input)), l(end(input));
    
        parser<It> p;
        std::string data;
    
        bool ok = qi::phrase_parse(f,l,p,qi::space,data);
        if (ok)   
        {
            std::cout << "parse success\n";
            std::cout << "data: " << data << "\n";
        }
        else std::cerr << "parse failed: '" << std::string(f,l) << "'\n";
    
        if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";
    }
    

    [1] I believe this problem is about to be removed in the upcoming new version of Spirit (currently dubbed "Spirit X3" for the experimental version).

    [2] It uses a trie (by default a ternary search tree) to look up the matches.
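
To illustrate why a trie suits this kind of lookup, here is a small self-contained sketch. It is not Spirit's implementation: the Trie type and its members are invented for this example, and it uses a plain map-based trie for brevity. The point is that one left-to-right walk over the input checks all stored strings at once, instead of trying each alternative in turn.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Illustrative trie (hypothetical type, not taken from Spirit).
struct Trie {
    struct Node {
        bool terminal = false;                       // a stored word ends here
        std::map<char, std::unique_ptr<Node>> next;  // child per character
    };
    Node root;

    // Insert `word`, creating nodes along its path as needed.
    void add(std::string const& word) {
        Node* n = &root;
        for (char c : word) {
            std::unique_ptr<Node>& child = n->next[c];
            if (!child) child.reset(new Node());
            n = child.get();
        }
        n->terminal = true;
    }

    // Length of the longest stored word that starts at text[pos],
    // or 0 if no stored word starts there.
    std::size_t longest_match(std::string const& text, std::size_t pos = 0) const {
        Node const* n = &root;
        std::size_t best = 0;
        for (std::size_t i = pos; i < text.size(); ++i) {
            auto it = n->next.find(text[i]);
            if (it == n->next.end()) break;
            n = it->second.get();
            if (n->terminal) best = i - pos + 1;
        }
        return best;
    }
};
```

With "}}" and "]]" stored, longest_match reports 2 at a position where either string starts and 0 elsewhere, which is the question the difference parser asks at every input position.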