Tags: c++, boost-spirit, boost-spirit-x3, boost-fusion

Parsing Selector struct with alternating tokens using Boost Spirit X3


I am trying to parse the following struct:

struct Selector {
    std::string element;
    std::string id;
    std::vector<std::string> classes;
};

This struct is used to parse selectors of the form element#id.class1.class2.classn. A selector starts with at most one element, contains at most one id, and contains zero or more classes.

It gets more complicated, though, because classes and the id can appear in any order, so the following selectors are all valid: element#id.class1, .class1#id.class2.class3, #id.class1.class2, .class1.class2#id. For this reason, I have not been able to use the hold[] or at<T>() approaches described here, and I have also not been able to use BOOST_FUSION_ADAPT_STRUCT.
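
To make the accepted inputs concrete, here is a minimal hand-written sketch of the intended result, independent of Spirit. The names SelectorSketch, read_name, and parse_selector_by_hand are hypothetical. It assumes that names are alphanumeric (the examples contain digits, as in class1) and that a second id is an error.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mirror of the struct above, for illustration only.
struct SelectorSketch {
    std::string element;
    std::string id;
    std::vector<std::string> classes;
};

// Reads a run of alphanumeric characters starting at pos
// (assumption: names are alphanumeric). Returns "" if there is none.
inline std::string read_name(std::string const& s, std::size_t& pos) {
    std::size_t const start = pos;
    while (pos < s.size() && std::isalnum(static_cast<unsigned char>(s[pos]))) ++pos;
    return s.substr(start, pos - start);
}

// Fills out and returns true if s is a well-formed selector, false otherwise.
inline bool parse_selector_by_hand(std::string const& s, SelectorSketch& out) {
    std::size_t pos = 0;
    out = SelectorSketch{};
    out.element = read_name(s, pos);               // zero or one leading element
    bool seen_id = false;
    while (pos < s.size()) {
        char const tag = s[pos++];
        std::string const name = read_name(s, pos);
        if (name.empty()) return false;            // '#' or '.' must be followed by a name
        if (tag == '#') {
            if (seen_id) return false;             // assumption: a second id is an error
            out.id = name;
            seen_id = true;
        } else if (tag == '.') {
            out.classes.push_back(name);           // any number of classes, in any position
        } else {
            return false;                          // unexpected character
        }
    }
    return true;
}
```

Whatever the order of the tokens, the id and the classes end up in the same fields, which is the behaviour the Spirit grammar has to reproduce.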

The only way I have been able to synthesize this struct is with the following rules:

auto element = [](auto& ctx){x3::_val(ctx).element = x3::_attr(ctx);};
auto id = [](auto& ctx){x3::_val(ctx).id = x3::_attr(ctx);};
auto empty = [](auto& ctx){x3::_val(ctx) = "";};
auto classes = [](auto& ctx){x3::_val(ctx).classes.insert(x3::_val(ctx).classes.end(), x3::_attr(ctx).begin(), x3::_attr(ctx).end());};

auto elementRule = x3::rule<class EmptyIdClass, std::string>() = +x3::char_("a-zA-Z") | x3::attr("");
auto idRule = x3::rule<class EmptyIdClass, std::string>() = ("#" >> +x3::char_("a-zA-Z")) | x3::attr("");
auto classesRule = x3::rule<class ClassesClass, std::vector<std::string>>() = *("." >> +x3::char_("a-zA-Z"));
auto selectorRule = x3::rule<class TestClass, Selector>() = elementRule[element] >> classesRule[classes] >> idRule[id] >> classesRule[classes];

What would be the best way to parse this struct? Is it possible to synthesize this selector struct naturally, using BOOST_FUSION_ADAPT_STRUCT, and without semantic actions?

It seems like every time I think I am getting the hang of Spirit X3, I stumble upon a new challenge. In this particular case, I learned about issues with backtracking, about an issue with at<T>() that was introduced in Boost 1.70, and that hold[] is not supported by X3.


Solution

  • I've written similar answers before.

    I don't think you can fusion-adapt this struct directly, although if you are very motivated (e.g. you already have the adapted structs) you could build some generic helpers on top of that.

    To be fair, a little restructuring of your code already gets you something pretty nice. Here's my effort to make it more elegant and convenient. I'll introduce a helper macro in the style of BOOST_FUSION_ADAPT_XXX, but one that does not require Boost Fusion.

    Let's Start With The AST

    As always, I like to start with the basics. Understanding the goal is half the battle:

    namespace Ast {
        using boost::optional;
    
        struct Selector {
            // These selectors always 
            //  - start with 1 or no elements, 
            //  - could contain 1 or no ids, and
            //  - could contain 0 to n classes.
            optional<std::string> element;
            optional<std::string> id;
            std::vector<std::string> classes;
    
            friend std::ostream& operator<<(std::ostream& os, Selector const&s) {
                if  (s.element.has_value()) os << s.element.value();
                if  (s.id.has_value())      os << "#" << s.id.value();
                for (auto& c : s.classes)   os << "." << c;
                return os;
            }
        };
    }
    

    Note that I made element and id optional, to reflect that either of them may be absent from a selector.

    You could use this to detect repeat-initialization of element/id fields.

    Magic Sauce (see below)

    #include "propagate.hpp"
    DEF_PROPAGATOR(Selector, id, element, classes)
    

    We'll dig into this later. Suffice it to say it generates the semantic actions that you had to tediously write.

    Main dish

    Now, we can simplify the parser rules a lot, and run the tests:

    int main() {
        auto name        = as<std::string>[x3::alpha >> *x3::alnum];
        auto idRule      = "#" >> name;
        auto classesRule = +("." >> name);
    
        auto selectorRule
            = x3::rule<class TestClass, Ast::Selector>{"selectorRule"}
            = +( name        [ Selector.element ]
               | idRule      [ Selector.id ]
               | classesRule [ Selector.classes ]
               )
            ;
    
        for (std::string const& input : {
                "element#id.class1.class2.classn",
                "element#id.class1",
                ".class1#id.class2.class3",
                "#id.class1.class2",
                ".class1.class2#id",
            })
        {
            Ast::Selector sel;
            std::cout << std::quoted(input) << " -->\n";
            if (x3::parse(begin(input), end(input), selectorRule >> x3::eoi, sel)) {
                std::cout << "\tSuccess: " << sel << "\n";
            } else {
                std::cout << "\tFailed\n";
            }
        }
    }
    

    See it Live On Wandbox, printing:

    "element#id.class1.class2.classn" -->
        Success: element#id.class1.class2.classn
    "element#id.class1" -->
        Success: element#id.class1
    ".class1#id.class2.class3" -->
        Success: #id.class1.class2.class3
    "#id.class1.class2" -->
        Success: #id.class1.class2
    ".class1.class2#id" -->
        Success: #id.class1.class2
    

    The Magic

    Now, how did I generate those actions? Using a little bit of Boost Preprocessor:

    #define MEM_PROPAGATOR(_, T, member) \
        Propagators::Prop<decltype(std::mem_fn(&T::member))> member { std::mem_fn(&T::member) };
    
    #define DEF_PROPAGATOR(type, ...) \
        struct type##S { \
            using T = Ast::type; \
            BOOST_PP_SEQ_FOR_EACH(MEM_PROPAGATOR, T, BOOST_PP_VARIADIC_TO_SEQ(__VA_ARGS__)) \
        } static const type {};
    

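    Note that the macro defines a struct named after the Ast type with an S suffix (here SelectorS) and a static const variable of that struct named exactly like the Ast type (here Selector), with one member per listed field.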

    You're free to call this macro in another namespace, say namespace Actions { }
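
To see why std::mem_fn is enough here: for a pointer to a data member, calling the wrapper on an object returns a reference to that member, so it can be assigned through. The following is a rough, Spirit-free sketch of the shape the macro generates. Setter is a hypothetical stand-in for Propagators::Prop, and the namespace MemFnSketch and the variable name Actions are likewise made up for illustration.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace MemFnSketch {
    struct Selector {
        std::string element;
        std::string id;
        std::vector<std::string> classes;
    };

    // Hypothetical stand-in for Propagators::Prop: stores the member accessor
    // and assigns a value through the reference it returns.
    template <typename F>
    struct Setter {
        F f;
        template <typename Obj, typename Value>
        void operator()(Obj& obj, Value&& value) const {
            f(obj) = std::forward<Value>(value);   // f(obj) is an lvalue reference to the member
        }
    };

    // Roughly what the macro writes out by hand: one accessor object per listed member.
    struct SelectorS {
        Setter<decltype(std::mem_fn(&Selector::id))>      id      { std::mem_fn(&Selector::id) };
        Setter<decltype(std::mem_fn(&Selector::element))> element { std::mem_fn(&Selector::element) };
    };
    static const SelectorS Actions {};
}
```

A call such as MemFnSketch::Actions.id(sel, "main") then writes to sel.id. The real Prop does the same, except that it takes the object and the value from the X3 context.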

    The real magic is Propagators::Prop<F> which has a bit of dispatch to allow for container attributes and members. Otherwise it just relays to x3::traits::move_to:

    namespace Propagators {
        template <typename F>
        struct Prop {
            F f;
            template <typename Ctx>
            auto operator()(Ctx& ctx) const {
                return dispatch(x3::_attr(ctx), f(x3::_val(ctx)));
            }
          private:
            template <typename Attr, typename Dest>
            static inline void dispatch(Attr& attr, Dest& dest) {
                call(attr, dest, is_container(attr), is_container(dest));
            }
    
            template <typename T>
            static auto is_container(T const&)           { return x3::traits::is_container<T>{}; }
            static auto is_container(std::string const&) { return boost::mpl::false_{}; }
    
            // tags for dispatch
            using attr_is_container = boost::mpl::true_;
            using attr_is_scalar    = boost::mpl::false_;
            using dest_is_container = boost::mpl::true_;
            using dest_is_scalar    = boost::mpl::false_;
    
            template <typename Attr, typename Dest>
            static inline void call(Attr& attr, Dest& dest, attr_is_scalar, dest_is_scalar) {
                x3::traits::move_to(attr, dest);
            }
            template <typename Attr, typename Dest>
            static inline void call(Attr& attr, Dest& dest, attr_is_scalar, dest_is_container) {
                dest.insert(dest.end(), attr);
            }
            template <typename Attr, typename Dest>
            static inline void call(Attr& attr, Dest& dest, attr_is_container, dest_is_container) {
                dest.insert(dest.end(), attr.begin(), attr.end());
            }
        };
    }
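
The dispatch can be tried in isolation, without Spirit. In the sketch below, the namespace DispatchSketch, the trait is_container, and the function propagate are hypothetical simplifications: only std::vector counts as a container, and plain assignment stands in for x3::traits::move_to.

```cpp
#include <string>
#include <type_traits>
#include <vector>

namespace DispatchSketch {
    // Hypothetical trait: only std::vector counts as a container here, so
    // std::string is deliberately treated as a scalar value.
    template <typename T> struct is_container : std::false_type {};
    template <typename E> struct is_container<std::vector<E>> : std::true_type {};

    // scalar attribute -> scalar destination: plain assignment
    template <typename Attr, typename Dest>
    void call(Attr const& attr, Dest& dest, std::false_type, std::false_type) {
        dest = attr;
    }
    // scalar attribute -> container destination: append one element
    template <typename Attr, typename Dest>
    void call(Attr const& attr, Dest& dest, std::false_type, std::true_type) {
        dest.insert(dest.end(), attr);
    }
    // container attribute -> container destination: append every element
    template <typename Attr, typename Dest>
    void call(Attr const& attr, Dest& dest, std::true_type, std::true_type) {
        dest.insert(dest.end(), attr.begin(), attr.end());
    }

    // Picks one of the overloads above at compile time via the two tag arguments.
    template <typename Attr, typename Dest>
    void propagate(Attr const& attr, Dest& dest) {
        call(attr, dest, is_container<Attr>{}, is_container<Dest>{});
    }
}
```

As in the helper above, there is intentionally no overload for a container attribute going into a scalar destination, so that combination fails to compile instead of silently dropping data.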
    

    BONUS

    A lot of the complexity in the propagator type is from handling container attributes. However, you don't actually need any of that:

    auto name = as<std::string>[x3::alpha >> *x3::alnum];
    
    auto selectorRule
        = x3::rule<class selector_, Ast::Selector>{"selectorRule"}
        = +( name        [ Selector.element ]
           | '#' >> name [ Selector.id ]
           | '.' >> name [ Selector.classes ]
           )
        ;
    

    Is more than enough, and the propagation helper can be simplified to:

    namespace Propagators {
        template <typename F> struct Prop {
            F f;
            template <typename Ctx>
            auto operator()(Ctx& ctx) const {
                return call(x3::_attr(ctx), f(x3::_val(ctx)));
            }
          private:
            template <typename Attr, typename Dest>
            static inline void call(Attr& attr, Dest& dest) {
                x3::traits::move_to(attr, dest);
            }
            template <typename Attr, typename Elem>
            static inline void call(Attr& attr, std::vector<Elem>& dest) {
                dest.insert(dest.end(), attr);
            }
        };
    }
    

    As you can see, dropping the tag dispatch makes the helper considerably shorter.

    See the simplified version Live On Wandbox again.

    FULL LISTING

    For posterity on this site:

    • test.cpp

      //#define BOOST_SPIRIT_X3_DEBUG
      #include <boost/spirit/home/x3.hpp>
      #include <iostream>
      #include <iomanip>
      
      namespace x3 = boost::spirit::x3;
      
      namespace Ast {
          using boost::optional;
      
          struct Selector {
              // These selectors always 
              //  - start with 1 or no elements, 
              //  - could contain 1 or no ids, and
              //  - could contain 0 to n classes.
              optional<std::string> element;
              optional<std::string> id;
              std::vector<std::string> classes;
      
              friend std::ostream& operator<<(std::ostream& os, Selector const&s) {
                  if  (s.element.has_value()) os << s.element.value();
                  if  (s.id.has_value())      os << "#" << s.id.value();
                  for (auto& c : s.classes)   os << "." << c;
                  return os;
              }
          };
      }
      
      #include "propagate.hpp"
      DEF_PROPAGATOR(Selector, id, element, classes)
      
      #include "as.hpp"
      int main() {
          auto name = as<std::string>[x3::alpha >> *x3::alnum];
      
          auto selectorRule
              = x3::rule<class selector_, Ast::Selector>{"selectorRule"}
              = +( name        [ Selector.element ]
                 | '#' >> name [ Selector.id ]
                 | '.' >> name [ Selector.classes ]
                 )
              ;
      
          for (std::string const& input : {
                  "element#id.class1.class2.classn",
                  "element#id.class1",
                  ".class1#id.class2.class3",
                  "#id.class1.class2",
                  ".class1.class2#id",
              })
          {
              Ast::Selector sel;
              std::cout << std::quoted(input) << " -->\n";
              if (x3::parse(begin(input), end(input), selectorRule >> x3::eoi, sel)) {
                  std::cout << "\tSuccess: " << sel << "\n";
              } else {
                  std::cout << "\tFailed\n";
              }
          }
      }
      
    • propagate.hpp

      #pragma once
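      #include <boost/preprocessor/cat.hpp>
      #include <boost/preprocessor/seq/for_each.hpp>
      #include <boost/preprocessor/variadic/to_seq.hpp> // BOOST_PP_VARIADIC_TO_SEQ
      #include <boost/spirit/home/x3.hpp>               // x3::_attr, x3::_val, x3::traits::move_to
      #include <functional>
      #include <vector>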
      
      namespace Propagators {
          template <typename F> struct Prop {
              F f;
              template <typename Ctx>
              auto operator()(Ctx& ctx) const {
                  return call(x3::_attr(ctx), f(x3::_val(ctx)));
              }
            private:
              template <typename Attr, typename Dest>
              static inline void call(Attr& attr, Dest& dest) {
                  x3::traits::move_to(attr, dest);
              }
              template <typename Attr, typename Elem>
              static inline void call(Attr& attr, std::vector<Elem>& dest) {
                  dest.insert(dest.end(), attr);
              }
          };
      }
      
      #define MEM_PROPAGATOR(_, T, member) \
          Propagators::Prop<decltype(std::mem_fn(&T::member))> member { std::mem_fn(&T::member) };
      
      #define DEF_PROPAGATOR(type, ...) \
          struct type##S { \
              using T = Ast::type; \
              BOOST_PP_SEQ_FOR_EACH(MEM_PROPAGATOR, T, BOOST_PP_VARIADIC_TO_SEQ(__VA_ARGS__)) \
          } static const type {};
      
    • as.hpp

      #pragma once
      #include <boost/spirit/home/x3.hpp>
      
      namespace {
          template <typename T>
          struct as_type {
              template <typename...> struct tag{};
              template <typename P>
              auto operator[](P p) const {
                  return boost::spirit::x3::rule<tag<T,P>, T> {"as"}
                         = p;
              }
          };
      
          template <typename T>
              static inline const as_type<T> as = {};
      }