How do you get a string out of a Boost Spirit X3 lexeme parser?


What is the simplest way to make a semantic action that extracts a string from a typical identifier parser based on boost::spirit::x3::lexeme?

I thought it might be possible to skip unpacking the attribute and just use iterators into the input stream, but apparently x3::_where does not do what I think it does.

The following leaves output empty. I expected it to contain "foobar_hello".

namespace x3 = boost::spirit::x3;

using x3::_where;
using x3::lexeme;
using x3::alpha;

auto ctx_to_string = [&](auto& ctx) {
    _val(ctx) = std::string(_where(ctx).begin(), _where(ctx).end());
};

x3::rule<class identifier_rule_, std::string> const identifier_rule = "identifier_rule";
auto const identifier_rule_def = lexeme[(x3::alpha | '_') >> *(x3::alnum | '_')][ctx_to_string];
BOOST_SPIRIT_DEFINE(identifier_rule);

int main()
{
    std::string input = "foobar_hello";

    std::string output;
    auto result = x3::parse(input.begin(), input.end(), identifier_rule, output);
}

Do I need to somehow extract the string from the boost::fusion objects in x3::_attr(ctx), or am I doing something wrong?


Solution

  • You can simply rely on automatic attribute propagation, which means you don't need the semantic action at all (1).

    Live On Coliru

    #include <iostream>
    #include <iomanip>
    #define BOOST_SPIRIT_X3_DEBUG
    #include <boost/spirit/home/x3.hpp>
    namespace x3 = boost::spirit::x3;
    
    namespace P {
        x3::rule<class identifier_rule_, std::string> const identifier_rule = "identifier_rule";
        auto const identifier_rule_def = x3::lexeme[(x3::alpha | x3::char_('_')) >> *(x3::alnum | x3::char_('_'))];
        BOOST_SPIRIT_DEFINE(identifier_rule)
    }
    
    int main() {
        std::string const input = "foobar_hello";
    
        std::string output;
        auto result = x3::parse(input.begin(), input.end(), P::identifier_rule, output);
    }
    

    Prints

    <identifier_rule>
      <try>foobar_hello</try>
      <success></success>
      <attributes>[f, o, o, b, a, r, _, h, e, l, l, o]</attributes>
    </identifier_rule>
    

    Note that I changed '_' to x3::char_('_') so that the underscores are captured: a bare '_' literal behaves like x3::lit, which matches its input but exposes no attribute.

    If you insist on semantic actions,

    • consider using rule<..., std::string, true> to force automatic attribute propagation even when an action is present
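    • don't assume _where points to what you hope (it appears to cover the input remaining after the match rather than the match itself, which would explain the empty output): http://coliru.stacked-crooked.com/a/336c057dabc86a84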
    • use x3::raw[] to expose a controlled source iterator range (http://coliru.stacked-crooked.com/a/80a69ae9b99a4c61)

      auto ctx_to_string = [](auto& ctx) {
          std::cout << "\nSA: '" << _attr(ctx) << "'" << std::endl;
          _val(ctx) = std::string(_attr(ctx).begin(), _attr(ctx).end());
      };
      
      x3::rule<class identifier_rule_, std::string> const identifier_rule = "identifier_rule";
      auto const identifier_rule_def = x3::raw[ lexeme[(x3::alpha | '_') >> *(x3::alnum | '_')] ] [ctx_to_string];
      BOOST_SPIRIT_DEFINE(identifier_rule)
      

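      Note that with x3::raw[] it no longer matters whether you write '_' or x3::char_('_'), because the attribute is the matched source range rather than the characters the subparsers expose.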

    • consider using the built-in attribute helpers: http://coliru.stacked-crooked.com/a/3e3861330494e7c9

      auto ctx_to_string = [](auto& ctx) {
          using x3::traits::move_to;
          move_to(_attr(ctx), _val(ctx));
      };
      

      Note how this approximates the built-in attribute propagation, though it is much less flexible than letting Spirit manage it.

    (1) mandatory link: Boost Spirit: "Semantic actions are evil"?