
Boost Spirit, obtain iterator inside semantic action


Within a semantic action, I want to get the iterator, preferably the entire iterator range from the first to the last parsed character. With the raw directive I could simply get it via _attr(context). I guessed that _where(context) would do the same, but it only returns an empty range whose begin iterator points to the character after the parsed substring.

Sample code:

#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <string>
#include <utility>

namespace x3 = boost::spirit::x3;

int main()
{
    const auto action = [](auto &ctx)
    {
        auto range = x3::_where(ctx);
        std::cout << range.size() << '\n';
        std::cout << "range start: " << static_cast<const void*>(&*range.begin()) << '\n';
    };

    const auto rule = x3::int_[action];

    const std::string input = "432";
    std::cout << "string start: " << static_cast<const void*>(input.data()) << '\n';

    int output;
    x3::phrase_parse(input.begin(), input.end(), rule, x3::space, output);
    std::cout << output << '\n';
}

Output

string start: 0x7ffd65f337c0
0
range start: 0x7ffd65f337c3
432

The length of the range is 0, and its begin() points to the end of the string. When I extend the input string, the range covers the remaining unparsed substring.

How can I get the iterator range that contains the parsed substring?


Solution

  • Ah, seeing your code made me remember what I did in the past.

    Basically, you can

    1. use on_success handling on an x3::rule<>, which gives you the matched iterator range. For example:


      #include <boost/spirit/home/x3.hpp>
      #include <iostream>
      #include <utility>
      #include <iomanip>
      #include <string>
      
      namespace x3 = boost::spirit::x3;
      
      namespace {
          struct ehbase {
              template <typename It, typename Attr, typename Ctx>
                  void on_success(It& f, It const& l, Attr const& attr, Ctx const& /*ctx*/) const {
                      std::cout << "on_succes: " << std::quoted(std::string(f, l)) << " -> " << attr << "\n";
                  }
          };
      
          struct rule_type : ehbase {};
      }
      
      int main() {
          const auto rule = x3::rule<rule_type, int>{"rule"} = x3::int_;
      
          for (std::string const input : { "q", "432", " 646 q" }) {
              std::cout << "== " << std::quoted(input) << " ==\n";
              auto f = begin(input), l = end(input);
              int output;
              if (x3::phrase_parse(f, l, rule, x3::space, output))
                  std::cout << "Parsed " << output << "\n";
              else
                  std::cout << "Parse failed\n";
      
              if (f!=l)
                  std::cout << "Remaining: " << std::quoted(std::string(f,l)) << "\n";
          }
      }
      

      Prints

      == "q" ==
      Parse failed
      Remaining: "q"
      == "432" ==
      on_succes: "432" -> 432
      Parsed 432
      == " 646 q" ==
      on_succes: "646" -> 646
      Parsed 646
      Remaining: "q"
      

      On a slight tangent, you can add error-handling in the same vein:

      template <typename It, typename Ctx>
      x3::error_handler_result on_error(It f, It l, x3::expectation_failure<It> const& e, Ctx const& /*ctx*/) const {
          std::cout << std::string(f,l) << "\n"
                    << std::setw(1+std::distance(f, e.where())) << "^"
                    << "-- expected: " << e.which() << "\n";
          return x3::error_handler_result::fail;
      }
      

      If you have an expectation point in the parser:

      const auto rule = x3::rule<rule_type, int>{"rule"} = x3::int_ > x3::eoi;
      

      It now prints:

      == " 646 q" ==
       646 q
           ^-- expected: eoi
      Parse failed
      Remaining: "646 q"
      
    2. You can use the x3::raw[] directive to expose an iterator range as the attribute:


      #include <boost/spirit/home/x3.hpp>
      #include <iostream>
      #include <utility>
      #include <iomanip>
      #include <string>
      
      namespace x3 = boost::spirit::x3;
      
      int main() {
          for (std::string const input : { "q", "432", " 646 q" }) {
              std::cout << "== " << std::quoted(input) << " ==\n";
      
              auto action = [&input](auto& ctx) {
                  auto iters = x3::_attr(ctx);
                  std::cout
                      << input << "\n"
                      << std::setw(std::distance(input.begin(), iters.begin())) << ""
                      << "^ matched: " <<  std::quoted(std::string(iters.begin(), iters.end())) << "\n";
              };
      
              const auto rule = x3::raw[x3::int_] [action];
      
              auto f = begin(input), l = end(input);
              if (x3::phrase_parse(f, l, rule, x3::space))
                  std::cout << "Parse succeeded\n";
              else
                  std::cout << "Parse failed\n";
      
              if (f!=l)
                  std::cout << "Remaining: " << std::quoted(std::string(f,l)) << "\n";
          }
      }
      

      Prints:

      == "q" ==
      Parse failed
      Remaining: "q"
      == "432" ==
      432
      ^ matched: "432"
      Parse succeeded
      == " 646 q" ==
       646 q
       ^ matched: "646"
      Parse succeeded
      Remaining: "q"
      

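      Again, slightly related: attribute propagation becomes a little more cumbersome with this approach, because raw[] exposes an iterator range instead of the int, and a semantic action turns off automatic attribute propagation for the rule. One workaround is to run raw[] inside a lookahead and then parse the integer again: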

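      const auto rule
          = x3::rule<struct _rule, int, true> {"rule"}   // true: force attribute propagation despite the action
          = &x3::raw[x3::int_] [action] >> x3::int_;     // lookahead runs the action, then int_ parses for real
      
      auto f = begin(input), l = end(input);
      int output;
      if (x3::phrase_parse(f, l, rule, x3::space, output))
          std::cout << "Parsed " << output << "\n";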
      
    3. To alleviate the clumsy attribute propagation, you might write a custom parser component that simply wraps another and adds the logic you want:

      template <typename SubjectParser>
      struct verbose : x3::parser<verbose<SubjectParser> > {
          explicit verbose(SubjectParser p, std::string name) : _subject(std::move(p)), _name(std::move(name)) {}
      
          SubjectParser _subject;
          std::string _name;
      
          template <typename It, typename Ctx, typename... Other>
          bool parse(It& f, It l, Ctx& ctx, Other&&... args) const {
              auto saved = f;
              auto ok = x3::as_parser(_subject).parse(f, l, ctx, std::forward<Other>(args)...);
      
              if (ok) {
                  // optionally skip leading whitespace so the reported range starts at the match
                  x3::skip_over(saved, l, ctx);
                  std::cout << "Debug: " << _name << " matched " << std::quoted(std::string(saved, f)) << "\n";
              }
              return ok;
          }
      };
      

      Now wrapping the parser expression like this:

      const auto rule = verbose {x3::int_, "YUMMY"};
      

      Results in the following output:

      == "q" ==
      Parse failed
      Remaining: "q"
      == "432" ==
      Debug: YUMMY matched "432"
      Parsed 432
      == " 646 q" ==
      Debug: YUMMY matched "646"
      Parsed 646
      Remaining: "q"
      
    4. Distilling it to that made me realize that rule debugging may have been all you were looking for. In that case, simply defining BOOST_SPIRIT_X3_DEBUG may be all you need:


      #define BOOST_SPIRIT_X3_DEBUG
      #include <boost/spirit/home/x3.hpp>
      #include <iomanip>
      #include <iostream>
      #include <string>
      
      namespace x3 = boost::spirit::x3;
      
      int main() {
          const auto rule 
              = x3::rule<struct _rule, int> {"rule"}
              = x3::int_;
      
          for (std::string const input : { "q", "432", " 646 q" }) {
              std::cout << "== " << std::quoted(input) << " ==\n";
      
              auto f = begin(input), l = end(input);
              int output;
              if (x3::phrase_parse(f, l, rule, x3::space, output))
                  std::cout << "Parsed " << output << "\n";
              else
                  std::cout << "Parse failed\n";
      
              if (f!=l)
                  std::cout << "Remaining: " << std::quoted(std::string(f,l)) << "\n";
          }
      }
      

      Which prints:

      == "q" ==
      <rule>
        <try>q</try>
        <fail/>
      </rule>
      Parse failed
      Remaining: "q"
      == "432" ==
      <rule>
        <try>432</try>
        <success></success>
        <attributes>432</attributes>
      </rule>
      Parsed 432
      == " 646 q" ==
      <rule>
        <try> 646 q</try>
        <success> q</success>
        <attributes>646</attributes>
      </rule>
      Parsed 646
      Remaining: "q"