
Handling state correctly with a PEGTL grammar


I'm very new to PEG and PEGTL, so I'm probably missing something. I have a grammar very similar to the following one:

#include <iostream>
#include <tao/pegtl.hpp>

using namespace tao::pegtl;

struct A : one<'A'> { };
struct B : one<'B'> { };

struct comp : seq<plus<sor<seq<A, B>, A>>,eof> { };

template< typename Rule >
struct test_action : nothing< Rule > {};

template<>
struct test_action<A>
{
    template< typename Input >
    static void apply(const Input& in)
    {
        std::cout << "A";
    }
};

template<>
struct test_action<B>
{
    template< typename Input >
    static void apply(const Input& in)
    {
        std::cout << "B";
    }
};

void test()
{
    parse< comp, test_action >(memory_input("AAB", ""));
}

The parse works, but test_action::apply is invoked too many times. The program outputs "AAAB" because, if I understand correctly, the parser tries the first alternative (AB) on the first character and fails, then proceeds with the second alternative (A). But even though it rewinds, the call to test_action<A>::apply made during the failed attempt has already happened. What is the correct way to handle this situation? I want the output to be "AAB", preferably without complicating the grammar.


Solution

  • I asked the PEGTL library authors, and they kindly pointed me to the correct approach: the best option is to have the parser construct a parse tree, which is easy to fix up when the parser backtracks, using simple push and pop operations.

    I wrote the code below for anyone with similar doubts. There are two options:

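    • Avoid backtracking in rules with attached actions. Here the actions are attached to real_A and real_AB, which in this grammar are never rewound once they have matched, rather than to A and B: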

      #include <iostream>
      #include <tao/pegtl.hpp>

      using namespace tao::pegtl;
      
      struct A : one<'A'> { };
      struct B : one<'B'> { };
      
      struct real_A : A {};
      struct real_AB : seq<A, B> {};
      
      struct comp : seq<plus<sor<real_AB, real_A>>,eof> { };
      
      template< typename Rule >
      struct test_action : nothing< Rule > {};
      
      template<>
      struct test_action<real_A>
      {
          template< typename Input >
          static void apply(const Input& in)
          {
              std::cout << "A";
          }
      };
      
      template<>
      struct test_action<real_AB>
      {
          template< typename Input >
          static void apply(const Input& in)
          {
              std::cout << "AB";
          }
      };
      
      
      
      void test()
      {
          parse< comp, test_action >(memory_input("AAB", ""));
      }
      
    • Build a parse tree and process it after parsing has finished:

      #include <tao/pegtl.hpp>
      #include <tao/pegtl/contrib/parse_tree.hpp>

      using namespace tao::pegtl;
      
      struct A : one<'A'> { };
      struct B : one<'B'> { };
      
      
      struct comp : seq<plus<sor<seq<A, B>, A>>, eof> { };
      
      template< typename Rule >
      struct test_action : nothing< Rule > {};
      
      
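      void test()
      {
          // parse_tree::parse returns a std::unique_ptr to the root node (null if the parse fails).
          // Walk the tree afterwards instead of printing from actions during parsing.
          auto root = parse_tree::parse<comp>(memory_input("AAB", ""));
      }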