Search code examples
perlparsingparser-combinators

In the Hammer parser combinator, how do you implement a rule that refers to itself?


I'm getting a "Deep recursion on subroutine" warning when I just refer to the name as shown in the code below. use 5.016 … __SUB__->() does not help, either.

Build hints: git clone ; scons bindings=perl test ; cd build/opt/src/bindings/perl ; $EDITOR h.pl

use 5.024;
use strictures;
use blib;
use hammer qw();

# digits = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
sub digits { hammer::many hammer::in 0..9 }

# product = digits "*" digits
sub product {
    hammer::sequence digits, hammer::ch('*'), digits
}

product->parse('23*42'); # [[2, 3], '*', [4, 2]]


# mproduct = digits "*" mproduct
sub mproduct;
sub mproduct {
    hammer::sequence digits, hammer::ch('*'), mproduct
}
mproduct->parse('8*23*42');
# Deep recursion on subroutine "main::mproduct" at h.pl line 21.

Solution

  • Your code has an infinite loop: mproduct unconditonally calls mproduct.

    Your grammar also has the same an infinite loop: mproduct is unconditionally defined in terms of mproduct.

    mproduct ::= digits '*' mproduct
    

    The grammar you want is

    mproduct ::= digits '*' mproduct
               | digits
    

    or

    mproduct  ::= digits mproduct_
    mproduct_ ::= '*' mproduct_
                | ε
    

    To put it more clearly using pseudo-BNF, the grammar you want is

    myproduct ::= digits ( '*' digits )*
    

    hammer provides a tool to implement this:

    hammer::sepBy1(digits, hammer::ch('*'))
    

    That doesn't address the more general question. The following is a case where the answer to general question would be useful:

    expr ::= sum
    
    sum  ::= term sum_
    sum_ ::= '+' sum_
           | ε
    
    term ::= num
           | '(' expr ')'
    

    The broken approach would be:

    sub num { ... }
    
    sub term {
       hammer::choice(
          num(),
          hammer::sequence(
             hammer::ch('('),
             expr(),
             hammer::ch(')'),
          )
       )
    }
    
    sub sum { hammer::sepBy1(term, hammer::ch('+')) }
    
    sub expr { sum() }
    
    my $parser = expr();
    

    The solve this problem, one would use bind_indirect.

    use feature qw( state );
    
    sub num { ... }
    
    sub term {
       hammer::choice(
          num(),
          hammer::sequence(
             hammer::ch('('),
             expr(),
             hammer::ch(')'),
          )
       )
    }
    
    sub sum { hammer::sepBy1(term, hammer::ch('+')) }
    
    sub _expr { sum() }
    sub expr { state $expr = hammer::indirect(); $expr }
    UNITCHECK { hammer::bind_indirect(expr(), _expr()); }
    
    my $parser = expr();
    

    The test file proved useful in answering this question.

    Nothing is tested.