Search code examples
pythonrubydebuggingparser-generator

Seeking an appropriate Ruby/Python parser generator


The first parser generator I've worked with was Parse::RecDescent, and the guides/tutorials available for it were great, but the most useful feature it has was it's debugging tools, specifically the tracing capabilities ( activated by setting $RD_TRACE to 1 ). I am looking for a parser generator that can help you debug it's rules.

The thing is, it has to be written in Python or in Ruby, and have a verbose mode/trace mode or very helpful debugging techniques.

Does anyone know such a parser generator ?

Edit

When I said debugging, I wasn't referring to debugging Python or Ruby. I was referring to debugging the parser generator, see what it's doing at every step, see every char it's reading, rules it's trying to match.

Edit

I am seeking a parser generator framework, and an illustration of some of its debugging features. I'm not interested in pdb, but in parser's debugging framework. Also, please don't mention treetop. I'm not interested in it.


Solution

  • Python is a pretty easy language to debug. You can just do import pdb pdb.settrace().

    However, these parser generators supposedly come with good debugging facilities.

    http://www.antlr.org/

    http://www.dabeaz.com/ply/

    http://pyparsing.wikispaces.com/

    In response to bounty

    Here is PLY debugging in action.

    Source Code

    tokens = (
        'NAME','NUMBER',
        )
    
    literals = ['=','+','-','*','/', '(',')']
    
    # Tokens
    
    t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'
    
    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        return t
    
    t_ignore = " \t"
    
    def t_newline(t):
        r'\n+'
        t.lexer.lineno += t.value.count("\n")
    
    def t_error(t):
        print("Illegal character '%s'" % t.value[0])
        t.lexer.skip(1)
    
    # Build the lexer
    import ply.lex as lex
    lex.lex(debug=1)
    
    # Parsing rules
    
    precedence = (
        ('left','+','-'),
        ('left','*','/'),
        ('right','UMINUS'),
        )
    
    # dictionary of names
    names = { }
    
    def p_statement_assign(p):
        'statement : NAME "=" expression'
        names[p[1]] = p[3]
    
    def p_statement_expr(p):
        'statement : expression'
        print(p[1])
    
    def p_expression_binop(p):
        '''expression : expression '+' expression
                      | expression '-' expression
                      | expression '*' expression
                      | expression '/' expression'''
        if p[2] == '+'  : p[0] = p[1] + p[3]
        elif p[2] == '-': p[0] = p[1] - p[3]
        elif p[2] == '*': p[0] = p[1] * p[3]
        elif p[2] == '/': p[0] = p[1] / p[3]
    
    def p_expression_uminus(p):
        "expression : '-' expression %prec UMINUS"
        p[0] = -p[2]
    
    def p_expression_group(p):
        "expression : '(' expression ')'"
        p[0] = p[2]
    
    def p_expression_number(p):
        "expression : NUMBER"
        p[0] = p[1]
    
    def p_expression_name(p):
        "expression : NAME"
        try:
            p[0] = names[p[1]]
        except LookupError:
            print("Undefined name '%s'" % p[1])
            p[0] = 0
    
    def p_error(p):
        if p:
            print("Syntax error at '%s'" % p.value)
        else:
            print("Syntax error at EOF")
    
    import ply.yacc as yacc
    yacc.yacc()
    
    import logging
    logging.basicConfig(
        level=logging.INFO,
        filename="parselog.txt"
    )
    
    while 1:
        try:
            s = raw_input('calc > ')
        except EOFError:
            break
        if not s: continue
        yacc.parse(s, debug=1)
    

    Output

    lex: tokens   = ('NAME', 'NUMBER')
    lex: literals = ['=', '+', '-', '*', '/', '(', ')']
    lex: states   = {'INITIAL': 'inclusive'}
    lex: Adding rule t_NUMBER -> '\d+' (state 'INITIAL')
    lex: Adding rule t_newline -> '\n+' (state 'INITIAL')
    lex: Adding rule t_NAME -> '[a-zA-Z_][a-zA-Z0-9_]*' (state 'INITIAL')
    lex: ==== MASTER REGEXS FOLLOW ====
    lex: state 'INITIAL' : regex[0] = '(?P<t_NUMBER>\d+)|(?P<t_newline>\n+)|(?P<t_NAME>[a-zA-Z
    _][a-zA-Z0-9_]*)'
    calc > 2+3
    PLY: PARSE DEBUG START
    
    State  : 0
    Stack  : . LexToken(NUMBER,2,1,0)
    Action : Shift and goto state 3
    
    State  : 3
    Stack  : NUMBER . LexToken(+,'+',1,1)
    Action : Reduce rule [expression -> NUMBER] with [2] and goto state 9
    Result : <int @ 0x1a1896c> (2)
    
    State  : 6
    Stack  : expression . LexToken(+,'+',1,1)
    Action : Shift and goto state 12
    
    State  : 12
    Stack  : expression + . LexToken(NUMBER,3,1,2)
    Action : Shift and goto state 3
    
    State  : 3
    Stack  : expression + NUMBER . $end
    Action : Reduce rule [expression -> NUMBER] with [3] and goto state 9
    Result : <int @ 0x1a18960> (3)
    
    State  : 18
    Stack  : expression + expression . $end
    Action : Reduce rule [expression -> expression + expression] with [2,'+',3] and goto state
     3
    Result : <int @ 0x1a18948> (5)
    
    State  : 6
    Stack  : expression . $end
    Action : Reduce rule [statement -> expression] with [5] and goto state 2
    5
    Result : <NoneType @ 0x1e1ccef4> (None)
    
    State  : 4
    Stack  : statement . $end
    Done   : Returning <NoneType @ 0x1e1ccef4> (None)
    PLY: PARSE DEBUG END
    calc >
    

    Parse Table generated at parser.out

    Created by PLY version 3.2 (http://www.dabeaz.com/ply)
    
    Grammar
    
    Rule 0     S' -> statement
    Rule 1     statement -> NAME = expression
    Rule 2     statement -> expression
    Rule 3     expression -> expression + expression
    Rule 4     expression -> expression - expression
    Rule 5     expression -> expression * expression
    Rule 6     expression -> expression / expression
    Rule 7     expression -> - expression
    Rule 8     expression -> ( expression )
    Rule 9     expression -> NUMBER
    Rule 10    expression -> NAME
    
    Terminals, with rules where they appear
    
    (                    : 8
    )                    : 8
    *                    : 5
    +                    : 3
    -                    : 4 7
    /                    : 6
    =                    : 1
    NAME                 : 1 10
    NUMBER               : 9
    error                : 
    
    Nonterminals, with rules where they appear
    
    expression           : 1 2 3 3 4 4 5 5 6 6 7 8
    statement            : 0
    
    Parsing method: LALR
    
    state 0
    
        (0) S' -> . statement
        (1) statement -> . NAME = expression
        (2) statement -> . expression
        (3) expression -> . expression + expression
        (4) expression -> . expression - expression
        (5) expression -> . expression * expression
        (6) expression -> . expression / expression
        (7) expression -> . - expression
        (8) expression -> . ( expression )
        (9) expression -> . NUMBER
        (10) expression -> . NAME
    
        NAME            shift and go to state 1
        -               shift and go to state 2
        (               shift and go to state 5
        NUMBER          shift and go to state 3
    
        expression                     shift and go to state 6
        statement                      shift and go to state 4
    
    state 1
    
        (1) statement -> NAME . = expression
        (10) expression -> NAME .
    
        =               shift and go to state 7
        +               reduce using rule 10 (expression -> NAME .)
        -               reduce using rule 10 (expression -> NAME .)
        *               reduce using rule 10 (expression -> NAME .)
        /               reduce using rule 10 (expression -> NAME .)
        $end            reduce using rule 10 (expression -> NAME .)
    
    
    state 2
    
        (7) expression -> - . expression
        (3) expression -> . expression + expression
        (4) expression -> . expression - expression
        (5) expression -> . expression * expression
        (6) expression -> . expression / expression
        (7) expression -> . - expression
        (8) expression -> . ( expression )
        (9) expression -> . NUMBER
        (10) expression -> . NAME
    
        -               shift and go to state 2
        (               shift and go to state 5
        NUMBER          shift and go to state 3
        NAME            shift and go to state 8
    
        expression                     shift and go to state 9
    
    state 3
    
        (9) expression -> NUMBER .
    
        +               reduce using rule 9 (expression -> NUMBER .)
        -               reduce using rule 9 (expression -> NUMBER .)
        *               reduce using rule 9 (expression -> NUMBER .)
        /               reduce using rule 9 (expression -> NUMBER .)
        $end            reduce using rule 9 (expression -> NUMBER .)
        )               reduce using rule 9 (expression -> NUMBER .)
    
    
    state 4
    
        (0) S' -> statement .
    
    
    
    state 5
    
        (8) expression -> ( . expression )
        (3) expression -> . expression + expression
        (4) expression -> . expression - expression
        (5) expression -> . expression * expression
        (6) expression -> . expression / expression
        (7) expression -> . - expression
        (8) expression -> . ( expression )
        (9) expression -> . NUMBER
        (10) expression -> . NAME
    
        -               shift and go to state 2
        (               shift and go to state 5
        NUMBER          shift and go to state 3
        NAME            shift and go to state 8
    
        expression                     shift and go to state 10
    
    state 6
    
        (2) statement -> expression .
        (3) expression -> expression . + expression
        (4) expression -> expression . - expression
        (5) expression -> expression . * expression
        (6) expression -> expression . / expression
    
        $end            reduce using rule 2 (statement -> expression .)
        +               shift and go to state 12
        -               shift and go to state 11
        *               shift and go to state 13
        /               shift and go to state 14
    
    
    state 7
    
        (1) statement -> NAME = . expression
        (3) expression -> . expression + expression
        (4) expression -> . expression - expression
        (5) expression -> . expression * expression
        (6) expression -> . expression / expression
        (7) expression -> . - expression
        (8) expression -> . ( expression )
        (9) expression -> . NUMBER
        (10) expression -> . NAME
    
        -               shift and go to state 2
        (               shift and go to state 5
        NUMBER          shift and go to state 3
        NAME            shift and go to state 8
    
        expression                     shift and go to state 15
    
    state 8
    
        (10) expression -> NAME .
    
        +               reduce using rule 10 (expression -> NAME .)
        -               reduce using rule 10 (expression -> NAME .)
        *               reduce using rule 10 (expression -> NAME .)
        /               reduce using rule 10 (expression -> NAME .)
        $end            reduce using rule 10 (expression -> NAME .)
        )               reduce using rule 10 (expression -> NAME .)
    
    
    state 9
    
        (7) expression -> - expression .
        (3) expression -> expression . + expression
        (4) expression -> expression . - expression
        (5) expression -> expression . * expression
        (6) expression -> expression . / expression
    
        +               reduce using rule 7 (expression -> - expression .)
        -               reduce using rule 7 (expression -> - expression .)
        *               reduce using rule 7 (expression -> - expression .)
        /               reduce using rule 7 (expression -> - expression .)
        $end            reduce using rule 7 (expression -> - expression .)
        )               reduce using rule 7 (expression -> - expression .)
    
      ! +               [ shift and go to state 12 ]
      ! -               [ shift and go to state 11 ]
      ! *               [ shift and go to state 13 ]
      ! /               [ shift and go to state 14 ]
    
    
    state 10
    
        (8) expression -> ( expression . )
        (3) expression -> expression . + expression
        (4) expression -> expression . - expression
        (5) expression -> expression . * expression
        (6) expression -> expression . / expression
    
        )               shift and go to state 16
        +               shift and go to state 12
        -               shift and go to state 11
        *               shift and go to state 13
        /               shift and go to state 14
    
    
    state 11
    
        (4) expression -> expression - . expression
        (3) expression -> . expression + expression
        (4) expression -> . expression - expression
        (5) expression -> . expression * expression
        (6) expression -> . expression / expression
        (7) expression -> . - expression
        (8) expression -> . ( expression )
        (9) expression -> . NUMBER
        (10) expression -> . NAME
    
        -               shift and go to state 2
        (               shift and go to state 5
        NUMBER          shift and go to state 3
        NAME            shift and go to state 8
    
        expression                     shift and go to state 17
    
    state 12
    
        (3) expression -> expression + . expression
        (3) expression -> . expression + expression
        (4) expression -> . expression - expression
        (5) expression -> . expression * expression
        (6) expression -> . expression / expression
        (7) expression -> . - expression
        (8) expression -> . ( expression )
        (9) expression -> . NUMBER
        (10) expression -> . NAME
    
        -               shift and go to state 2
        (               shift and go to state 5
        NUMBER          shift and go to state 3
        NAME            shift and go to state 8
    
        expression                     shift and go to state 18
    
    state 13
    
        (5) expression -> expression * . expression
        (3) expression -> . expression + expression
        (4) expression -> . expression - expression
        (5) expression -> . expression * expression
        (6) expression -> . expression / expression
        (7) expression -> . - expression
        (8) expression -> . ( expression )
        (9) expression -> . NUMBER
        (10) expression -> . NAME
    
        -               shift and go to state 2
        (               shift and go to state 5
        NUMBER          shift and go to state 3
        NAME            shift and go to state 8
    
        expression                     shift and go to state 19
    
    state 14
    
        (6) expression -> expression / . expression
        (3) expression -> . expression + expression
        (4) expression -> . expression - expression
        (5) expression -> . expression * expression
        (6) expression -> . expression / expression
        (7) expression -> . - expression
        (8) expression -> . ( expression )
        (9) expression -> . NUMBER
        (10) expression -> . NAME
    
        -               shift and go to state 2
        (               shift and go to state 5
        NUMBER          shift and go to state 3
        NAME            shift and go to state 8
    
        expression                     shift and go to state 20
    
    state 15
    
        (1) statement -> NAME = expression .
        (3) expression -> expression . + expression
        (4) expression -> expression . - expression
        (5) expression -> expression . * expression
        (6) expression -> expression . / expression
    
        $end            reduce using rule 1 (statement -> NAME = expression .)
        +               shift and go to state 12
        -               shift and go to state 11
        *               shift and go to state 13
        /               shift and go to state 14
    
    
    state 16
    
        (8) expression -> ( expression ) .
    
        +               reduce using rule 8 (expression -> ( expression ) .)
        -               reduce using rule 8 (expression -> ( expression ) .)
        *               reduce using rule 8 (expression -> ( expression ) .)
        /               reduce using rule 8 (expression -> ( expression ) .)
        $end            reduce using rule 8 (expression -> ( expression ) .)
        )               reduce using rule 8 (expression -> ( expression ) .)
    
    
    state 17
    
        (4) expression -> expression - expression .
        (3) expression -> expression . + expression
        (4) expression -> expression . - expression
        (5) expression -> expression . * expression
        (6) expression -> expression . / expression
    
        +               reduce using rule 4 (expression -> expression - expression .)
        -               reduce using rule 4 (expression -> expression - expression .)
        $end            reduce using rule 4 (expression -> expression - expression .)
        )               reduce using rule 4 (expression -> expression - expression .)
        *               shift and go to state 13
        /               shift and go to state 14
    
      ! *               [ reduce using rule 4 (expression -> expression - expression .) ]
      ! /               [ reduce using rule 4 (expression -> expression - expression .) ]
      ! +               [ shift and go to state 12 ]
      ! -               [ shift and go to state 11 ]
    
    
    state 18
    
        (3) expression -> expression + expression .
        (3) expression -> expression . + expression
        (4) expression -> expression . - expression
        (5) expression -> expression . * expression
        (6) expression -> expression . / expression
    
        +               reduce using rule 3 (expression -> expression + expression .)
        -               reduce using rule 3 (expression -> expression + expression .)
        $end            reduce using rule 3 (expression -> expression + expression .)
        )               reduce using rule 3 (expression -> expression + expression .)
        *               shift and go to state 13
        /               shift and go to state 14
    
      ! *               [ reduce using rule 3 (expression -> expression + expression .) ]
      ! /               [ reduce using rule 3 (expression -> expression + expression .) ]
      ! +               [ shift and go to state 12 ]
      ! -               [ shift and go to state 11 ]
    
    
    state 19
    
        (5) expression -> expression * expression .
        (3) expression -> expression . + expression
        (4) expression -> expression . - expression
        (5) expression -> expression . * expression
        (6) expression -> expression . / expression
    
        +               reduce using rule 5 (expression -> expression * expression .)
        -               reduce using rule 5 (expression -> expression * expression .)
        *               reduce using rule 5 (expression -> expression * expression .)
        /               reduce using rule 5 (expression -> expression * expression .)
        $end            reduce using rule 5 (expression -> expression * expression .)
        )               reduce using rule 5 (expression -> expression * expression .)
    
      ! +               [ shift and go to state 12 ]
      ! -               [ shift and go to state 11 ]
      ! *               [ shift and go to state 13 ]
      ! /               [ shift and go to state 14 ]
    
    
    state 20
    
        (6) expression -> expression / expression .
        (3) expression -> expression . + expression
        (4) expression -> expression . - expression
        (5) expression -> expression . * expression
        (6) expression -> expression . / expression
    
        +               reduce using rule 6 (expression -> expression / expression .)
        -               reduce using rule 6 (expression -> expression / expression .)
        *               reduce using rule 6 (expression -> expression / expression .)
        /               reduce using rule 6 (expression -> expression / expression .)
        $end            reduce using rule 6 (expression -> expression / expression .)
        )               reduce using rule 6 (expression -> expression / expression .)
    
      ! +               [ shift and go to state 12 ]
      ! -               [ shift and go to state 11 ]
      ! *               [ shift and go to state 13 ]
      ! /               [ shift and go to state 14 ]