
PLY thinks a mathematical expression is a syntax error after I implemented variables


I've been playing around with PLY and, after getting the examples to work, I decided to implement variables. Assignment works, but now any mathematical expression that is not assigned to a variable produces a syntax error (not a Python syntax error, but a syntax error within my language). For example:

calc> a = 5
Name: a
Number: 5
Assigning variable a to 5
{'a': 5}
calc> 6 + 1
Number: 6
SYNTACTIC ERROR: line: 1 position: 0 Syntax error: 6
Number: 1
{'a': 5}

I noticed that if I put the number grammar function above the variable assignment function, variables break and calculations work.

Lexer:

#!/usr/bin/env python

### LEXICAL ANALYSIS ###


import ply.lex as lex

import colorama
colorama.init()

tokens = (
    "NUMBER",
    "PLUS",
    "MINUS",
    "MULTIPLY",
    "DIVIDE",
    "LBRACKET",
    "RBRACKET",
    "NAME",
    "EQUALS"
)

t_PLUS = r"\+"
t_MINUS = r"-"
t_MULTIPLY = r"\*"
t_DIVIDE = r"/"
t_LBRACKET = r"\("
t_RBRACKET = r"\)"
t_EQUALS = r"="

t_ignore = "\t\r "

def t_NUMBER(t):
    r"\d+"
    print "Number:", t.value
    t.value = int(t.value)
    return t

def t_NAME(t):
    r"[a-zA-Z]+\w*"
    print "Name:", t.value
    return t

def t_newline(t):
    r"\n+"
    t.lexer.lineno += len(t.value)

def t_COMMENT(t):
    r"\#.*"
    print "Comment:", t.value

def t_error(t):
    print colorama.Fore.RED + "LEXICAL ERROR: line:", t.lexer.lineno, "position:", t.lexer.lexpos, "Invalid token:", t.value, colorama.Fore.RESET
    t.lexer.skip(len(t.value))

lexer = lex.lex()

Parser:

import ply.yacc as yacc

from langlex import tokens

import colorama
colorama.init()

variables = {}

def p_assignment(p):
    "assignment : NAME EQUALS expression"
    print "Assigning variable", p[1], "to", p[3]
    variables[p[1]] = p[3]

def p_expression_plus(p):
    "expression : expression PLUS term"
    p[0] = p[1] + p[3]

def p_expression_minus(p):
    "expression : expression MINUS term"
    p[0] = p[1] - p[3]

def p_expression_term(p):
    "expression : term"
    p[0] = p[1]

def p_expression_name(p):
    "expression : NAME"
    p[0] = variables[p[1]]

def p_term_times(p):
    "term : term MULTIPLY factor"
    p[0] = p[1] * p[3]

def p_term_div(p):
    "term : term DIVIDE factor"
    p[0] = p[1] / p[3]

def p_term_factor(p):
    "term : factor"
    p[0] = p[1]

def p_factor_expr(p):
    "factor : LBRACKET expression RBRACKET"
    p[0] = p[2]

def p_factor_num(p):
    "factor : NUMBER"
    p[0] = p[1]

def p_error(p):
    if(p):
        print colorama.Fore.RED + "SYNTACTIC ERROR: line:", p.lexer.lineno, "position:", p.lexpos, "Syntax error:", p.value, colorama.Fore.RESET
    else:
        print colorama.Fore.RED + "SYNTACTIC ERROR: Unknown syntax error" + colorama.Fore.RESET

parser = yacc.yacc()

while True:
    s = raw_input("calc> ")

    if not(s):
        continue
    result = parser.parse(s)

    if(result):
        print result

    print variables

Solution

  • By defining p_assignment as the first grammar rule, you have made assignment the starting grammar symbol. From the docs:

    The first rule defined in the yacc specification determines the starting grammar symbol. Whenever the starting rule is reduced by the parser and no more input is available, parsing stops and the final value is returned.

    http://www.dabeaz.com/ply/ply.html#ply_nn24

    This means that, for your grammar, the only allowed input sentences are assignments:

    $ python p_o.py 
    Generating LALR tables
    calc> a=1
    Name: a
    Number: 1
    Assigning variable a to 1
    {'a': 1}
    calc> a
    Name: a
    SYNTACTIC ERROR: Unknown syntax error
    {'a': 1}
    calc> 1
    Number: 1
    SYNTACTIC ERROR: line: 1 position: 0 Syntax error: 1 
    {'a': 1}
    

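    So we need a starting grammar symbol that can, by some path, derive a bare expression as well as an assignment. I chose to add a statement non-terminal as the starting grammar symbol, placing its rules ahead of p_assignment so that statement becomes the first rule: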

    ...
    def p_statement_assignment(p):
        "statement : assignment"
        pass
    
    def p_statement_expression(p):
        "statement : expression"
        p[0] = p[1]
    
    def p_assignment(p):
        "assignment : NAME EQUALS expression"
        print "Assigning variable", p[1], "to", p[3]
        variables[p[1]] = p[3]
    ...
    
    $ python p_1.py 
    Generating LALR tables
    calc> a=1
    Name: a
    Number: 1
    Assigning variable a to 1
    {'a': 1}
    calc> a
    Name: a
    1
    {'a': 1}
    calc> 1
    Number: 1
    1
    {'a': 1}
    calc>
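
As a variation on the fix above, PLY also lets you name the starting symbol explicitly instead of relying on rule order, by passing start to yacc.yacc(). The following is a minimal sketch of that approach, not the original poster's code. It assumes Python 3 and an installed ply package, trims the grammar to addition only, treats NAME as a factor, and introduces a hypothetical evaluate helper purely for illustration.

```python
import ply.lex as lex
import ply.yacc as yacc

# --- Lexer (trimmed down from the question's token set) ---
tokens = ("NUMBER", "NAME", "PLUS", "EQUALS")

t_PLUS = r"\+"
t_EQUALS = r"="
t_ignore = " \t"

def t_NUMBER(t):
    r"\d+"
    t.value = int(t.value)
    return t

def t_NAME(t):
    r"[a-zA-Z_]\w*"
    return t

def t_error(t):
    print("Illegal character:", t.value[0])
    t.lexer.skip(1)

# --- Parser ---
variables = {}

# Deliberately left as the first rule, as in the question. Without an
# explicit start symbol, "assignment" would be the starting symbol.
def p_assignment(p):
    "assignment : NAME EQUALS expression"
    variables[p[1]] = p[3]

def p_statement(p):
    """statement : assignment
                 | expression"""
    p[0] = p[1]  # None for an assignment, the value for an expression

def p_expression_plus(p):
    "expression : expression PLUS factor"
    p[0] = p[1] + p[3]

def p_expression_factor(p):
    "expression : factor"
    p[0] = p[1]

def p_factor_number(p):
    "factor : NUMBER"
    p[0] = p[1]

def p_factor_name(p):
    "factor : NAME"
    p[0] = variables[p[1]]

def p_error(p):
    print("Syntax error at", p.value if p else "end of input")

lexer = lex.lex()
# start="statement" overrides the "first rule wins" default.
parser = yacc.yacc(start="statement", debug=False)

def evaluate(text):
    """Hypothetical helper: parse one line and return its value."""
    return parser.parse(text, lexer=lexer)

print(evaluate("6 + 1"))
evaluate("a = 5")
print(evaluate("a + 2"))
print(variables)
```

Here p_assignment is still the first rule in the file, yet bare expressions parse, because the explicit start symbol takes precedence over rule order.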