I've been playing around with PLY, and after getting the examples to work I decided to implement variables. Assignment works, but now any mathematical expression that is not assigned to a variable throws a syntax error (not a Python syntax error, but a syntax error within my language). For example:
calc> a = 5
Name: a
Number: 5
Assigning variable a to 5
{'a': 5}
calc> 6 + 1
Number: 6
SYNTACTIC ERROR: line: 1 position: 0 Syntax error: 6
Number: 1
{'a': 5}
I noticed that if I put the number grammar function above the variable assignment function, then variables break and calculations work.
Lexer:
#!/usr/bin/env python
### LEXICAL ANALYSIS ###
import ply.lex as lex
import colorama

colorama.init()

tokens = (
    "NUMBER",
    "PLUS",
    "MINUS",
    "MULTIPLY",
    "DIVIDE",
    "LBRACKET",
    "RBRACKET",
    "NAME",
    "EQUALS"
)

t_PLUS = r"\+"
t_MINUS = r"-"
t_MULTIPLY = r"\*"
t_DIVIDE = r"/"
t_LBRACKET = r"\("
t_RBRACKET = r"\)"
t_EQUALS = r"="
t_ignore = "\t\r "

def t_NUMBER(t):
    r"\d+"
    print "Number:", t.value
    t.value = int(t.value)
    return t

def t_NAME(t):
    r"[a-zA-Z]+\w*"
    print "Name:", t.value
    return t

def t_newline(t):
    r"\n+"
    t.lexer.lineno += len(t.value)

def t_COMMENT(t):
    r"\#.*"
    print "Comment:", t.value

def t_error(t):
    print colorama.Fore.RED + "LEXICAL ERROR: line:", t.lexer.lineno, "position:", t.lexer.lexpos, "Invalid token:", t.value, colorama.Fore.RESET
    t.lexer.skip(len(t.value))

lexer = lex.lex()
Parser:
import ply.yacc as yacc
from langlex import tokens
import colorama

colorama.init()

variables = {}

def p_assignment(p):
    "assignment : NAME EQUALS expression"
    print "Assigning variable", p[1], "to", p[3]
    variables[p[1]] = p[3]

def p_expression_plus(p):
    "expression : expression PLUS term"
    p[0] = p[1] + p[3]

def p_expression_minus(p):
    "expression : expression MINUS term"
    p[0] = p[1] - p[3]

def p_expression_term(p):
    "expression : term"
    p[0] = p[1]

def p_expression_name(p):
    "expression : NAME"
    p[0] = variables[p[1]]

def p_term_times(p):
    "term : term MULTIPLY factor"
    p[0] = p[1] * p[3]

def p_term_div(p):
    "term : term DIVIDE factor"
    p[0] = p[1] / p[3]

def p_term_factor(p):
    "term : factor"
    p[0] = p[1]

def p_factor_expr(p):
    "factor : LBRACKET expression RBRACKET"
    p[0] = p[2]

def p_factor_num(p):
    "factor : NUMBER"
    p[0] = p[1]

def p_error(p):
    if p:
        print colorama.Fore.RED + "SYNTACTIC ERROR: line:", p.lexer.lineno, "position:", p.lexpos, "Syntax error:", p.value, colorama.Fore.RESET
    else:
        print colorama.Fore.RED + "SYNTACTIC ERROR: Unknown syntax error" + colorama.Fore.RESET

parser = yacc.yacc()

while True:
    s = raw_input("calc> ")
    if not s:
        continue
    result = parser.parse(s)
    if result:
        print result
    print variables
In creating p_assignment, you have created a new starting grammar symbol. From the docs:
The first rule defined in the yacc specification determines the starting grammar symbol. Whenever the starting rule is reduced by the parser and no more input is available, parsing stops and the final value is returned.
This means, for your grammar, that the only allowed input sentences are assignments:
$ python p_o.py
Generating LALR tables
calc> a=1
Name: a
Number: 1
Assigning variable a to 1
{'a': 1}
calc> a
Name: a
SYNTACTIC ERROR: Unknown syntax error
{'a': 1}
calc> 1
Number: 1
SYNTACTIC ERROR: line: 1 position: 0 Syntax error: 1
{'a': 1}
So, we need to have a starting grammar symbol that, by some path, resolves to an expression. I chose to add a statement non-terminal as the starting grammar symbol:
...
def p_statement_assignment(p):
    "statement : assignment"
    pass

def p_statement_expression(p):
    "statement : expression"
    p[0] = p[1]

def p_assignment(p):
    "assignment : NAME EQUALS expression"
    print "Assigning variable", p[1], "to", p[3]
    variables[p[1]] = p[3]
...
$ python p_1.py
Generating LALR tables
calc> a=1
Name: a
Number: 1
Assigning variable a to 1
{'a': 1}
calc> a
Name: a
1
{'a': 1}
calc> 1
Number: 1
1
{'a': 1}
calc>
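
If it helps to see why the start symbol matters without PLY in the way, here is a minimal hand-written sketch of the same grammar as a recursive-descent parser. None of this is PLY code: tokenize, Parser and parse are hypothetical names made up for the illustration, the start argument simply picks which rule is tried first, names are handled in factor rather than expression for brevity, and division uses // to mimic Python 2 integer division.

```python
import re

# Hypothetical stand-in for the PLY lexer: NUMBER, NAME, or a one-character operator.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z]\w*)|(.))")

def tokenize(text):
    """Return a list of (kind, value) pairs."""
    tokens = []
    for number, name, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("NUMBER", int(number)))
        elif name:
            tokens.append(("NAME", name))
        elif op.strip():  # skip stray whitespace matched by the catch-all group
            tokens.append((op, op))
    return tokens

class Parser:
    def __init__(self, tokens, variables):
        self.tokens = tokens
        self.pos = 0
        self.variables = variables

    def peek(self, offset=0):
        index = self.pos + offset
        return self.tokens[index][0] if index < len(self.tokens) else None

    def take(self, kind):
        if self.peek() != kind:
            raise SyntaxError("expected %s, got %s" % (kind, self.peek()))
        value = self.tokens[self.pos][1]
        self.pos += 1
        return value

    def statement(self):
        # statement : assignment | expression
        if self.peek() == "NAME" and self.peek(1) == "=":
            return self.assignment()
        return self.expression()

    def assignment(self):
        # assignment : NAME EQUALS expression
        name = self.take("NAME")
        self.take("=")
        self.variables[name] = self.expression()
        return None

    def expression(self):
        # expression : expression (PLUS | MINUS) term | term
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take(self.peek())
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        # term : term (MULTIPLY | DIVIDE) factor | factor
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take(self.peek())
            right = self.factor()
            value = value * right if op == "*" else value // right
        return value

    def factor(self):
        # factor : LBRACKET expression RBRACKET | NAME | NUMBER
        if self.peek() == "(":
            self.take("(")
            value = self.expression()
            self.take(")")
            return value
        if self.peek() == "NAME":
            return self.variables[self.take("NAME")]
        return self.take("NUMBER")

def parse(text, variables, start="statement"):
    """Parse text beginning at the rule named by start; reject leftover input."""
    parser = Parser(tokenize(text), variables)
    result = getattr(parser, start)()
    if parser.peek() is not None:
        raise SyntaxError("unexpected %s" % parser.peek())
    return result

variables = {}
parse("a = 5", variables, start="assignment")  # an assignment is fine under either start rule
print(parse("6 + 1", variables))               # the default start="statement" accepts a bare expression
try:
    parse("6 + 1", variables, start="assignment")
except SyntaxError as error:
    print("rejected:", error)
```

With start="assignment" the bare expression is rejected, which mirrors what happened in your original grammar; with start="statement" both kinds of input go through.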