
How to make a calculator that accepts mixed words and numbers as input using the Python PLY parser


I would like to ask for help with an exercise: writing a calculator in Python that recognizes both English words and numbers, this time using PLY (Python Lex-Yacc).

The numbers and operators can be given in two forms: as digits and symbols, or written out as English words, e.g. "plus" = "+", "two" = 2, "hundred twelve" = 112.

An example could be these entries:

"twenty five divided by 5" or "25 / 5" or "twenty five divided by five"

the result should be the same, a number 5 (not a string).

" -3 times 4" will give -12

Division by 0 will give "Error": for example, " 34 divided by 0" will give "Error".

This should work for the basic operators "-", "+", "x" and "/" (minus, plus, times and divided by), whether I type the mathematical symbols, the words, or a mix of both.
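The arithmetic rules described above can be pinned down separately from the parser. The following is a minimal sketch, not part of the original code: `apply_operator` is a hypothetical helper that assumes the operator has already been reduced to one of the symbols `+`, `-`, `*`, `/`, and that Python 3 division semantics apply. It returns the string "Error" on division by zero and an int when the quotient is whole.

```python
def apply_operator(op, left, right):
    """Apply a basic operator to two numbers (hypothetical helper).

    Returns the string "Error" for division by zero, as the exercise asks.
    """
    # An earlier "Error" result propagates through the rest of the expression.
    if left == "Error" or right == "Error":
        return "Error"
    if op == '+':
        return left + right
    if op == '-':
        return left - right
    if op == '*':
        return left * right
    if op == '/':
        if right == 0:
            return "Error"
        quotient = left / right
        # Return an int when the division is exact, e.g. 25 / 5 -> 5.
        return int(quotient) if quotient == int(quotient) else quotient
    raise ValueError("unknown operator: %r" % (op,))


print(apply_operator('/', 25, 5))   # 5
print(apply_operator('*', -3, 4))   # -12
print(apply_operator('/', 34, 0))   # Error
```

A grammar action for a binary operation could delegate to a helper like this one instead of repeating the if/elif chain, which keeps the "Error" rule in a single place.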

Here are some parts of my code:

# ------- Calculator tokenizing rules

tokens = (
    'NAME','NUMBER', 'times', 'divided_by', 'plus', 'minus'
)

literals = ['=','+','-','*','/', '(',')']

t_ignore = " \t"

t_plus    = r'\+'
t_minus   = r'-'
t_times   = r'\*'
t_divided_by  = r'/'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

precedence = (
    ('left','+','-'),
    ('left','plus','minus'),
    ('left','times','divided_by'),
    ('left','*','/'),
    ('right','UMINUS'),
)

# Changed the assignment here
def p_statement_assign(p):
    'statement : expression times divided_by plus minus expression'
    variables[p[1]] = p[3]
    p[0] = None

def p_statement_expr(p):
    'statement : expression'
    p[0] = p[1]

def p_expression_binop(p):
    '''expression : expression '+' expression
                  | expression 'plus' expression
                  | expression '-' expression
                  | expression 'minus' expression
                  | expression '*' expression
                  | expression 'times' expression
                  | expression 'divided_by' expression
                  | expression '/' expression'''
    if p[2] ==   '+'  : p[0] = p[1] + p[3]
    elif p[2] == '-': p[0] = p[1] - p[3]
    elif p[2] == '*': p[0] = p[1] * p[3]
    elif p[2] == '/': p[0] = p[1] / p[3]

Are my tokens badly defined? How can I specify that a number may be entered either as English words or as digits?

The test in (if p[2] == '+': p[0] = p[1] + p[3]) seems to need a single character. Why is it not valid to write it as p[2] == 'plus': p[0] = p[1] + p[3]?


I have added the code suggested by sfk, but I still have trouble recognizing numbers and operators entered as English words.

Generating LALR tables
WARNING: 12 shift/reduce conflicts
 Enter your input: calc > one + two
Undefined name 'one'
Undefined name 'two'
P1 is :  0
 Enter your input: calc > 1+2
P1 is :  3
3
 Enter your input: calc > 1 plus 2
Syntax error at 'plus'
P1 is :  2
2

Do you have any idea about what I am doing wrong?


Solution

  • First, add a token definition for each English word:

    t_plustext    = r'plus'
    

    Then add the new tokens to the tokens tuple:

    tokens = (
        'NAME','NUMBER', 'times', 'divided_by', 'plus', 'minus', 'plustext', ....
    )
    

    Finally, use the new tokens in your grammar like this:

    def p_expression_binop(p):
        '''expression : expression '+' expression
                      | expression plustext expression
        '''
    

    UPDATE: here is a working subset of the grammar:

    #!/usr/bin/python
    
    from __future__ import print_function
    
    import sys
    import ply.lex as lex
    import ply.yacc as yacc
    
    # ------- Calculator tokenizing rules
    
    tokens = (
        'NUMBER', 'plustext',
        'one', 'two', 'three',
    )
    
    # Single-character operators are handled as literals. PLY tries the
    # t_ regex rules before the literals, so a rule such as t_plus = r'\+'
    # would turn '+' into a 'plus' token that the grammar below never uses.
    literals = ['=','+','-','*','/', '(',')']
    
    t_ignore = " \t\n"
    
    t_plustext    = r'plus'
    t_one = 'one'
    t_two = 'two'
    t_three = 'three'
    
    def t_NUMBER(t):
        r'\d+'
        try:
            t.value = int(t.value)
        except ValueError:
            print("Integer value too large %s" % t.value)
            t.value = 0
        return t
    
    def t_error(t):
        # Report and skip any character that no rule or literal matches
        print("Illegal character '%s'" % t.value[0])
        t.lexer.skip(1)
    
    precedence = (
        ('left','+','-','plustext'),
        ('left','*','/'),
    )
    
    
    def p_statement_expr(p):
        'statement : expression'
        p[0] = p[1]
        print(p[1])
    
    def p_expression_binop(p):
        '''expression : expression '+' expression
                      | expression plustext expression
                      | expression '-' expression
                      | expression '*' expression
                      | expression '/' expression'''
        if p[2] ==   '+'  : p[0] = p[1] + p[3]
        elif p[2] == '-': p[0] = p[1] - p[3]
        elif p[2] == '*': p[0] = p[1] * p[3]
        elif p[2] == '/': p[0] = p[1] / p[3]
        elif p[2] == 'plus': p[0] = p[1] + p[3]
    
    def p_statement_lit(p):
        '''expression : NUMBER
              | TXTNUMBER
        '''
        p[0] = p[1]
    
    def p_txtnumber(p):
        '''TXTNUMBER : one
             | two
             | three
        '''
        p[0] = w2n(p[1])
    
    def w2n(s):
        if s == 'one': return 1
        elif s == 'two': return 2
        elif s == 'three': return 3
        assert(False)
        # See http://stackoverflow.com/questions/493174/is-there-a-way-to-convert-number-words-to-integers-python for a complete implementation
    
    def process(data):
        lex.lex()
        yacc.yacc()
        #yacc.parse(data, debug=1, tracking=True)
        yacc.parse(data)
    
    if __name__ == "__main__":
        data = open(sys.argv[1]).read()
        process(data)
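
The `w2n` helper above only covers three words. As a rough sketch of how it might be extended to compound numbers such as "twenty five" or "hundred twelve" from the question, the function below accumulates units, tens, and scale words. `words_to_int` and its lookup tables are illustrative names, not part of PLY, and the sketch assumes the number words arrive as one space-separated string (for example, joined from consecutive word tokens by a grammar rule).

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def words_to_int(text):
    """Convert English number words (up to the thousands) to an int."""
    words = text.lower().replace("-", " ").split()
    if not words:
        raise ValueError("empty number")
    total = 0    # value of the completed thousands group
    current = 0  # value of the group currently being built
    for word in words:
        if word == "and":
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            # A bare "hundred" counts as "one hundred".
            current = max(current, 1) * 100
        elif word == "thousand":
            total += max(current, 1) * 1000
            current = 0
        else:
            raise ValueError("not a number word: %r" % (word,))
    return total + current


print(words_to_int("two"))             # 2
print(words_to_int("twenty five"))     # 25
print(words_to_int("hundred twelve"))  # 112
```

The sketch does not check that the word order is sensible (it accepts "five twenty" as 25), so a stricter grammar for number words would still be needed if malformed input has to be rejected.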