Pyparsing parse action and lists TypeError


For a small language I want to parse expressions of the form "X [Y,Z,V]" where X, Y, Z, V are natural numbers.

Below is my attempt.

from pyparsing import *

class Y():
    def __init__(self, ls):
        self.ls = ls

def MakeCombinedList(tokens):
    print(len(tokens)) # prints 4
    print(tokens)      # [5, 1, 2, 3]
    clist = tokens[1]
    clist.append(tokens[0]) # AttributeError: 'int' object has no attribute 'append'
    return clist

def MakeIntList(tokens):
    nlist = tokens[0].split(",")
    ilist = []
    for n in nlist:
        ilist.append(int(n))
    return ilist

def MakeY(tokens):
    Yobj = Y(tokens[0])
    return Yobj

LEFT_BRACK = Suppress(Literal("["))
RIGHT_BRACK = Suppress(Literal("]"))

NATURAL = Word(nums).addParseAction(lambda n: int(n[0]))
NATURAL_LIST = delimitedList(NATURAL, combine = True)
NATURAL_VEC = LEFT_BRACK + NATURAL_LIST +  RIGHT_BRACK
NATURAL_VEC.addParseAction(MakeIntList)

X = NATURAL + NATURAL_VEC
X.addParseAction(MakeCombinedList)

Y = X
Y.addParseAction(MakeY)


print(Y.parseString("5 [1,2,3]").ls)

MakeIntList is supposed to transform a string such as "1,2,3" into the list [1,2,3].

MakeCombinedList is then supposed to append an integer to this list, but the tokens received by MakeCombinedList are not the single integer and the integer list created from MakeIntList, but a list of all the integers, as indicated by my comment.

How can I make tokens[1] inside MakeCombinedList be the result of calling MakeIntList?


Solution

  • These two lines are working against each other, since you use the first to parse separate numeric strings into ints, and then the second just combines them back into a comma-separated string.

    NATURAL = Word(nums).addParseAction(lambda n: int(n[0]))
    NATURAL_LIST = delimitedList(NATURAL, combine=True)
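
To see what combine=True does in isolation, here is a minimal sketch. It uses a plain Word(nums) without the int parse action, and the name digits is only for illustration:

```python
from pyparsing import Word, delimitedList, nums

digits = Word(nums)

# Without combine, each number comes back as its own token.
separate = delimitedList(digits).parseString("1,2,3").asList()
print(separate)   # ['1', '2', '3']

# With combine=True, the matched items and delimiters are joined
# back into one string token.
combined = delimitedList(digits, combine=True).parseString("1,2,3").asList()
print(combined)   # ['1,2,3']
```

So any per-number conversion done before the combine step is thrown away again when the tokens are glued back together.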
    

    The feature you are looking for is Group:

    NATURAL = Word(nums).addParseAction(lambda n: int(n[0]))
    NATURAL_LIST = Group(delimitedList(NATURAL))
    NATURAL_VEC = LEFT_BRACK + NATURAL_LIST +  RIGHT_BRACK
    # no MakeIntList parse action required
    

    Now instead of creating a new string and then re-parsing it in a parse action, you use Group to tell pyparsing to make a sub-structure of the resulting tokens.
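
As a rough end-to-end sketch of the Group approach, the following shows the combining parse action receiving the grouped sub-list as tokens[1]. The function name make_combined_list and the choice to return a plain Python list wrapped in an outer list are assumptions for illustration, not part of the original code:

```python
from pyparsing import Group, Suppress, Word, delimitedList, nums

LEFT_BRACK = Suppress("[")
RIGHT_BRACK = Suppress("]")

NATURAL = Word(nums).addParseAction(lambda t: int(t[0]))
NATURAL_VEC = LEFT_BRACK + Group(delimitedList(NATURAL)) + RIGHT_BRACK
X = NATURAL + NATURAL_VEC

def make_combined_list(tokens):
    # tokens[0] is the leading int, tokens[1] is the grouped sub-list.
    combined = tokens[1].asList()
    combined.append(tokens[0])
    # Wrap in a list so the combined list is returned as one token.
    return [combined]

X.addParseAction(make_combined_list)

result = X.parseString("5 [1,2,3]")
print(result.asList())   # [[1, 2, 3, 5]]
```

Because Group keeps the bracketed numbers together as a single nested token, the parse action sees two tokens instead of four.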

    There is also a little confusion going on here:

    Y = X
    Y.addParseAction(MakeY)
    

    This rebinds Y from the class defined at the top to a pyparsing expression, so MakeY no longer sees the class, and you get a confusing traceback when trying to access the ls attribute. Use a different name for the expression:

    Y_expr = X
    Y_expr.addParseAction(MakeY)
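
The name clash can be reproduced without pyparsing at all. This is a simplified sketch: a plain string stands in for the parser expression, so the exact error differs from the one in the original program, and make_y is a hypothetical stand-in for MakeY:

```python
class Y:
    def __init__(self, ls):
        self.ls = ls

def make_y(tokens):
    # The global name Y is looked up when make_y runs,
    # not when make_y is defined.
    return Y(tokens[0])

before = make_y([[1, 2, 3]])   # Y is still the class here
print(before.ls)

Y = "stand-in for a parser expression"   # rebinding hides the class

try:
    make_y([[1, 2, 3]])
    rebinding_broke_make_y = False
except TypeError:
    # A str is not callable, so the call inside make_y fails.
    rebinding_broke_make_y = True
print(rebinding_broke_make_y)
```

Giving the expression its own name, such as Y_expr, keeps the class reachable from the parse action.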
    

    I wrote the runTests method to make it easier to do simple expression testing and printing, without having to deal with Py2/Py3 print differences:

    Y_expr.runTests("""\
        5 [1,2,3]
        """)
    

    Shows:

    5 [1,2,3]
    [<__main__.Y object at 0x00000241C57B7630>]
    

    Since your Y class just uses the default __repr__ behavior, you can see the contents better if you define your own:

    class Y():
        def __init__(self, ls):
            self.ls = ls
        def __repr__(self):
            return "{}: {}".format(type(self).__name__, vars(self))
    

    Now runTests shows:

    5 [1,2,3]
    [Y: {'ls': 5}]
    

    If the purpose of the Y class is to just give you attribute names for your parsed fields, consider using results names instead:

    X = NATURAL('ls') + NATURAL_VEC
    
    Y_expr = X
    #~ Y_expr.addParseAction(MakeY)
    
    # what you had written originally    
    print(Y_expr.parseString("5 [1,2,3]").ls)
    

    Will just print:

    5
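
Results names can also be attached to the grouped list, so both parts are reachable by name. A small sketch, assuming the hypothetical names head and vec:

```python
from pyparsing import Group, Suppress, Word, delimitedList, nums

NATURAL = Word(nums).addParseAction(lambda t: int(t[0]))
# Name the Group itself so the name refers to the nested list.
NATURAL_VEC = Suppress("[") + Group(delimitedList(NATURAL))("vec") + Suppress("]")
X = NATURAL("head") + NATURAL_VEC

result = X.parseString("5 [1,2,3]")
head = result.head
vec = result.vec.asList()
print(head, vec)
```

With names on both fields, no wrapper class or extra parse action is needed just to label the parsed values.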