Search code examples
pythonlexerply

List implementation in PLY?


I'm trying to implement a list in parser of the PLY python library. This is the code I have so far-

def p_lisp_list(p):
    "lisp : LPAR array RPAR"
    p[0] = [p[1]]


def p_array_1(p):
    "array : "
    p[0] = [p[1]]


def p_array_2(p):
    "array : lisp array"
    p[0] = [p[1], p[2]]

The 'lisp' term also holds numbers and other expressions not shown.

My understanding is that the bottom rule should function recursively to allow the list behavior and the middle rule should allow for empty lists. I'm confused as to what I'm missing.


Solution

    1. In this reduction function:

      def p_lisp_list(p):
          "lisp : LPAR array RPAR"
          p[0] = [p[1]]
      

      you are setting the result, p[0], to the semantic value of LPAR, since LPAR is the first grammar symbol on the right-hand side. That's certainly not correct; you wanted to set the result to p[2], the value of the array symbol. (Also, the brackets are probably wrong. See point 3 below.)

    2. In this reduction function:

      def p_array_1(p):
          "array : "
          p[0] = [p[1]]
      

      there is no p[1]. The argument p "is a sequence containing the values of each grammar symbol in the corresponding rule" (from the Ply manual, section 6.1. In this case, the corresponding rule is array : , which has no grammar symbols on the right-hand side. So only p[0] is valid, and that refers to the result of the reduction. A common action for empty rules is p[0] = [], but of course there are many other possibilities. As we'll see, the result of this rule is basically what you will be using to represent LISP's NIL (if you're going to create LISP-like lists) and [] is not a bad choice. But you'd actually be better off using a single distinguished immutable object, perhaps the empty tuple (), so that there is no possibility of modifying NIL.

    3. Let's assume that the above two issues have been addressed. Now, consider this reduction function

      def p_array_2(p):
          "array : lisp array"
          p[0] = [p[1], p[2]]
      

      where you are trying to extend an array with another element. Since you're using right-recursion (which is sometimes useful, but not very often), the array is being built up right-to-left:

      [  lisp1 lisp2 lisp3 lisp4       ]
      
                                  ----- => []                  (from p_array_1)
                            lisp  array
                           ------------ => [lisp4, []]         (from p_array_2)
                     lisp  array
                     ----------- => [lisp3, [lisp4, []]]       (from p_array_2)
               lisp  array
               ----------- => [lisp2, [lisp3, [lisp4, []]]]    (from p_array_2)
         lisp  array
         ----------- => [lisp1, [lisp2, [lisp3, [lisp4, []]]]] (from p_array_2) 
      [  array                         ]
      -------- => [[lisp1, [lisp2, [lisp3, [lisp4, []]]]]]     (from p_lisp_list) 
      

    If you were trying to emulate LISP's cons nodes, that's pretty close to correct, aside from being embedded in an extra single-element list by the p_lisp_list rule (which should have just copied the value instead of enclosing it in a one-element list).

    However, if you were trying to create a Python-like array, you might have wanted [lisp1, lisp2, lisp3, lisp4]. You could certainly achieve that with this grammar, but the easiest solution would be to use left-recursion and build the array up in logical order:

    def p_lisp_list:
        ''' lisp : '[' list ']' '''
        p[0] = p[2]
    
    def p_list_empty:
        ''' list : '''
        p[0] = []
    
    def p_list_append:
        ''' list : list lisp '''
        p[0] = p[1]
        p[0].append(p[2])