Search code examples
listschemeracketequivalence

Comparing structure equality of lists


How can I check the structural equality of two lists in Scheme? For example, (a (b) (c d)) is equal to (a b (c d) (e f g)), and (a b) is equal to (a b c). The data content of the lists does not matter, only the structural hierarchy of the nested lists within, that is, the number of sublists, and the number of sublists of those sublists, and so on.

I made a function called eqStruct that takes two lists as arguments. It is supposed to count the number of sublists in each list, a and b, and then return a boolean. The function looks at each element of list a, and then each element of list b. It uses c and d as counters for the number of sublists in a and b respectively, and these are incremented when atom? returns false when called on an element of the list. After each time the first atom of the list is looked at, the list is set to equal itself without its first item (list-tail), and the unless loops terminate when the whole lists 'a and 'b have been looked at.

(define (eqStruct 'a 'b)

    (define c 0)
    (define d 0)

    (define (atom? x) (not (pair? x)))

    (unless (null? a)
         (unless (atom? (first a)) (set! c (+ c 1))
         (set! a (list-tail a 1))
    )
    (unless (null? b)
         (unless (atom? (first b)) (set! d (+ d 1))
         (set! b (list-tail b 1))
    )

    (eq? c d)
)  

That final line is supposed to be a return statement sort of, because I want the function to be a predicate and return whether the lists are of equal structure, a boolean value, but I was not sure how to make that happen, or if I was even thinking about this problem the right way.


Solution

  • Your approach

    Your code (as well as your comments about it) point to some misconceptions about certain Racket constructs.

    A big issue is that unless is not a looping construct; it simply conditionally executes some code. E.g., if you evaluate (unless (= 1 2) (display 'x)), you don't print x forever, but just print it once. This means that the code you've got won't be doing what you expect it to.

    However, if we assume that was (so that we understand what your code was trying to do), you've got a reasonable idea. You're trying to iterate down the list a and count up how many elements it has that are lists.

        (unless (null? a)
             (unless (atom? (first a)) (set! c (+ c 1))
             (set! a (list-tail a 1))
        )
    

    This is a fine idea if your lists are always going to be just one layer deep. For instance, you'd check ((a b) c) and ((a) b c) correctly, but you wouldn't get ((a (b)) c) (which has structure ((())())) and (((a) (b)) c) (which has structure ((()())()) right. If you just wanted to check whether a and b had the same number of elements that were lists, though (and if unless were a looping construct), you'd be on the right track.

    A slightly different (but similar) approach

    The general approach that you're taking in trying to extract some value that represents the structure of the list (in your case, a number) from each of the input lists and comparing those representative values, is a good one. I think that you just need to use a slightly different representative value, and to compare them a little bit differently (eq? is fine for your case, I think, but the representative value I have in mind needs equal?).

    Note that, given the way they're defined, structures can be compared with equal?. Given that observation, I think the easiest way to do this is probably to write a procedure that takes a list and returns its canonical structure. Also notice that by using list? in the inner if, we properly handle cases like ((a b) ()) which has a structure of (() ()).

    (define (structure list)
      (if (null? list)
          list
          (if (list? (car list))
              (cons (structure (car list)) (structure (cdr list)))
              (structure (cdr list)))))
    
    > (structure '(a (b) (c d)))
    (() ())
    > (structure '((a b) ()))
    (() ())
    

    Then, to compare the structures of two lists, you can just get the structures of the two lists and compare them.

    (define (structure-equal? list1 list2)
      (equal? (structure list1)
              (structure list2)))
    
    > (structure-equal? '(a b) '(a b c))
    #t
    > (structure-equal? '(a (b) (c d)) '(a b (c d) (e f g)))
    #t
    

    Understanding how why equal? works on the structures shouldn't be too hard, but understanding how structure works is more important. Let's look at it more closely

    (define (structure list)
      (if (null? list)
          list ; or '()
          (if (list? (car list))                                    
              (cons (structure (car list)) (structure (cdr list)))
              (structure (cdr list)))))
    

    We're assuming that the list that comes in is, in fact, a list. This means that it's either the empty list () (so (null? list) is true), or else it's a pair that has a car and cdr. The structure of the empty list is obviously the empty list, so we can just return list (because it's empty). The other case is more complicated; list is a pair whose car is some value and whose cdr is the rest of the list. If the car is also a list, then it adds something to the structure of list, but if it's not, then it's just some element and doesn't contribute to the structure of the list. More specifically, when list is the pair (x . y):

    • if x is a list, then the structure of (x . y) is (<structure of x> . <structure of y>)
    • if x is not a list, then the structure of (x . y) is <structure of y>.

    This should explain the logic inside the inner if (we create a pair (a . b) with (cons a b)).