Search code examples
treelispcommon-lisp

Definition of tree structure in Lisp


From the Common Lisp HyperSpec glossary:

tree n. 1. a binary recursive data structure made up of conses and atoms: the conses are themselves also trees (sometimes called "subtrees" or "branches"), and the atoms are terminal nodes (sometimes called leaves). Typically, the leaves represent data while the branches establish some relationship among that data. 2. in general, any recursive data structure that has some notion of "branches" and leaves.

tree structure n. (of a tree) the set of conses that make up the tree. Note that while the car[1b] component of each such cons is part of the tree structure, the objects that are the cars of each cons in the tree are not themselves part of its tree structure unless they are also conses.

The last sentence in the definition of tree structure raises a question, which is, can the same be said about cdrs too?

Use of the word binary in the definition of "tree" seems to suggest that there is no difference between car vs cdr for the purposes of trees, but then the definition of "tree structure" seems to treat cars special, so I am confused.


Solution

  • Short answer

    The last sentence in the definition of tree structure raises a question, which is, can the same be said about cdrs too?

    I think the answer is "yes." There's a similar definition for list structure with almost identical wording. There's more potential for confusion about in list structure about whether the value of a car of a cons is part of the list structure, since questions can arise about, e.g., "what does it mean to replace X in the list (X (X) Y)?" The cdr isn't really in question much, since the cdr is the rest of the list; it's sort of obviously part of the list structure.

    For tree structure, I think that there's less ambiguity; cons in the car or the cdr is a subtree. The definitions of tree structure and list structure are almost identical in places, and I wouldn't be surprised if someone wrote the definition for list structure, and then copied it for tree structure, making the minimal number of changes necessary to be accurate. That would leave the bit about cars in there, even though the question that it answers probably wouldn't arise in practice.

    Longer answer

    Let's look at the definition of list structure and compare:

    list structure n. (of a list) the set of conses that make up the list. Note that while the car component of each such cons is part of the list structure, the objects that are elements of the list (i.e., the objects that are the cars of each cons in the list) are not themselves part of its list structure, even if they are conses, except in the (circular) case where the list actually contains one of its tails as an element. (The list structure of a list is sometimes redundantly referred to as its ``top-level list structure'' in order to emphasize that any conses that are elements of the list are not involved.)

    Note the specific places where these differ:

    (list structure) Note that while the car component of each such cons is part of the list structure, the objects that are elements of the list (i.e., the objects that are the cars of each cons in the list) are not themselves part of its list structure, even if they are conses.

    (tree structure) Note that while the car component of each such cons is part of the tree structure, the objects that are the cars of each cons in the tree are not themselves part of its tree structure unless they are also conses.

    That means that in

    (1 (2) 3) == (cons 1 (cons (cons 2 nil) (cons 3 nil)))
    

    there are three cons cells in the list structure, and four cons cells in the tree structure.

    Where does this actually matter? It becomes important to define these terms precisely so that the specification can easily define what parts of a list or tree are traversed or modified by particular functions.

    nsubst can modify tree structure

    For instance, the functions nsubst and friends, whose documentation states:

    nsubst, nsubst-if, and nsubst-if-not might alter the tree structure of tree.

    A specific definition of tree structure allows us to understand what things may and may not be changed by nsubst.

    tree structure n. (of a tree) the set of conses that make up the tree. Note that while the car component of each such cons is part of the tree structure, the objects that are the cars of each cons in the tree are not themselves part of its tree structure unless they are also conses.

    So, what this is telling us is that for any cons cell x in the tree, nsubst might do (setf (car x) …), so that the (car x) might be something different later on, but it won't modify the actual object that would be returned by (car x) (unless it's a cons, of course). This could be important in cases where the value of (car x) is an object that might have trees inside of it. So for instance, nsubst won't descend into vectors, but it will replace vectors:

    (let* ((l (list 1 2 3))     ; a list
           (v (vector 0 l 4))   ; a vector that contains the list (and other elements)
           (tree (cons l v)))   ; a tree containing the list and the vector
      (nsubst 'x l tree))       ; replace the list in the tree with X
    ;=> (X . #(0 (1 2 3) 4))    ; nsubst doesn't descend into the vector, because it's
                                ; not tree structure
    

    delete-duplicates can modify list structure

    On the other hand delete-duplicates will only modify list structure:

    delete-duplicates, when sequence is a list, is permitted to setf any part, car or cdr, of the top-level list structure in that sequence. When sequence is a vector, delete-duplicates is permitted to change the dimensions of the vector and to slide its elements into new positions without permuting them to produce the resulting vector.