Prolog - unifying two lists with/without variables

This Prolog query succeeds:

?- [a,b,c,d]  =  [a|[b,c,d]].

true

and this one

?- [X,Y] = [a|[b,c,d]].

returns false.

I do not fully understand why the query with [X, Y] fails. Tracing is not helpful here. I would expect the following bindings to hold

X = a
Y = [b,c,d]

and the query to succeed.

What does | do besides splitting on head and tail?

Solution

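A list in Prolog is implemented as a linked list of binary functors (drawn as (|)/2 below; the actual functor name is '.' in ISO Prolog and '[|]' in SWI-Prolog 7 and later). If you write a list like [a, b, c, d], in reality it looks like: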

+-------+
| (|)/2 |
+---+---+   +-------+
| o | o---->| (|)/2 |
+-|-+---+   +---+---+   +-------+
  v         | o | o---->| (|)/2 |
  a         +-|-+---+   +---+---+   +-------+
              v         | o | o---->| (|)/2 |
              b         +-|-+---+   +---+---+   
                          v         | o | o----> []
                          c         +-|-+---+
                                      v
                                      d

or in Prolog notation [a | [b | [c | [d | [] ] ] ] ]. The comma-separated list is syntactic sugar: if you write [a, b, c, d], the Prolog interpreter converts it to the representation above.

Since [b, c, d] is equal to

[ b | [ c | [ d | [] ] ] ]

the term [ a | [b, c, d] ] is equal to

[a | [b | [c | [d | [] ] ] ] ]
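The equivalence between the notations can be checked directly. This is a minimal sketch, assuming a standard Prolog system such as SWI-Prolog; the predicate names same_list/0 and head_tail/2 are made up for this illustration:

```prolog
% The comma notation and the fully nested bar notation denote the same term,
% so the structural comparison ==/2 succeeds without binding anything.
same_list :-
    [a, b, c, d] == [a | [b | [c | [d | []]]]].

% The tail of [a | [b, c, d]] is the whole list [b, c, d], not one element.
head_tail(Head, Tail) :-
    [Head | Tail] = [a | [b, c, d]].

% ?- same_list.
% true.
%
% ?- head_tail(H, T).
% H = a,
% T = [b, c, d].
```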

But the list [X, Y] is just shorthand for

[ X | [ Y | [] ] ]

or in a structural way:

+-------+
| (|)/2 |
+---+---+   +-------+
| o | o---->| (|)/2 |
+-|-+---+   +---+---+
  v         | o | o----> []
  X         +-|-+---+
              v
              Y

If we then unify it with [a | [b | [c | [d | [] ] ] ] ], the "outer" cell matches, so X = a; the next cell matches as well, so Y = b; but that leaves [] = [ c | [ d | [] ] ]. This last part does not unify, and thus the query fails. X and Y are therefore not the problem. The problem is that [] is a constant, and it does not unify with the compound term that represents [c, d].
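The difference between a closed two-element pattern and an open one can be sketched as follows. The predicate names two_elements/3 and first_two/4 are hypothetical helpers for this illustration, not built-ins:

```prolog
% [X, Y] ends in [], so this succeeds only for lists of exactly two elements.
two_elements(List, X, Y) :-
    List = [X, Y].

% [X, Y | Rest] leaves the tail open, so this succeeds for any list with
% at least two elements; Rest is bound to whatever remains.
first_two(List, X, Y, Rest) :-
    List = [X, Y | Rest].

% ?- two_elements([a | [b, c, d]], X, Y).
% false.
%
% ?- first_two([a | [b, c, d]], X, Y, Rest).
% X = a,
% Y = b,
% Rest = [c, d].
```

In the first query the constant [] would have to unify with [c, d], which is exactly the step that fails.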

If we instead unify [ X | Y ] with [a, b, c, d], for example, we get:

?- [ X | Y ] = [a, b, c, d].
X = a,
Y = [b, c, d].

So to conclude, one can say that | itself "does" nothing. It is a functor, just like f in f(1, 2). Lisp uses cons for this, and nil for the empty list. So [1, 4, 2, 5] looks in Lisp like (cons 1 (cons 4 (cons 2 (cons 5 nil)))), and with the same names in Prolog it would look like cons(1, cons(4, cons(2, cons(5, nil)))). That is only a bit cumbersome to write. In fact, the comma-separated notation is the more "magic" part. Prolog just performs unification on lists, exactly as it does for other functors and constants.
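To make the analogy concrete, here is a small sketch that spells lists out with ordinary terms. The names cons/2, nil, and list_cons/2 are chosen for this illustration; they are not predefined in Prolog:

```prolog
% list_cons(?List, ?Cons): Cons is the cons/nil spelling of List.
list_cons([], nil).
list_cons([H | T], cons(H, C)) :-
    list_cons(T, C).

% ?- list_cons([1, 4, 2, 5], C).
% C = cons(1, cons(4, cons(2, cons(5, nil)))).
%
% The original question, rewritten with cons/nil, fails for the same reason:
% after X = a and Y = b, the constant nil must unify with cons(c, cons(d, nil)).
% ?- cons(X, cons(Y, nil)) = cons(a, cons(b, cons(c, cons(d, nil)))).
% false.
```

Written this way, nothing about the failure is specific to lists: it is plain term unification, with a constant meeting a compound term.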