prolog evaluation prolog-anonymous-variable sld-resolution

How are anonymous variables interpreted in Prolog?

A quick and simple question regarding what role anonymous variables play in the resolution of a Prolog query given a set of program rule. So, the way I understand how the simplest form of SLD resolution works, an SLD tree is constructed by taking some term from a set of goal terms (based on a selection rule, e.g. FIRST) and going through all the program rules to see which rule's left hand side (the consequent, so to say) can be unified with the term at hand. The way to unify two given terms is to take a difference set of two terms and see if variables can be substituted for terms such that the difference vanishes, you do this by successively taking the leftmost single difference and checking if, out of the two sets constituting the difference, one is a variable not appearing in the other and composing your current substitution with one mapping the variable onto the term (starting with the empty or identity substitution).

Now, when anonymous variables (_) come into play, I suspect the trick in doing it correctly and efficiently lies in changing the way you determine the leftmost difference between two terms to ignore a pair of terms whenever one of them is an anonymous variable. The obviously correct way to do it would be to rename every instance of _ in the goal and the program set to a new variable name and solve using those.

How is it actually done? Is my idea sufficient, or is there more to it than that? (Also, would appreciate it very much if something is missing in the way I understand SLD resolution works, barring negation, call, capsuling, arithmetic predicates and the more complicated stuff.)

Solution

Prolog anonymous variables don't play a role in SLD resolution or in term unification but do play a practical role in Prolog code and Prolog queries. A fundamental aspect of anonymous variables is that each occurrence of an anonymous variable is a different variable. Consider the following query:

| ?- a(_, _) = a(1, 2).

yes

The unification would have failed if the two anonymous variables were the same variable. Now consider the query:

| ?- a(X, _) = a(1, 2).

X = 1
yes

Variable bindings are only reported for variables that are not anonymous variables. This allows using an anonymous variable everytime we are not interested in any bindings for a variable.

Anonymous variables also simplify writing predicate definitions where they similarly act as "don't care" variables. Consider as an example the usual definition of the member/2 predicate:

member(Element, [Element| _]).
member(Element, [_| List]) :-
    member(Element, List).

In the first clause, we don't care about the list tail. In the second clause, we don't care about the list head. By using anonymous variables, we can ignore those sub-terms and avoid the compiler complaining about variables that would be used once in a clause.

Update

Note that all different variables in a query get unique internal variable references, not to be confused with variable names as typed by the user. The variables names are only used by the top-level interpreter to report bindings for successful queries. The inference mechanism used to prove a query use the variable (internal) references. The following query, using the ISO Prolog standard read_term/2 predicate with standard options may help:

| ?- read_term(Term, [variable_names(Names), variables(Variables)]).
a(X, _, Y, _).

Names = ['X'=A,'Y'=B]
Term = a(A,C,B,D)
Variables = [A,C,B,D]

yes

In the term read, there are four distinct variables but only two of them have (user provided) names.