Search code examples
prologprolog-diflogical-purity

Using \==/2 or dif/2


If I want to make sure that two variables do not instantiate to the same term, what is the preferred way to do it?

Let's say I need to find directed edges in a graph, and a node cannot have an edge to itself:

node(a, x, y). node(b, z, x). node(c, y, y).

(the edges here are a -> c, b -> a, but not c -> c)

The following works:

edge(A, B) :- node(A, _, X), node(B, X, _), A \== B.

This works too [swi-prolog]:

edge(A, B) :- dif(A, B), node(A, _, X), node(B, X, _).

This does not work, apparently (because neither A nor B are instantiated yet?):

edge(A, B) :- A \== B, node(A, _, X), node(B, X, _).

I guess my problem with the first solution is that, with a more complex node predicate, a lot of unnecessary unifications might take place before edge fails. The dif on the other hand is in a library, which suggests that it is not meant to be used in such a simple case (although it has the exact function that I seem to be looking for).


Solution

  • For elegance and didactic reasons alone, dif/2 is clearly preferable here and also in the vast majority of other cases, since as you already note "a lot of unnecessary unifications might take place" otherwise, and also because dif/2 is a pure and nicely declarative predicate that can be used in all directions and at any place in the clause body without changing the meaning of the program, in contrast to (\==)/2. dif/2 is also an autoloaded predicate in SWI-Prolog, meaning that you need not import any library explicitly to use it, and dif/2 is available like any built-in predicate.

    If you use dif/2 you can reason much more easily about your code. For example, in your case, you start with:

    edge(A, B) :- node(A, _, X), node(B, X, _), dif(A, B).
    

    and then, as you know that dif/2 is a completely pure predicate, you know that you can also write this as:

    edge(A, B) :- dif(A, B), node(A, _, X), node(B, X, _).
    

    Further, since you know that dif/2 always terminates, you know that this change can at most improve the termination properties of your program.

    Like all constraints, dif/2 is meant to be used. I highly recommend it instead of impure predicates that are not commutative.

    In case you are worried about performance, here is a small comparison, just comparing dif/2 against the non-declarative (\==)/2 in a use case where the two predicates can be used interchangeably:

    ?- N = 1_000_000, time((between(1,N,_),dif(a,b),false)).
    % 11,000,005 inferences, 0.352 CPU in 0.353 seconds (100% CPU, 31281029 Lips)
    
    ?- N = 1_000_000, time((between(1,N,_),a\==b,false)).
    %@ % 3,000,001 inferences, 0.107 CPU in 0.107 seconds (99% CPU, 28167437 Lips)
    

    So, there are sometimes performance benefits when using (\==)/2. However, there are also much more severe drawbacks when using such a low-level predicate: It is harder to understand, more error-prone, and not declarative.

    I therefore recommend to simply use dif/2 to express that two terms are different.