Different ways of computing list length in Prolog

Here are 4 different ways of computing list length in Prolog:

:- use_module(library(clpz)).

list_length1([], 0).
list_length1([_|T], N) :-
    N #> 0,
    N1 #= N - 1,
    list_length1(T, N1).


list_length2(A, N) :-
    N #>= 0,
    list_length2(A, 0, N).

list_length2([], N, N).
list_length2([_|T], I, N) :-
    I #< N,
    I1 #= I + 1,
    list_length2(T, I1, N).


list_length3(A, N) :-
    list_length3(A, 0, N).

list_length3([], N, N).
list_length3([_|T], I, N) :-
    I1 is I + 1,
    list_length3(T, I1, N).


list_length4([], 0).
list_length4([_|T], N) :-
    list_length4(T, N0),
    N is N0 + 1.

list_length1 - mathematically straightforward, works in all directions
list_length2 - not sure what is this good for, created by analogy
list_length3 - last call optimized, works in one direction only
list_length4 - recursive, works in one direction

Is there any merit to list_length2 or list_length4?

Solution

list_length2 - not sure what is this good for, created by analogy

To better understand the difference between version 1 and 2, consider a generalization of the predicate. A gross generalization. Simply add the facts list_length1(_,_). and list_length2(_, _, _). in front of their definitions. Now from a declarative viewpoint, these definitions get useless. But from a procedural viewpoint we now get some useful hints:

?- list_length1("abc",N).
   true
;  clpz:(_A+1#=N), clpz:(N in 1..sup), clpz:(_A in 0..sup)
;  clpz:(_A+1#=N), clpz:(_B+1#=_A), clpz:(N in 2..sup), clpz:(_A in 1..sup), clpz:(_B in 0..sup)
;  clpz:(_A+1#=N), clpz:(_B+1#=_A), clpz:(_C+1#=_B), clpz:(N in 3..sup), clpz:(_A in 2..sup), clpz:(_B in 1..sup), clpz:(_C in 0..sup)
;  N = 3.
?- list_length2("abc",N).
   clpz:(N in 0..sup)
;  clpz:(N in 1..sup)
;  clpz:(N in 2..sup)
;  clpz:(N in 3..sup)
;  N = 3.

So, version 1 has to create a chain of trivial arithmetical relations and their accompanied variables proportional to the length of the list that will get updated by each further inference and that will get collapsed in the very last moment, whereas version 2 is happy with a single domain (which in this case is infinite) that gets more and more constrained by each step.

(The operations related to I are effectively all trivial (is)/2-computations.)

Try ?- list_length2(L,N), N = 10000. to see the difference performance-wise.

In theory, version 1 could kind of avoid the creation of these trivial relations and update a single variable avoiding the intermediary variables, but so far I have not seen a system capable of doing this. And even if there would be such a system, version 1 would still be slower than version 2 which is just happy with a single variable.

Is there any merit to list_length4?

This version produces answers in a very costly, unnecessarily quadratic manner. For each further answer of list_length4(L, N) it needs time proportional to the length of L. Try, e.g. ?- list_length3(L,N), N = 10000. and compare it to version 4.

And for the mathematical straightforwardness, please consider to use (#)/1 in your programs. So in stead of N #>= 0 just write #N #>= 0.