Search code examples
prologclpfdclpclpz

Different ways of computing list length in Prolog


Here are 4 different ways of computing list length in Prolog:

:- use_module(library(clpz)).

list_length1([], 0).
list_length1([_|T], N) :-
    N #> 0,
    N1 #= N - 1,
    list_length1(T, N1).


list_length2(A, N) :-
    N #>= 0,
    list_length2(A, 0, N).

list_length2([], N, N).
list_length2([_|T], I, N) :-
    I #< N,
    I1 #= I + 1,
    list_length2(T, I1, N).


list_length3(A, N) :-
    list_length3(A, 0, N).

list_length3([], N, N).
list_length3([_|T], I, N) :-
    I1 is I + 1,
    list_length3(T, I1, N).


list_length4([], 0).
list_length4([_|T], N) :-
    list_length4(T, N0),
    N is N0 + 1.
  • list_length1 - mathematically straightforward, works in all directions
  • list_length2 - not sure what is this good for, created by analogy
  • list_length3 - last call optimized, works in one direction only
  • list_length4 - recursive, works in one direction

Is there any merit to list_length2 or list_length4?


Solution

  • list_length2 - not sure what is this good for, created by analogy

    To better understand the difference between version 1 and 2, consider a generalization of the predicate. A gross generalization. Simply add the facts list_length1(_,_). and list_length2(_, _, _). in front of their definitions. Now from a declarative viewpoint, these definitions get useless. But from a procedural viewpoint we now get some useful hints:

    ?- list_length1("abc",N).
       true
    ;  clpz:(_A+1#=N), clpz:(N in 1..sup), clpz:(_A in 0..sup)
    ;  clpz:(_A+1#=N), clpz:(_B+1#=_A), clpz:(N in 2..sup), clpz:(_A in 1..sup), clpz:(_B in 0..sup)
    ;  clpz:(_A+1#=N), clpz:(_B+1#=_A), clpz:(_C+1#=_B), clpz:(N in 3..sup), clpz:(_A in 2..sup), clpz:(_B in 1..sup), clpz:(_C in 0..sup)
    ;  N = 3.
    ?- list_length2("abc",N).
       clpz:(N in 0..sup)
    ;  clpz:(N in 1..sup)
    ;  clpz:(N in 2..sup)
    ;  clpz:(N in 3..sup)
    ;  N = 3.
    

    So, version 1 has to create a chain of trivial arithmetical relations and their accompanied variables proportional to the length of the list that will get updated by each further inference and that will get collapsed in the very last moment, whereas version 2 is happy with a single domain (which in this case is infinite) that gets more and more constrained by each step.

    (The operations related to I are effectively all trivial (is)/2-computations.)

    Try ?- list_length2(L,N), N = 10000. to see the difference performance-wise.

    In theory, version 1 could kind of avoid the creation of these trivial relations and update a single variable avoiding the intermediary variables, but so far I have not seen a system capable of doing this. And even if there would be such a system, version 1 would still be slower than version 2 which is just happy with a single variable.

    Is there any merit to list_length4?

    This version produces answers in a very costly, unnecessarily quadratic manner. For each further answer of list_length4(L, N) it needs time proportional to the length of L. Try, e.g. ?- list_length3(L,N), N = 10000. and compare it to version 4.

    And for the mathematical straightforwardness, please consider to use (#)/1 in your programs. So in stead of N #>= 0 just write #N #>= 0.