Search code examples
prologswi-prologsuccessor-arithmeticsprolog-cut

What are the optimal green cuts for successor arithmetics sum?


To grok green cuts in Prolog I am trying to add them to the standard definition of sum in successor arithmetics (see predicate plus in What's the SLD tree for this query?). The idea is to "clean up" the output as much as possible by eliminating all useless backtracks (i.e., no ... ; false) while keeping identical behavior under all possible combinations of argument instantiations - all instantiated, one/two/three completely uninstantiated, and all variations including partially instantiated args.

This is what I was able to do while trying to come as close as possible to this ideal (I acknowledge false's answer to how to insert green cuts into append/3 as a source):

natural_number(0).
natural_number(s(X)) :- natural_number(X).

plus(X, Y, X) :- (Y == 0 -> ! ; Y = 0), (X == 0 -> ! ; true), natural_number(X).
plus(X, s(Y), s(Z)) :- plus(X, Y, Z).

Under SWI this seems to work fine for all queries but those with shape ?- plus(+X, -Y, +Z)., as for SWI's notation of predicate description. For instance, ?- plus(s(s(0)), Y, s(s(s(0)))). yields Y = s(0) ; false.. My questions are:

  • How do we prove that the above cuts are (or are not) green?
  • Can we do better than the above program and eliminate also the last backtrack by adding some other green cuts?
  • If yes, how?

Solution

  • First a minor issue: the common definition of plus/3 has the first and second argument exchanged which allows to exploit first-argument indexing. See Program 3.3 of the Art of Prolog. That should also be changed in your previous post. I will call your exchanged definition plusp/3 and your optimized definition pluspo/3. Thus, given

    plusp(X, 0, X) :- natural_number(X).
    plusp(X, s(Y), s(Z)) :- plusp(X, Y, Z).
    

    Detecting red cuts (question one)

    How to prove or disprove red/green cuts? First of all, watch for "write"-unifications in the guard. That is, for any such unifications prior to the cut. In your optimized program:

    pluspo(X, Y, X) :- (Y == 0 -> ! ; Y = 0), (X == 0 -> ! ; true), ...

    I spot the following:

    pluspo(X, Y, X) :- (...... -> ! ; ... ), ...

    So, let us construct a counterexample: To make this cut cut in a red manner, the "write unification" must make its actual guard Y == 0 true. Which means that the goal to construct must contain the constant 0 somehow. There are only two possibilities to consider. The first or third argument. A zero in the last argument means that we have at most one solution, thus no possibility to cut away further solutions. So, the 0 has to be in the first argument! (The second argument must not be 0 right from the beginning, or the "write unification would not have a detrimental effect.). Here is one such counterexample:

    ?- pluspo(0, Y, Y).

    which gives one correct solution Y = 0, but hides all the other ones! So here we have such an evil red cut! And contrast it to the unoptimized program which gave infinitely many solutions:

       Y = 0
    ;  Y = s(0)
    ;  Y = s(s(0))
    ;  Y = s(s(s(0)))
    ;  ... .
    

    So, your program is incomplete, and any questions about further optimizing it are thus not relevant. But we can do better, let me restate the actual definition we want to optimize:

    plus(0, X, X) :- natural_number(X).
    plus(s(X), Y, s(Z)) :- plus(X, Y, Z).
    

    In practically all Prolog systems, there is first-argument indexing, which makes the following query determinate:

    ?- plus(s(0),0,X).
       X = s(0).
    

    But many systems do not support (full) third argument indexing. Thus we get in SWI, YAP, SICStus:

    ?- plus(X, Y, 0).
       X = Y, Y = 0
    ;  false.
    

    What you probably wanted to write is:

    pluso(X, Y, Z) :-
       % Part one: green cuts
       (  X == 0 -> !  % first-argument indexing
       ;  Z == 0 -> !  % 3rd-argument indexing, e.g. Jekejeke, ECLiPSe
       ;  true
       ),
       % Part two: the original unifications
       X = 0,
       Y = Z,
       natural_number(Z).
    pluso(s(X), Y, s(Z)) :- pluso(X, Y, Z).
    

    Note the differences to pluspo/3: There are now only tests prior to the cut! All unifications are thereafter.

    ?- pluso(X, Y, 0).
       X = Y, Y = 0.
    

    The optimizations so far relied only on investigating the heads of the two clauses. They did not take into account the recursive rule. As such, they can be incorporated into a Prolog compiler without any global analysis. In O'Keefe's terminology, these green cuts might be considered blue cuts. To cite The Craft of Prolog, 3.12:

    Blue cuts are there to alert the Prolog system to a determinacy it should have noticed but wouldn't. Blue cuts do not change the visible behavior of the program: all they do is make it feasible.

    Green cuts are there to prune away attempted proofs that would succeed or be irrelevant, or would be bound to fail, but you would not expect the Prolog system to be able to tell that.

    However, the very point is that these cuts do need some guards to work properly.

    Now, you considered another query:

    ?- pluso(X, s(s(0)), s(s(s(0)))).
       X = s(0)
    ;  false.
    

    or to put a simpler case:

    ?- pluso(X, s(0), s(0)).
       X = 0
    ;  false.
    

    Here, both heads apply, thus the system is not able to determine determinism. However, we know that there is no solution to a goal plus(X, s^n, s^m) with n > m. So by considering the model of plus/3 we can further avoid choicepoints. I'll be right back after this break:


    Better use call_semidet/1!

    It gets more and more complex and chances are that optimizations might easily introduce new errors in a program. Also optimized programs are a nightmare to maintain. For practical programming purposes use rather call_semidet/1. It is safe, and will produce a clean error should your assumptions turn out to be false.


    Back to business: Here is a further optimization. If Y and Z are identical, the second clause cannot apply:

    pluso2(X, Y, Z) :-
       % Part one: green cuts
       (  X == 0 -> !  % first-argument indexing
       ;  Z == 0 -> !  % 3rd-argument indexing, e.g. Jekejeke, ECLiPSe
       ;  Y == Z, ground(Z) -> !
       ;  true
       ),
       % Part two: the original unifications
       X = 0,
       Y = Z,
       natural_number(Z).
    pluso2(s(X), Y, s(Z)) :- pluso2(X, Y, Z).
    

    I (currently) believe that pluso2/3 is the optimal usage of green/blue cuts w.r.t. leftover choicepoints. You asked for a proof. Well, I think that is well beyond an SO answer...

    The goal ground(Z) is necessary to ensure the non-termination properties. The goal plus(s(_), Z, Z) does not terminate, that property is preserved by ground(Z). Maybe you think it is a good idea to remove infinite failure branches too? In my experience, this is rather problematic. In particular, if those branches are removed automatically. While at first sight it seems to be a good idea, it makes program development much more brittle: An otherwise benign program change might now disable the optimization and thus "cause" non-termination. But anyway, here we go:

    Beyond simple green cuts

    pluso3(X, Y, Z) :-
       % Part one: green cuts
       (  X == 0 -> !  % first-argument indexing
       ;  Z == 0 -> !  % 3rd-argument indexing, e.g. Jekejeke, ECLiPSe
       ;  Y == Z -> !
       ; var(Z), nonvar(Y), \+ unify_with_occurs_check(Z, Y) -> !, fail
       ; var(Z), nonvar(X), \+ unify_with_occurs_check(Z, X) -> !, fail
       ;  true
       ),
       % Part two: the original unifications
       X = 0,
       Y = Z,
       natural_number(Z).
    pluso3(s(X), Y, s(Z)) :- pluso3(X, Y, Z).
    

    Can you find a case where pluso3/3 does not terminate while there are finitely many answers?