Search code examples
prologimplementationswi-prologclpfdlabeling

Understanding the implementation of label/5 in clpfd


I am trying to understand the implementation (I understand the usage) of the label/5 predicate in the clpfd library:

(copied from here)


1824label([], _, _, _, Consistency) :- !,
1825        (   Consistency = upto_in(I0,I) -> I0 = I
1826        ;   true
1827        ).
1828label(Vars, Selection, Order, Choice, Consistency) :-
1829        (   Vars = [V|Vs], nonvar(V) -> label(Vs, Selection, Order, Choice, Consistency)
1830        ;   select_var(Selection, Vars, Var, RVars),
1831            (   var(Var) ->
1832                (   Consistency = upto_in(I0,I), fd_get(Var, _, Ps), all_dead(Ps) ->
1833                    fd_size(Var, Size),
1834                    I1 is I0*Size,
1835                    label(RVars, Selection, Order, Choice, upto_in(I1,I))
1836                ;   Consistency = upto_in, fd_get(Var, _, Ps), all_dead(Ps) ->
1837                    label(RVars, Selection, Order, Choice, Consistency)
1838                ;   choice_order_variable(Choice, Order, Var, RVars, Vars, Selection, Consistency)
1839                )
1840            ;   label(RVars, Selection, Order, Choice, Consistency)
1841            )
1842        ).

especially the marked part (obviously the important part) confuses me:

  • I am not quite sure what fd_get (/3 or /5) does
  • all_dead using some kind of propagator (which I haven't looked into yet tbh)
  • How does this "enumerate" Var?

I am obviously missing some integral parts to the proving mechanism that probably need to be understood for this, so i am curious about some higher level explanation (with maybe a couple of resources to read on) about the behavior of this implementation.


Solution

  • Disclaimer: The following answer is only my educated guess after browsing through the code and playing a bit with it in SWISH.

    First of all, it appears that the entire marked part and the two lines below it (i.e. lines 1836-1837) relate to an undocumented option to labeling/2, which I'll call upto_in (after the name of the functor). I'll try to explain what I believe this option does.

    Normally, when labeling/2 is called, all the FD variables (in its second argument) are ground upon success. This is really what labeling does: assign variables one after the other until they are all assigned. For instance, in the following snippet, labeling/2 succeeds 6 times with both X and Y ground:

    ?- [X,Y] ins 1..4, X #< Y, labeling([],[X,Y]).
    X = 1, Y = 2 ;
    X = 1, Y = 3 ;
    X = 1, Y = 4 ;
    X = 2, Y = 3 ;
    X = 2, Y = 4 ;
    X = 3, Y = 4
    

    As a comparison, when labeling/2 is not used, then we can see the (reduced) domain of the two variables and the fact that a constraint is still pending.

    ?- [X,Y] ins 1..4, X #< Y.
    X in 1..3, X #=< Y + -1, Y in 2..4
    

    When a constraint is pending, it means that each combination of values for the variables (in this case X and Y) might or might not be a solution. Conversely, if no constraint is pending, then we know that every combination of values is a solution. In some applications, then one might then consider stopping labeling when we know for sure that all value combinations are solutions. And in a nutshell, this is what the option upto_in does: It tells to avoid labeling when no constraint is left pending. Back to our running example, this shows as:

    ?- [X,Y] ins 1..4, X #< Y, labeling([upto_in],[X,Y]).
    X = 1, Y in 2..4 ;
    X = 2, Y in 3..4 ;
    X = 3, Y = 4
    

    So the first solution means that for X = 1, Y can take all values from 2 to 4.

    Note that upto_in comes in two flavors, with the first one shown above, and the second one taking an argument, which represents the number of ground solutions contained in the produced answer. So:

    ?- [X,Y] ins 1..4, X #< Y, labeling([upto_in(K)],[X,Y]).
    X = 1, Y in 2..4, K = 3 ;
    X = 2, Y in 3..4, K = 2 ;
    X = 3, Y = 4, K = 1
    

    As another example, if one removes the only constraint, we see that the number of (ground) solution is the Cartesian product of the domains (and no actual labeling takes place).

    ?- [X,Y] ins 1..4, labeling([upto_in(K)],[X,Y]).
    K = 16, X in 1..4, Y in 1..4
    

    This second option may be useful for instance when one wants to count the number of solutions to a problem. Then upto_in/1 allows one to shortcut labeling of "irrelevant" variables for better performance, while keeping track of the count of actual solutions.

    Now, to answer more specific questions:

    • fd_get(V,D,Ps) "returns" for a variable V: its current domain D and a structure Ps with (3 lists of) propagators associated to it. A propagator is basically the internal implementation of a posted constraint, whose role is to remove impossible values from domains. In the examples above, Ps would contain (a representation of?) the propagator for the inequality constraint, and that propagator would have removed in all cases 1 from the domain of Y and 4 from the domain of X. fd_get/5 returns in addition the upper and lower bounds of the domain.
    • all_dead/1 seems to check whether all the propagators associated to a variable are "dead", meaning in this cases that all combinations of values in the domain of the variables appearing in them are solutions. (I say "seem" because this bit really touches to the internals of the clpfd library and I don't want to dig too far.)
    • It does not really enumerate Var. To explain what the marked code does in a sentence, I would write: "If the upto_in option is used and there is no constraint that might reduce the domain of Var any further, then skip enumerating Var (and multiply some accumulator by the size of its domain)."

    Hope this helps your understanding. And if someone more knowledgeable finds mistakes or holes in my explanation, feel free to fix them.