Search code examples
prologdcgimplicit-state-passingdcg-semicontext

Applying semicontext for passing additional arguments


This is a follow-on question from an earlier question from Mat's answer

Starting with this

e([number(0)]           , t1        , Uc0    , Uc0, Bc0    , Bc0) --> [].
e([number(1)]           , t2        , Uc0    , Uc0, Bc0    , Bc0) --> [].
e([number(2)]           , t3        , Uc0    , Uc0, Bc0    , Bc0) --> [].

e([op(neg),[Arg]]       , u1(E)     , [_|Uc0], Uc1, Bc0    , Bc1) --> 
    [_],   
    e(Arg , E , Uc0, Uc1, Bc0, Bc1).
e([op(ln),[Arg]]        , u2(E)     , [_|Uc0], Uc1, Bc0    , Bc1) --> 
    [_],   
    e(Arg , E , Uc0, Uc1, Bc0, Bc1).

e([op(add),[Left,Right]], b1(E0,E1) , Uc0    , Uc2, [_|Bc0], Bc2) --> 
    [_,_], 
    e(Left, E0, Uc0, Uc1, Bc0, Bc1), 
    e(Right, E1, Uc1, Uc2, Bc1, Bc2).
e([op(sub),[Left,Right]], b2(E0,E1) , Uc0    , Uc2, [_|Bc0], Bc2) --> 
    [_,_], 
    e(Left, E0, Uc0, Uc1, Bc0, Bc1), 
    e(Right, E1, Uc1, Uc2, Bc1, Bc2).

e(U,B,EL,Es) :-
    length(UL, U),
    length(BL, B),
    phrase(e(EL,Es,UL,[],BL,[]), _).

e(N,EL,Es) :-
    length([_|Ls], N),
    phrase(e(EL,Es,_,[],_,[]),Ls).

e_count(E, Count) :-
    length([_|Ls], E),
    findall(., phrase(e(_,_,_,[],_,[]), Ls), Sols),
    length(Sols, Count).

then reading

For one variable, use a list with a single element that contains just that variable. If you want to pass around more than one variable, use a list that contains a single term of the form f(...), capturing all variables you want to pass around. This is also well worth its own question.

evolved into this

e( f([number(0)], t1, Uc0, Uc0, Bc0, Bc0) ) --> [].
e( f([number(1)], t2, Uc0, Uc0, Bc0, Bc0) ) --> [].
e( f([number(2)], t3, Uc0, Uc0, Bc0, Bc0) ) --> [].

e( f([op(neg), [Arg]], u1(E), [_|Uc0], Uc1, Bc0, Bc1) ) --> 
    [_], 
    e( f(Arg, E, Uc0, Uc1, Bc0, Bc1) ).
e( f([op(ln) , [Arg]], u2(E), [_|Uc0], Uc1, Bc0, Bc1) ) --> 
    [_], 
    e( f(Arg, E, Uc0, Uc1, Bc0, Bc1) ).

e( f([op(add), [Left,Right]], b1(E0,E1) ,Uc0, Uc2, [_|Bc0], Bc2) ) --> 
    [_,_], 
    e(f(Left, E0, Uc0, Uc1, Bc0, Bc1) ), 
    e(f(Right, E1, Uc1, Uc2, Bc1, Bc2) ).
e( f([op(sub), [Left,Right]], b2(E0,E1) ,Uc0, Uc2, [_|Bc0], Bc2) ) --> 
    [_,_],
    e(f(Left, E0, Uc0, Uc1, Bc0, Bc1) ),
    e(f(Right, E1, Uc1, Uc2, Bc1, Bc2) ).

e(U,B,EL,Es) :-
    length(UL, U),
    length(BL, B),
    phrase(e(f(EL,Es,UL,[],BL,[])), _).

e_N(N,EL,Es) :-
    length([_|Ls], N),
    phrase(e(f(EL,Es,_,[],_,[])),Ls).

e_count(E, Count) :-
    length([_|Ls], E),
    findall(., phrase(e(f(_,_,_,[],_,[])), Ls), Sols),
    length(Sols, Count).

which works but it is noted use a list that contains a single term of the form f(...), and here f(...) is not in a list.

Did I go wrong somewhere?

Supplement

References

Variable names convention with implicit state passing

Normally when using the variables are named

S0S1 →…→ S

However for my uniary-binary tree examples I name them like

S0S1 →…→ Sn

ending with Sn instead of S.

e.g.

standard

e(S0,S) :-

here

 e(S0,S2) :-

the reason being that this example has a rather rare property of each DCG predicate having increasing length,

e.g.

e([number(0)]           , t1        , Uc0    , Uc0, Bc0    , Bc0) -->  
e([op(neg),[Arg]]       , u1(E)     , [_|Uc0], Uc1, Bc0    , Bc1) -->   
e([op(add),[Left,Right]], b1(E0,E1) , Uc0    , Uc2, [_|Bc0], Bc2) -->  

so ending with xn gives me one more check on accuracy.

Reference: ISO/IEC DTR 13211–3:2006 Definite clause grammar rules

6.1.3 Variable names convention for terminal-sequences

This TR uses variables named S0, S1, ..., S to represent the terminal-sequences used as arguments when processing grammar rules or when expanding grammar rules into clauses. In this notation, the variables S0, S1, ..., S can be regarded as a sequence of states, with S0 representing the initial state and the variable S representing the final state. Thus, if the variable Si represents the terminal-sequence in a given state, the variable Si+1 will represent the remaining terminal-sequence after parsing Si with a grammar rule.


Solution

  • A DCG always describes a list.

    But which list? You have to decide how to dedicate the mechanism!

    In the above cases, it seems you have already dedicated DCGs to describe a list whose length you use as a measure for the depth of the search.

    That's completely OK, and a very natural use of DCGs.

    However, you can only describe one list, not two at the same time, at least not in the "primary" way. You can of course describe arbitrarily many terms at the same time via DCG arguments. However, the DCG body is limited to describing just one list, by invoking nonterminals, and using terminals.

    There is a straight-forward way to convert DCGs to regular Prolog code without DCGs, or to interpret DCG rules via regular Prolog. See for example the ISO draft for more information.

    For example, let us take just this snippet:

    e( f([op(neg), [Arg]], u1(E), [_|Uc0], Uc1, Bc0, Bc1) ) --> 
        [_], 
        e( f(Arg, E, Uc0, Uc1, Bc0, Bc1) ).
    e( f([op(ln) , [Arg]], u2(E), [_|Uc0], Uc1, Bc0, Bc1) ) --> 
        [_], 
        e( f(Arg, E, Uc0, Uc1, Bc0, Bc1) ).
    

    Let's get rid of the DCG syntax, and write it for example as:

    e( f([op(neg), [Arg]], u1(E), [_|Uc0], Uc1, Bc0, Bc1, [_|Rest0], Rest) ) :- 
        e( f(Arg, E, Uc0, Uc1, Bc0, Bc1, Rest0, Rest) ).
    e( f([op(ln) , [Arg]], u2(E), [_|Uc0], Uc1, Bc0, Bc1, [_|Rest0], Rest) ) :- 
        e( f(Arg, E, Uc0, Uc1, Bc0, Bc1, Rest0, Rest) ).
    

    (Note of course that you should not rely on any particular expansion method, but always use the phrase/2 interface to invoke a DCG. Still, the above is one way to perform such an expansion, obtaining regular Prolog code.)

    Switching arguments

    Suppose we want to go back to using DCG notation again, because it is so convenient. For example, we obviously need to think about fewer variables when using DCG notation, and that can in itself already be a huge advantage.

    So, we can of course go back to our initial DCG, using terminals instead of our last two new arguments that we introduced to describe a list difference.

    But we can also do something else!

    For example, in the snippet above, we note that Bc0 and Bc1 are only threaded through: None of the clauses really cares about these arguments. So, suppose we dedicate the DCG mechanism to describe these two arguments.

    This could for example look as follows:

    e( f([op(neg), [Arg]], u1(E), [_|Uc0], Uc1, [_|Rest0], Rest) ) -->
        e( f(Arg, E, Uc0, Uc1, Rest0, Rest) ).
    e( f([op(ln) , [Arg]], u2(E), [_|Uc0], Uc1, [_|Rest0], Rest) ) -->
        e( f(Arg, E, Uc0, Uc1, Rest0, Rest) ).
    

    Here, I have started from the regular Prolog version, introduced DCG notation, and simply turned Bc0 and Bc1 into the implicit arguments! They no longer appear at all here. Only if we again expand this back into Prolog do they become visible, or at least that's one way to do it.

    But, there are two problems that remain:

    First, what if we actually want to access these arguments? Certainly not all clauses only thread them through like this. We also need to access them somewhere. And second, there's an even more fundamental problem: We want to talk about a single argument, which can be any term, but DCGs always describe a list!

    So, how do we reconcile all this?

    Most importantly, we need to talk about lists, so the solution is simple: Let us talk about a list with a single element, i.e., the list [Bc0] and [Bc1]. This makes DCG notation applicable at all in this case.

    Next, how do we express the relation between Bc0 and Bc1 within the DCG? For this, we use semicontext notation: It lets as talk about elements that previously were not in the list we are describing.

    As noted in the DCG primer, a nonterminal of the following form will be useful:

    state(S0, S), [S] --> [S0].
    

    You can read the nonterminal state(S0, S) as: The current state is S0, and henceforth it is S.

    Thus, if you want to actually access Bc0 and relate it to Bc1 in one of your clauses, you could do it like this:

    e( f([op(neg), [Arg]], u1(E), [_|Uc0], Uc1, [_|Rest0], Rest) ) -->
        state(Bc0, Bc1),
        ... (something involving Bc0 and Bc1) ...
        e( f(Arg, E, Uc0, Uc1, Rest0, Rest) ).
    

    The major advantage is this: This notation lets you hide the additional arguments, and still lets you access them explicitly if you need it, using state//2 (or semicontext notation directly).

    This obviously becomes more and more attractive the less the arguments are actually used in your code. If almost all of your clauses access the arguments, there is no sense in hiding them. However, quite frequently, you will describe terms that are threaded through while only very few of your DCG clauses actually access them. In such cases, consider using DCG notation to pass them around implicitly.

    But there's still one problem: What to do if we want to pass around not only one term, but two or more? There's a very easy solution for this: Let us still pass around a list with a single element, but let that element simply be a compound term of the form f(Arg1,Arg2,...,Arg_n). Thus, your invocation of your nonterminal e//N might look like this:

    ?- phrase(e(Arg1,Arg2,...,Arg_N), [f(B1,B2,...,B_M)], [Result]).
    

    Here, B1, B2 etc. are the arguments that you would like to pass around implicitly, in the form of a list with a single element that combines all these arguments.

    Assuming this answers your question, a suitable title could be: "Applying semicontext notation for passing additional arguments". The actual question makes clear, and I think that is critical in this case, that you want to apply semicontext notation for passing additional arguments even though you have already dedicated the DCG notation to describe something else. To summarize, a solution for this is to first free the DCG notation so that you can use the described list for passing around additional arguments.

    Note that there are also other solutions: For example, you can devise your own custom notation, completely analogous to DCG notation, but expanded in such a way that it lets you describe two independent lists at the same time. I leave this as an exercise.