Search code examples
prologdcg

Calling phrase/2 on a list of grammar rules, rather than a list of atoms


With a grammar like this:

as --> [].
as --> [a], as.
b(b(a)) --> as.
c(X) --> b(X).

phrase(b(X), [a, a]) & phrase(c(X), [a, a]) work no problem. Both return X = b(a)..

But is it possible to unify something like this?

phrase(c(X), [b(a)]).
OR:
phrase(c(X), [b(b(a))]).

I've had no luck, though it seems like it should be possible. I think this means I have some misunderstanding about DCGs. I know they work through difference lists, and succeed when they can apply a series of rules to consume all the elements of the list. Is there something special about atoms or tokens as far as phrase/2 is concerned?

Here's what the trace looks like:

[trace] [4]  ?- phrase(c(X), [b(a)]).
   Call: (5,411) c(_57162, [b(a)], []) ? creep
   Call: (5,412) b(_57162, [b(a)], []) ? creep
   Call: (5,413) [b(a)]as[] ? creep
   Fail: (5,413) [b(a)]as[] ? creep
   Fail: (5,412) b(_57162, [b(a)], []) ? creep
   Fail: (5,411) c(_57162, [b(a)], []) ? creep
false.

The failure at [b(a)]as[] is interesting to me. What is being tested there? And... what does that syntax mean?

I know some people find it easier to answer a question when they understand a little more about what I'm actually trying to accomplish, so:

Occasionally in my application I'd like to rephrase inelegant sentences. For instance, sometimes I'll come across something like ["bake", "that" "many", "pie", "plus", "one"]. I'd like the DCG to treat this as though it'd encountered ["bake", "x", "pie"] instead. So I could use phrase(foo(X), ["x", "pie"]), and if that succeeds, then I might like to wrap it in another rule higher up in the grammar. (Why not just call the higher rule to start? Some of the higher rules are very slow in my grammar, so I'm trying to "fail quickly" by testing a lower-level grammar rule first.)

Thanks in advance for any advice!


Solution

  • Let's align the grammar rules to better show what they mean.

    Note that they are doing list processing and consume the elements within brackets if those unify with the prefix of the list. (conversely, they may generate the elements within brackets if your run the DCG "in reverse")

    as      --> [].           % consumes nothing
    as      --> [a], as.      % consumes an atom 'a', then calls rule as//0
    b(b(a)) -->      as.      % consumes nothing, then calls rule as//0
    c(X)    -->      b(X).    % consumes nothing, then calls rule b//1
    

    The arguments within parentheses are standard Prolog predicate arguments.

    When you call

    phrase(b(X), [a, a])
    

    This is a successful path through the tree of possibilities:

    • Use b(X) to consume a prefix of [a,a]:
      • X is unified with b(a) and as//0 is called, nothing is consumed.
    • Use the second clause of as to consume a prefix of [a,a]:
      • a is consumed and as//0 is called
    • Use the second clause of as to consume a prefix of [a]:
      • a is consumed and as//0 is called
    • Use the first clause of as to consume a prefix of []:
      • Succeeds without further rule calls

    Thus the phrase/2 call succeeds with X=b(a), but the b(a) has nothing nothing to do with the b and a of the rules.

    What about phrase(c(X), [b(a)]).?

    • Use c(X) to consume a prefix of [b(a)]:
      • Nothing is consumed, b(X) is called.
    • Use b(X) to consume a prefix of [b(a)]: -X is unified with b(a) (of the head, not of the list), nothing is consumed, as is called.
    • Use as to consume a prefix of [b(a)]:
      • The first clause applies as it consumes nothing, but going down that path eventually will result in failure as we used phrase/2, demanding that the list be empty on success (rather than phrase/3 where we could leave a Rest argument unbound).
      • The second clause does not apply as it consumes a but the list is [b(a)]

    Failure.

    Similarly for phrase(c(X), [b(b(a))]).

    The tracer output:

    Call: (5,413) [b(a)]as[] ? creep
    Fail: (5,413) [b(a)]as[] ? creep
    

    seems to indicate that, via the only rules that applies, namely the first:

    as      --> [].
    

    the actual list content [b(a)] must be equal to [] (because due to the phrase/2 call we want to have an empty "rest"). You may be seeing an effect of the optimizer.