Search code examples
prologcontext-free-grammardcg

Have Prolog return a rule that applies


I'm interested in incrementally parsing the sequence as the terms come in one at a time. This requires that I'm able to determine which of the rewrite rules can consume the current input. So, conceptually I am looking for something like the following.

For the grammar:

abc1 --> [a, b, c]. 
ab --> [a, b].
abc2 --> ab, [c].

In a similar way that we specify arguments that satisfy a goal as variables, I would like to specify the functor itself as a variable e.g., T.

Specifically, I want the functor T([a,b,c]) to return: T = abc1. But then ultimately something recursive like: [abc1, abc2(ab)]

I was thinking to create a top-level rule, which would in its body enumerate all of the grammar's rule heads and perhaps somehow return the trees T that can be constructed by those rules that match the input but this enumeration seems silly since Prolog enumerates the rule heads under the hood anyway and I'm still unsure if I can make this solution work.

Other than that I thought of the lambda package for prolog, but after a glance it doesn't seem to achieve what I need.

Update:

I was pointed in the direction of representing a functor as a list e.g., using =../2 but it seemed that such a solution requires that I explicitly instantiate T to the functor to be called as a list. Whereas, for me the point is to have prolog find that functor by attempting to unify the variable T with available rules until some of them resolve to true with the arguments [a, b, c], [] I supplied.

In other words, I need call(T, [a, b, c],[]). to work but it says T is not sufficiently instantiated.

Update 2:

The solution I'm currently going with is to query all available functors via current_functor to see which ones are satisfied by the input [a, b, c], []. Once I have them I just need that they self-identify e.g., by passing their name up to the arg from the caller, which also seems feasible. Then I'll basically have the names of the rules that matched the input.


Solution

  • I ended up explicitly enumerating all of my grammar's self-identifying rule heads in one top-level rule:

    parse(RuleSubtree, Input, []).
    

    where

    parse(T) -->
        rule1(T);
        rule2(T);
    

    ...

        ruleN(T).
    

    and each rule is self-identified as follows

    ruleN(ruleN(SubRule1, SubRule2, ... SubRuleN)) --> 
        ruleX(SubRule1), ruleY(SubRule2), ... ruleZ(SubRuleN).
    

    which gives me the rule structure that matched the input, in the variable T.

    Maybe a little hardcoded but it works for my purpose and allows me to advance with my experiment, to which there are more serious challenges on the way.