Tags: prolog, context-free-grammar, dcg

Checking if a string is contained in a language (Prolog)


This is the CFG:

S -> T | V
T -> UU
U -> aUb | ab
V -> aVb | aWb
W -> bWa | ba

so it should accept the language:

{a^n b^n a^m b^m | n,m >= 1} U {a^n b^m a^m b^n | n,m >= 1}

And here is the code I'm working with:

in_lang([]).  
in_lang(L) :-
    mapS(L), !.

mapS(L) :-
    mapT(L) ; mapV(L),!.

mapT(L) :-
    append(L1, mapU(L), L), mapU(L1), !.

mapU([a|T]) :-
    ((append(L1,[b],T), mapU(L1)) ; (T = b)),!.

mapV([a|T]) :-
    ((append(L1,[b],T), mapV(L1)) ; 
     (append(L1,[b],T), mapW(L1))),
    !.

mapW([b|T]) :-
    ((append(L1,[a],T), mapW(L1)) ;
     (T = a)),
    !.

As of right now, this is returning false for the following three strings:

[a,a,b,b,a,b] // this should be true
[a,a,a,b,b,a,a,b,b,b] // this should be true as well
[a,a,a,b,b,a,b,b,b] // this one IS false

Any help or insight would be greatly appreciated. I'm not too comfortable with Prolog, so debugging this by myself has been a challenge.


Solution

  • First, note that this code doesn't make sense:

    ... append(L1, mapU(L), L) ...
    

    Prolog has predicates, not functions: the mapU(L) inside append/3 is just a compound term passed as data, and it is never called as a goal.

    Each nonterminal of a CFG should 'eat' some tokens from the front of the input. In Prolog this means the predicate needs at least 2 arguments: the input token list, and what remains after the production has successfully matched its part of the input.

    So append/3 is not required: plain pattern matching, performed by the unification operator (=)/2, is enough:

    mapS(L1, L) :- mapT(L1,L) ; mapV(L1,L).
    mapT(L1, L) :- mapU(L1,L2), mapU(L2,L).
    mapU(L1, L) :- L1=[a|L2], mapU(L2,L3), L3=[b|L] ; L1=[a,b|L].
    ... complete the translation
    

    and then call it:

    ?- mapS([a,a,b,b,a,b],R).
    R = [] ;
    false.
    

    R = [] means the entire sequence has been matched, with no tokens left over. A non-empty R would mean that only a prefix of the input was matched.
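
One possible way to complete the translation is sketched here. The mapV/2 and mapW/2 clauses are not part of the original answer; they simply follow the same pattern as mapU/2. The in_lang/1 wrapper is likewise an assumption about how the question's entry point could be adapted. It expects a fully instantiated list.

```prolog
% Each predicate takes the input list and the remainder left after a match.

% S -> T | V
mapS(L1, L) :- mapT(L1, L) ; mapV(L1, L).

% T -> U U
mapT(L1, L) :- mapU(L1, L2), mapU(L2, L).

% U -> a U b | a b
mapU(L1, L) :- L1 = [a|L2], mapU(L2, L3), L3 = [b|L] ; L1 = [a,b|L].

% V -> a V b | a W b   (assumed completion, same pattern as mapU/2)
mapV(L1, L) :- L1 = [a|L2], mapV(L2, L3), L3 = [b|L]
             ; L1 = [a|L2], mapW(L2, L3), L3 = [b|L].

% W -> b W a | b a     (assumed completion, same pattern as mapU/2)
mapW(L1, L) :- L1 = [b|L2], mapW(L2, L3), L3 = [a|L] ; L1 = [b,a|L].

% Hypothetical wrapper: a string is in the language only if S consumes
% all of it, i.e. the remainder is the empty list.
in_lang(L) :- mapS(L, []), !.
```

With these definitions, the three strings from the question should give true, true and false respectively. Note that in_lang([]) fails here, which agrees with the grammar, since both n and m must be at least 1.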