
Parsing consecutive non-terminals


I know I am supposed to have T -> UU, where the first parse_U parses only "aabb" and the second parse_U parses only the final "ab", but I cannot figure out how to do this with append. I can only retrieve a sub-list that starts with a and ends with b, which is not the result I want.

Any help would be greatly appreciated.


Solution

  • For parsing in Prolog, I suggest using a DCG (Definite Clause Grammar) where available.

    If I'm not wrong, your grammar could simply become

    isS --> isT.
    isS --> isV.
    
    isT --> isU, isU.
    
    isU --> [a], isU, [b].
    isU --> [a, b].
    
    isV --> [a], isV, [b].
    isV --> [a], isW, [b].
    
    isW --> [b], isW, [a].
    isW --> [b, a].
    

    and can be used by calling isS(L, []), where L is the list holding the sequence to parse.

    Calling

    isS([a,a,b,b,a,b], [])
    

    you should obtain true.
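
    To see how the two consecutive isU calls in isT share the input, here is a minimal sketch using phrase/3, which exposes whatever is left over after a non-terminal has matched. Only the T and U rules are repeated; after_first_u/2 is a hypothetical helper name introduced for illustration, and the sketch assumes a system with standard DCG support (SWI-Prolog, for instance).

    ```prolog
    % Only the T and U rules from the grammar above.
    isT --> isU, isU.

    isU --> [a], isU, [b].
    isU --> [a, b].

    % Hypothetical helper (not part of the original answer):
    % Rest is what remains of Tokens after one isU has consumed a prefix.
    after_first_u(Tokens, Rest) :-
        phrase(isU, Tokens, Rest).
    ```

    Querying after_first_u([a,a,b,b,a,b], Rest) should bind Rest to [a,b]: the first isU takes "aabb" and hands the remaining "ab" to the second isU, so no explicit splitting is needed.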

    --- EDIT ---

    this is homework and we are not allowed to use "-->"

    There is nothing special about DCG syntax (the use of -->); it is only a simplification of the usual clause syntax.
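
    As a rough illustration of that point, you can ask the system to show the ordinary clause it builds from a DCG rule. The sketch below assumes SWI-Prolog, where expand_term/2 performs the DCG translation; translated_t_rule/1 is a hypothetical helper name, and other systems may print the result differently.

    ```prolog
    % Hypothetical helper: Clause is the ordinary clause generated
    % for the DCG rule  isT --> isU, isU.
    translated_t_rule(Clause) :-
        expand_term((isT --> isU, isU), Clause).
    ```

    In SWI-Prolog, the answer to translated_t_rule(C) should have the shape isT(A, B) :- isU(A, M), isU(M, B), with system-chosen variable names: each non-terminal gains two list arguments, the input and what is left of it.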

    If I'm not wrong, you can write the DCG above as follows (caution: underscores added to the rule names)

    is_S(Lin, Lout) :- is_T(Lin, Lout).
    is_S(Lin, Lout) :- is_V(Lin, Lout).
    
    is_T(Lin, Lout) :- is_U(Lin, Lmid), is_U(Lmid, Lout).
    
    is_U([a | Tin], Lout)      :- is_U(Tin, [b | Lout]).
    is_U([a, b | Lout], Lout).
    
    is_V([a | Tin], Lout)      :- is_V(Tin, [b | Lout]).
    is_V([a | Tin], Lout)      :- is_W(Tin, [b | Lout]).
    
    is_W([b | Tin], Lout)      :- is_W(Tin, [a | Lout]).
    is_W([b, a | Lout], Lout).
    

    Calling

    is_S([a,a,b,b,a,b], [])
    

    you should obtain true.
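
    Since the question mentions append, here is a sketch of the same T -> UU idea written with append/3 instead of two list arguments. It is only one possible approach: the one-argument signatures of parse_U/1 and parse_T/1 are assumptions (the original predicates are not shown), and the code assumes SWI-Prolog's library(lists).

    ```prolog
    :- use_module(library(lists)).   % append/3

    % parse_U(+List): List matches U -> a U b | a b.
    % (Assumed signature; the asker's own parse_U is not shown.)
    parse_U([a, b]).
    parse_U([a | Rest]) :-
        append(Middle, [b], Rest),   % strip the final b
        parse_U(Middle).

    % parse_T(+List): T -> U U.
    % append/3 proposes every split point on backtracking;
    % only a split where both halves are a U succeeds.
    parse_T(List) :-
        append(Front, Back, List),
        parse_U(Front),
        parse_U(Back).
    ```

    With this sketch, parse_T([a,a,b,b,a,b]) should succeed through the split [a,a,b,b] and [a,b]. It tries more split points than the two-argument version above, which passes the leftover input along directly.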