
Calling facts from the database in Prolog


I've inserted a given context-free grammar into the database using assert(....). Suppose the grammar is something like

S-->a,S,b
S-->c

This grammar is inserted into the database. I have to write a DCG that generates sentences for the CFG stored in the database. For example, if I define the DCG as myDcg('S',str), the non-terminal 'S' should be called or substituted by aSb, or c|d, or so.

The problem is: how can I call/substitute 'S' with facts from the database each time a non-terminal ('S') is encountered, in order to generate sentences?

I hope the question is clear; if not, I will try to edit it.


Below (sample code) is exactly what I want to do. This is not a DCG.

myGrammar([], []):-!.

myGrammar([T|Rest], [T|Sentence]):-
          myGrammar(Rest, Sentence).

myGrammar([NT|Rest], Sentence):-
          grammar(NT, Rest1),
          append(Rest1,Rest, NewRest),
          myGrammar(NewRest, Sentence). 

Whenever a terminal is encountered it should be printed out, and when a non-terminal is encountered it will backtrack.


Solution

  • In your predicate myGrammar/2 there is a list of non-terminals and terminals in the first argument and a list of terminals in the second. It should probably succeed if the second argument is of the form of the first. So what you have here is essentially a meta-interpreter for DCGs. A few suggestions:

    Your tokenizer currently produces [grammar('S',[a,'S',b]),grammar('S',[....]),..]. Let it produce [grammar('S',[t(a),nt('S'),t(b)]),grammar('S',[....]),..] instead. In this manner it's evident what is a terminal and what is a non-terminal. And, oh, remove that !.

    myGrammar([], []).
    myGrammar([t(T)|Rest], [T|Sentence]):-
       myGrammar(Rest, Sentence).
    myGrammar([nt(NT)|Rest], Sentence):-
       grammar(NT, Rest1),
       append(Rest1,Rest, NewRest),
       myGrammar(NewRest, Sentence).
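
    To try the interpreter, here is a minimal sketch for the grammar from the question. The helper names load_example_grammar/0 and sentences_up_to/2 are made up for this illustration, and the grammar/2 facts are assumed to be stored with assertz/1 in the tagged form suggested above. With the recursive rule stored first, an unbounded query would keep expanding 'S' without ever producing an answer, so the sketch fixes the length of the sentence before interpreting.

    ```prolog
    :- use_module(library(lists)).   % append/3
    :- dynamic grammar/2.

    % Hypothetical helper: store  S --> a,S,b  and  S --> c  as tagged facts.
    load_example_grammar :-
        retractall(grammar(_, _)),
        assertz(grammar('S', [t(a), nt('S'), t(b)])),
        assertz(grammar('S', [t(c)])).

    % The interpreter from the answer, unchanged.
    myGrammar([], []).
    myGrammar([t(T)|Rest], [T|Sentence]) :-
        myGrammar(Rest, Sentence).
    myGrammar([nt(NT)|Rest], Sentence) :-
        grammar(NT, Rest1),
        append(Rest1, Rest, NewRest),
        myGrammar(NewRest, Sentence).

    % Hypothetical helper: all sentences of 'S' with at most Max terminals.
    % Fixing the length first keeps each call to myGrammar/2 finite.
    sentences_up_to(Max, Sentences) :-
        findall(S,
                ( between(0, Max, N),
                  length(S, N),
                  myGrammar([nt('S')], S) ),
                Sentences).

    % ?- load_example_grammar, sentences_up_to(5, Ss).
    % Ss = [[c], [a, c, b], [a, a, c, b, b]].
    ```

    The same predicate also works as a recognizer: after loading the grammar, myGrammar([nt('S')], [a,c,b]) succeeds and myGrammar([nt('S')], [a,c]) fails.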
    

    DCGs, btw, are a bit more general than this interpreter.
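
    For comparison, this is roughly how the same grammar might look when written directly as a DCG. The non-terminal names s//0 and s_depth//1 are chosen for this sketch only. The second version illustrates one thing the list-based interpreter does not cover: non-terminals carrying arguments and {}/1 goals, used here to count the nesting depth.

    ```prolog
    % The grammar  S --> a,S,b | c  as a plain DCG.
    s --> [a], s, [b].
    s --> [c].

    % Same language, with an extra argument for the number of a...b pairs.
    s_depth(0) --> [c].
    s_depth(D) --> [a], s_depth(D0), [b], { D is D0 + 1 }.

    % ?- phrase(s, [a,c,b]).
    % true.
    %
    % ?- phrase(s_depth(D), [a,a,c,b,b]).
    % D = 2.
    ```

    As with the interpreter, generating sentences is safest with a length fixed in advance, for example length(S, 3), phrase(s, S).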

    The actual classification between non-terminals and terminals has to be done by the tokenizer.

    uppercasecode(C) :-
       between(0'A,0'Z,C).
    
    lowercasecode(C) :-
       between(0'a,0'z,C).
    

    If you are using chars (one-character atoms), you will use char_code(Char, Code) to convert between them.
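
    One way the tokenizer could apply this classification is sketched below. It assumes each grammar symbol arrives as an atom and is classified by its first character code; tag_symbol/2, tag_body/2 and uppercasechar/1 are hypothetical helper names, not part of any library.

    ```prolog
    :- use_module(library(apply)).   % maplist/3

    uppercasecode(C) :-
        between(0'A, 0'Z, C).

    lowercasecode(C) :-
        between(0'a, 0'z, C).

    % Hypothetical helper: wrap an atom as nt(Atom) or t(Atom),
    % depending on whether its first code is upper or lower case (ASCII only).
    tag_symbol(Atom, nt(Atom)) :-
        atom_codes(Atom, [C|_]),
        uppercasecode(C).
    tag_symbol(Atom, t(Atom)) :-
        atom_codes(Atom, [C|_]),
        lowercasecode(C).

    % Hypothetical helper: tag every symbol of a rule body.
    tag_body(Symbols, Tagged) :-
        maplist(tag_symbol, Symbols, Tagged).

    % The same test for a char (one-character atom), via char_code/2.
    uppercasechar(Char) :-
        char_code(Char, Code),
        uppercasecode(Code).

    % ?- tag_body([a, 'S', b], T).
    % T = [t(a), nt('S'), t(b)].
    ```

    The two clauses of tag_symbol/2 are mutually exclusive because the code ranges do not overlap, so no cut is needed.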

    Full Unicode support is still in its infancy. It's very tricky because of all those special cases for characters like Ⓐ, which is upper case but still cannot be part of an identifier. But here is how you can do it in SWI currently.

    uppercasecode(C) :-
       '$code_class'(C,upper),
       '$code_class'(C,id_start).
    
    lowercasecode(C) :-
       '$code_class'(C,id_start),
       '$code_class'(C,id_continue),
       \+ '$code_class'(C,upper).
    

    Update: In the meantime, there are char_type/2 and code_type/2 for this purpose.

    uppercasecode(C) :-
       code_type(C, upper),
       code_type(C, prolog_var_start).
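
    A lowercase counterpart could be written along the same lines. This is a sketch that assumes SWI-Prolog, where code_type/2 offers the class prolog_atom_start for codes that may start an unquoted atom.

    ```prolog
    % Sketch (SWI-Prolog assumed): a code that can start an unquoted atom.
    lowercasecode(C) :-
        code_type(C, prolog_atom_start).

    % ?- lowercasecode(0's).
    % true.
    %
    % ?- lowercasecode(0'S).
    % false.
    ```

    Note that the underscore also counts as prolog_var_start, which is why the uppercase test combines that class with upper.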