Prolog - DCG parser with input from file


As part of a project I need to write a parser that can read a file and parse it into facts I can use in my program.

The file structure looks as follows:

property = { el1 , el2 , ... }.  

What I want in the end is:

property(el1).
property(el2).
...

I read my file like this:

main :-
       open('myFile.txt', read, Str),
       read_file(Str,Lines),
       close(Str),
       write(Lines), nl.

read_file(Stream,[]) :-
                       at_end_of_stream(Stream).

read_file(Stream,[X|L]) :-
                          \+ at_end_of_stream(Stream),
                          read(Stream,X),
                          parse(X),            % Here I call upon my parser.
                          read_file(Stream,L).

Now I have read in several books and online about DCG, but they all explain the same simple examples where you can generate sentences like "the cat eats the bat" etc... When I want to use it for the above example I fail miserably.

What I did manage was "parsing" the following line:

property = el1.

to

property(el1).

with this:

parse(X) :-
           X =.. List,    % Reason I do this is because X is one atom and not a list.
           phrase(sentence(Statement), List),
           asserta(Statement).

sentence(Statement) --> ['=', Gender, Person] , { Statement =.. [Gender, Person] }.

I don't even know if I'm using the DCG correctly here, so any help on this would be appreciated. The problem I'm having now is how to do this with multiple elements in my list, and how to handle '{' and '}'.
What I really want is a DCG that can handle these types of sentences (with more than 2 elements): [image: sentence split in parts]

Now I know many people around here refer to the libraries dcg_basics and pio when it comes to dcgs. However, I have an additional problem that when I try to use the library I receive the error:

ERROR: (c:/users/ldevriendt/documents/prolog/file3.pl:3):
      Type error: `text' expected, found `http/dcg_basics'
Warning: (c:/users/ldevriendt/documents/prolog/file3.pl:3):
      Goal (directive) failed: user:[library(http/dcg_basics)]

when I do this:

:- [library(http/dcg_basics)].

Additional info:

Any help on this would be appreciated!

EDIT: The aim of this question is to learn more about DCG and its use in parsers.


Solution

  • As long as your file is in plain Prolog syntax, you're advised to use Prolog term I/O: fully structured terms are read with a single call. Using a DCG is far more complicated, and probably a bit less efficient (I haven't measured, but read(Term) invokes a Prolog parser implemented in C). See this other question, which uses the very same format (at least you could check whether someone else already got an answer here on SO for the same assignment).
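
    The term I/O route can be sketched as follows. This is only an illustration, not code from the question: load_facts/1, read_statements/1, statement_facts/2 and curly_elements/2 are hypothetical helper names, and it assumes every clause in the file has the shape Name = { E1, E2, ... }. with an atom as Name.

    ```prolog
    % Sketch, assuming each clause in the file looks like:  Name = { E1, E2, ... }.
    % All predicate names below are hypothetical helpers, not library predicates.

    % load_facts(+File): read every clause and assert one Name(E) fact per element.
    load_facts(File) :-
        setup_call_cleanup(
            open(File, read, Stream),
            read_statements(Stream),
            close(Stream)).

    read_statements(Stream) :-
        read(Stream, Term),
        (   Term == end_of_file
        ->  true
        ;   statement_facts(Term, Facts),
            maplist(assertz, Facts),
            read_statements(Stream)
        ).

    % read/1 returns  Name = {a, b, c}  as  =(Name, '{}'((a, (b, c)))).
    statement_facts(Name = {Elements}, Facts) :-
        atom(Name),
        curly_elements(Elements, Es),
        findall(Fact, ( member(E, Es), Fact =.. [Name, E] ), Facts).

    % Flatten the comma term found inside the braces into a list.
    curly_elements(Term, [E|Es]) :-
        nonvar(Term),
        Term = (E, Rest),
        !,
        curly_elements(Rest, Es).
    curly_elements(E, [E]).
    ```

    With this, statement_facts(property = {el1, el2}, Fs) gives the list of facts to assert, so no string-level parsing is needed for input that is already valid Prolog syntax.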

    edit after comments...

    You're right that DCGs are the right way to handle general parsing in Prolog. Arguments in DCG productions can be seen as semantic attributes, so programming with DCGs can be seen as providing a working semantic analysis of the input (see Attribute Grammar, an important technique in language engineering as well).

    And indeed the presented examples can perfectly well be solved without the hacks required with term IO.

    Here it is:

    :- use_module(library(pio)).  % autoload(ed), added just for easy browsing
    :- use_module(library(dcg/basics)).
    
    property(P) -->
        b, "my props", b, "=", b, "{", elS(Es) , b, "}", b,
        { P =.. [property|Es] }.
    
    elS([E|Es]) --> el(E), b, ("," -> elS(Es) ; {Es = []}).
    el(N) --> number(N).
    el(S) --> csym(S). % after Jeremy Knees comment...
    b --> blanks.
    
    %   parse a C symbol
    csym(S) -->
        [F], { code_type(F, csymf) },
        csym1(Cs),
        !, { atom_codes(S, [F|Cs]) }.
    
    csym1([C|Cs]) -->
        [C], { code_type(C, csym) },
        csym1(Cs).
    csym1([]) --> [].
    

    with that, we have

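    % In SWI-Prolog 7 and later, write the input in backquotes (`my props = {1,2,3}`),
    % because "..." is a string object there, not a code list.
    ?- phrase(property(P), "my props = {1,2,3}").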
    P = property(1, 2, 3).
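
    For the exact format in the question (Name = { el1 , el2 , ... }. producing one Name(E) fact per element), a variant of the same idea might look like the sketch below. It is an adaptation, not the grammar above: statement//1, elements//1, element//1 and ident//1 are hypothetical names, and it assumes the name and the symbolic elements are C-style identifiers.

    ```prolog
    :- use_module(library(dcg/basics)).   % blanks//0, number//1

    % Sketch: parse  Name = { E1 , E2 , ... }.  into a list of Name(E) facts.
    % All nonterminal names here are hypothetical.
    statement(Facts) -->
        blanks, ident(Name), blanks, "=", blanks,
        "{", elements(Es), "}", blanks, ".", blanks,
        { findall(Fact, ( member(E, Es), Fact =.. [Name, E] ), Facts) }.

    % One or more comma-separated elements, with blanks allowed around each.
    elements([E|Es]) -->
        blanks, element(E), blanks,
        (   "," -> elements(Es) ; { Es = [] } ).

    element(N) --> number(N), !.
    element(A) --> ident(A).

    % An identifier: a C-symbol start character followed by C-symbol characters.
    ident(Atom) -->
        [C], { code_type(C, csymf) },
        ident_rest(Cs),
        { atom_codes(Atom, [C|Cs]) }.

    ident_rest([C|Cs]) --> [C], { code_type(C, csym) }, !, ident_rest(Cs).
    ident_rest([]) --> [].
    ```

    A caller could then run atom_codes('property = { el1 , el2 }.', Cs), phrase(statement(Fs), Cs), maplist(assertz, Fs) to get the property/1 facts the question asks for.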
    

    Thanks to library(pio) (pure I/O) we can apply the same grammar directly to a file with phrase_from_file/2, and get the same behaviour as with phrase/2.
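
    A minimal sketch of parsing straight from a file, assuming SWI-Prolog: props//1 and ints//1 are a cut-down, hypothetical variant of the grammar above (integer elements only), and write_sample/1 and parse_props_file/2 are hypothetical helpers that exist only to make the example self-contained.

    ```prolog
    :- use_module(library(pio)).          % phrase_from_file/2
    :- use_module(library(dcg/basics)).   % blanks//0, integer//1

    % Cut-down variant of the grammar, restricted to integer elements.
    props(Ns) -->
        blanks, "my props", blanks, "=", blanks,
        "{", ints(Ns), "}", blanks.

    ints([N|Ns]) -->
        blanks, integer(N), blanks,
        (   "," -> ints(Ns) ; { Ns = [] } ).

    % Write a one-line sample file so the example can run on its own.
    write_sample(File) :-
        setup_call_cleanup(
            open(File, write, Stream),
            format(Stream, "my props = {1, 2, 3}~n", []),
            close(Stream)).

    % Apply the grammar to the whole content of File.
    parse_props_file(File, Ns) :-
        phrase_from_file(props(Ns), File).
    ```

    The file is read lazily while the grammar runs, so the grammar itself does not change between phrase/2 and phrase_from_file/2.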

    more

    This other answer shows a practical way to implement an expression calculator with operator resolution and lazy evaluation.