A DCG that matches the rest of the input


This predicate does what it should, namely collect whatever is left of the input when it is called as part of a DCG:

rest([H|T], [H|T], []).
rest([], [], []).

But I am struggling to define this as a DCG rule. Is it doable at all?

This of course is not the same (although it does the same when used in the same manner):

rest([H|T]) --> [H], !, rest(T).
rest([]) --> [].

The reason I think I need this is that the rest//1 is part of a set of DCG rules that I need to parse the input. I could do phrase(foo(T), Input, Rest), but then I would have to call another phrase(bar(T1), Rest).

Say I know that all I have left on input is a string of digits that I want as an integer:

phrase(stuff_n(Stuff, N), `some other stuff, 1324`).

stuff_n(Stuff, N) -->
    stuff(Stuff),
    rest(Rest),
    {   number_codes(N, Rest),
        integer(N)
    }.
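
For illustration, here is one way rest//1 might be written as a DCG rule. It is only a sketch, and it assumes a system that supports call//N in DCG bodies (SWI-Prolog does). The helper rest_/3 and the definition of stuff//1 are hypothetical stand-ins that do not come from the question.

```prolog
:- use_module(library(dcg/basics)).

% rest//1: unify Rest with everything left on input and consume it.
% call//2 appends the two hidden DCG arguments, so rest_/3 is called
% as rest_(Rest, S0, S), which forces S0 = Rest and S = [].
rest(Rest) --> call(rest_, Rest).

rest_(Rest, Rest, []).

% Hypothetical stand-in for stuff//1: everything up to a ", " separator.
% 0', is the character code of the comma.
stuff(Stuff) --> string_without([0',], Stuff), ", ".

stuff_n(Stuff, N) -->
    stuff(Stuff),
    rest(Rest),
    {   number_codes(N, Rest),
        integer(N)
    }.
```

With this sketch, calling phrase(stuff_n(Stuff, N), Codes) on the codes of `some other stuff, 1324` should bind N to 1324 and Stuff to the codes before the comma.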

Solution

  • Answering my own silly question:

    @CapelliC gave a solution that works (+1). It does something I don't understand :-(, but the real issue was that I did not understand the problem I was trying to solve. The real problem was:

    Problem

    You have as input a code list that you need to parse. The result should be a term. You know quite close to the beginning of this list of codes what the rest looks like. In other words, it begins with a "keyword" that defines the contents. In some cases, after some point in the input, the rest of the contents do not need to be parsed: instead, they are collected in the resulting term as a code list.

    Solution

    One possible solution is to break up the parsing into two calls to phrase/3 (because there is no reason not to?):

    1. Read the keyword (first call to phrase/3) and make it an atom;
    2. Look up in a table what the rest is supposed to look like;
    3. Parse only what needs to be parsed (second call to phrase/3).

    Code

    So, using an approach from (O'Keefe 1990) and taking advantage of library(dcg/basics) available in SWI-Prolog, with a file rest.pl:

    :- use_module(library(dcg/basics)).
    
    codes_term(Codes, Term) :-
        phrase(dcg_basics:nonblanks(Word), Codes, Codes_rest),
        atom_codes(Keyword, Word),
        kw(Keyword, Content, Rest, Term),
        phrase(items(Content), Codes_rest, Rest).
    
    kw(foo, [space, integer(N), space, integer(M)], [], foo(N, M)).
    kw(bar, [], Text, bar(Text)).
    kw(baz, [space, integer(N), space], Rest, baz(N, Rest)).
    
    items([I|Is]) -->
        item(I),
        items(Is).
    items([]) --> [].
    
    item(space) --> " ".
    item(integer(N)) --> dcg_basics:integer(N).
    

    It is important that here, the "rest" does not need to be handled by a DCG rule at all.
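
The point about the "rest" relies on the third argument of phrase/3, which is unified with whatever input the grammar body did not consume. A minimal sketch, where the predicate name leading_integer/3 is made up for this illustration:

```prolog
:- use_module(library(dcg/basics)).

% leading_integer(+Codes, -N, -Rest)
% Parse an integer at the front of Codes; Rest is the unparsed remainder,
% handed back by phrase/3 without any DCG rule describing it.
leading_integer(Codes, N, Rest) :-
    phrase(integer(N), Codes, Rest).
```

For the codes of `22 7` this should give N = 22 and leave the codes of ` 7` in Rest.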

    Example use

    This solution is nice because it is deterministic and very easy to extend: just add clauses to the kw/4 table and item//1 rules. (Note the --traditional flag when starting SWI-Prolog: it makes double-quoted text read as code lists, which SWI-Prolog 7 and later would otherwise read as strings.)

    $ swipl --traditional --quiet
    ?- [rest].
    true.
    
    ?- codes_term("foo 22 7", T).
    T = foo(22, 7).
    
    ?- codes_term("bar 22 7", T).
    T = bar([32, 50, 50, 32, 55]).
    
    ?- codes_term("baz 22 7", T).
    T = baz(22, [55]).
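
As a sketch of how the table might be extended, the listing below repeats rest.pl and adds one clause to kw/4 and one item//1 rule. The keyword qux and the item word(W) are hypothetical and exist only for this illustration.

```prolog
:- use_module(library(dcg/basics)).

codes_term(Codes, Term) :-
    phrase(dcg_basics:nonblanks(Word), Codes, Codes_rest),
    atom_codes(Keyword, Word),
    kw(Keyword, Content, Rest, Term),
    phrase(items(Content), Codes_rest, Rest).

kw(foo, [space, integer(N), space, integer(M)], [], foo(N, M)).
kw(bar, [], Text, bar(Text)).
kw(baz, [space, integer(N), space], Rest, baz(N, Rest)).
% Hypothetical new keyword: one word as an atom, then the unparsed rest.
kw(qux, [space, word(W), space], Rest, qux(W, Rest)).

items([I|Is]) -->
    item(I),
    items(Is).
items([]) --> [].

item(space) --> " ".
item(integer(N)) --> dcg_basics:integer(N).
% Hypothetical new item: a non-empty run of non-blank codes, as an atom.
item(word(W)) -->
    nonblanks(Cs),
    { Cs \== [], atom_codes(W, Cs) }.
```

Under these assumptions, the codes of `qux hello rest of it` should yield a term qux(hello, Rest), with Rest holding the codes of `rest of it`. The original foo, bar and baz clauses are unchanged.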