Tags: parsing, prolog, bnf

How do I represent an alphanumeric string in DCG?


I am trying to write a parser for propositional calculus encoded as S-expressions.

I have made some progress:

expression --> op.
op --> ['('], bin_op, bool, bool, [')'].
op --> ['('], unary_op, bool, [')'].
bool --> tok.
bool --> op.

bin_op --> ["IFF"].
bin_op --> ["IF"].
bin_op --> ["XOR"].
unary_op --> ["NOT"].

tok --> ["a"].

In swipl, I get an appropriate response from calling phrase:

?- phrase(expression, Ls).
Ls = ['(', "IFF", "a", "a", ')'] 

However, this only works for the token "a". Is there a way to say "tok is any alphanumeric string" in DCG? I found this, but I'm unsure how to apply it to what I'm doing.


Solution

  • If you only need to parse (not generate), the following definition of tok will work:

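    % tok//0 written in its expanded form: consume one list element A.
    tok([A|B], B) :- an_code(A).
    
    % True if X is the character code of an ASCII digit or letter.
    alpha_numeric(X) :- 
      between(0'0, 0'9, X); between(0'A, 0'Z, X); between(0'a, 0'z, X).
    % True if every character of the atom or string A is alphanumeric.
    an_code(A) :- atom_codes(A, C), maplist(alpha_numeric, C).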
    
    ?- phrase(expression, ['(', "IFF", "A1", "1A", ')']).
    true 
    
    ?- phrase(expression, ['(', "IFF", ".A1", "1A", ')']).
    false.
    
    ?- phrase(expression, ['(', "IFF", ".A1", "(1A", ')']).
    false.
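
    For reference, the pieces above can be assembled into one loadable file. This is a minimal sketch, assuming SWI-Prolog 7 or later (where double-quoted text is a string by default). The grammar is the one from the question, except that the operator nonterminals are written bin_op and unary_op (a renaming assumed here for readability) and tok is written in DCG notation rather than in its expanded form.

    ```prolog
    % Grammar from the question. The names bin_op and unary_op are an
    % assumption of this sketch (underscores instead of hyphens).
    expression --> op.

    op --> ['('], bin_op, bool, bool, [')'].
    op --> ['('], unary_op, bool, [')'].

    bool --> tok.
    bool --> op.

    bin_op --> ["IFF"].
    bin_op --> ["IF"].
    bin_op --> ["XOR"].
    unary_op --> ["NOT"].

    % tok//0: one list element whose text is entirely alphanumeric.
    tok --> [A], { an_code(A) }.

    % Parse-only check: every character code of A is an ASCII digit or letter.
    an_code(A) :- atom_codes(A, Cs), maplist(alpha_numeric, Cs).

    alpha_numeric(X) :-
        (   between(0'0, 0'9, X)
        ;   between(0'A, 0'Z, X)
        ;   between(0'a, 0'z, X)
        ).
    ```

    With this file loaded, a nested formula is accepted as well:

    ```prolog
    % succeeds
    ?- phrase(expression, ['(', "NOT", '(', "IF", "p", "q", ')', ')']).

    % fails: ".A1" contains a non-alphanumeric character
    ?- phrase(expression, ['(', "IFF", ".A1", "1A", ')']).
    ```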
    

    With an_code defined as follows, you can generate formulas too:

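    an_code(A) :-
        (   var(A)
        ->  % Generating: enumerate code lists of length 1, 2, ...
            length(C, L), L >= 1,
            maplist(alpha_numeric, C),
            string_codes(A, C)
        ;   % Parsing: check every character of the given token.
            atom_codes(A, C), maplist(alpha_numeric, C)
        ).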
    
    ?- phrase(expression, Ls).
    Ls = ['(', "IFF", "0", "0", ')'] ;
    Ls = ['(', "IFF", "0", "1", ')'] ;
    Ls = ['(', "IFF", "0", "2", ')'] ;
    
    ?- nth0(1, Ls, "XOR"), phrase(expression, Ls).
    Ls = ['(', "XOR", "0", "0", ')'] ;
    Ls = ['(', "XOR", "0", "1", ')'] ;
    Ls = ['(', "XOR", "0", "2", ')']
    
    ?- nth0(1, Ls, "NOT"), phrase(expression, Ls).
    Ls = ['(', "NOT", "0", ')'] ;
    Ls = ['(', "NOT", "1", ')'] ;
    Ls = ['(', "NOT", "2", ')'] 
    

    The generative version uses some SWI-Prolog built-ins (such as string_codes/2), so it may not work with other implementations.

    The SWI-Prolog built-in char_type/2 can also take the place of alpha_numeric, as in char_type(C, alnum). The following is DCG-style code using SWI-Prolog predicates:

    tok -->
        [A],
        { string_codes(A, AC),
          maplist([C]>>char_type(C, alnum), AC)
        }.
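
    If the input is raw text rather than a list of ready-made tokens, the same idea can be expressed one level down, as a DCG over character codes. The following is only a sketch under that assumption: alnum_token//1 and alnum_codes//1 are hypothetical names that do not appear in the grammar above, and code_type/2 and string_codes/2 are SWI-Prolog built-ins.

    ```prolog
    % Hypothetical helper: recognise one alphanumeric token at the front of
    % a list of character codes and return it as a string.
    alnum_token(Tok) -->
        alnum_codes(Cs),
        { Cs = [_|_],                % require at least one character
          string_codes(Tok, Cs) }.   % SWI-Prolog specific

    % Greedily collect codes that code_type/2 classifies as alnum.
    alnum_codes([C|Cs]) -->
        [C],
        { code_type(C, alnum) },
        !,
        alnum_codes(Cs).
    alnum_codes([]) --> [].
    ```

    A possible use:

    ```prolog
    ?- string_codes("A1", Cs), phrase(alnum_token(T), Cs).
    % T = "A1"

    ?- string_codes(".A1", Cs), phrase(alnum_token(T), Cs).
    % fails: the first character is not alphanumeric
    ```

    The cut makes the token greedy, so it always takes the longest run of alphanumeric characters; that is usually what a tokenizer wants, but it also means this rule is for parsing only, not for generating tokens.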