II am trying to write parser using prolog.
I have my tokenizer which returns list of tokens. For instance:
Tokens = [key(read),id('N'),sep(:=),int(10),....]
All I need is to make prolog to return set of instruction to run a program.
program = [].
program = [Instructions | Program].
The question is, what is easiest way to BUILD parse tree for given tokens and grammar(like bison
). I would be grateful for any help.
Following up on @mat's suggestion, and using your syntax for a token stream, here's a simple example.
Suppose you have the following BNF defining your language:
<program> ::= program begin <statement_list> end .
<statement_list> ::= <statement> | <statement> ; <statement_list>
<statement> ::= ...
You need to decide how you want to represent your parse tree. I've chosen what seems to be a clunky way to represent the n-ary parse tree as embedded lists (I didn't put a lot of thought into it), but hopefully this will give you the idea of how it works. Your DCG would directly reflect the BNF, and the argument would be the parse tree:
program([key(program), key(begin), AST_statement_list, key(end), sep(.)]) -->
[key(program), key(begin)],
statement_list(AST_statement),
[key(end), sep(.)].
statement_list([AST]) --> statement(AST).
statement_list([AST_statement | AST_statement_list]) -->
statement(AST_statement),
statement_list(AST_statement_list).
statement(AST) --> ...
You would parse by calling your DCG with the token stream:
phrase(program(ParseTree), TokenStream).
A program parse tree would look like:
[key(program), key(begin), [Statement1_List, Statement2_List, ...], key(end), sep(.)]
Each StatementN_List
would itself be an n-ary tree for the statement subtree.