language-design, bnf

BNF grammar for sequence of statements


If I am making a grammar for a C-like language that has a sequence of statements, what is the most standard way to define the grammar?

My thought is to do something like this:

<program> ::= <statement>
<statement> ::= <statement-head><statement-tail>
<statement-head> ::= <if-statement> | <var-declaration> | <assignment> | <whatever>
<statement-tail> ::= ; | ;<statement>

but that feels a little clunky to me. I've also considered making

<program> ::= <statement>*

or

<statement> ::= <statement-head> ; | <sequence>
<sequence>  ::= <statement> <statement>

type productions.

Is there a standard or accepted way to do this? I want my AST to be as clean as possible.


Solution

  • A very common way would be:

    <block-statement> ::= '{' <statement-list> '}' ;
    <statement-list> ::= /* empty */ | <statement-list> <statement> ;
    <statement> ::= <whatever> ';' ;

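    Then define the actual statement forms in place of <whatever>. It seems much cleaner to include the trailing semicolon as part of each individual statement rather than in the definition of the list non-terminal. Because <statement-list> is left-recursive, a parser can collect the statements into one flat list node instead of a chain of nested head/tail nodes, which keeps the AST clean.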
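
To show how that grammar maps onto a flat AST, here is a minimal hand-written parser sketch in Python. It is only an illustration, not a standard implementation: the `Assignment` and `Block` node types, the tokenizer, and the choice of a simple `name = value` assignment standing in for <whatever> are all assumptions made for this example. The left-recursive <statement-list> rule is handled with a loop, which is the usual way to express it in a recursive-descent parser.

```python
import re
from dataclasses import dataclass, field

# Hypothetical AST nodes for this sketch; they are not part of the grammar above.
@dataclass
class Assignment:
    name: str
    value: str

@dataclass
class Block:
    statements: list = field(default_factory=list)

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[{}=;])")

def tokenize(src):
    """Split the source into identifiers, integers, and punctuation."""
    src = src.strip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def parse_block(self):
        # <block-statement> ::= '{' <statement-list> '}'
        self.expect("{")
        statements = self.parse_statement_list()
        self.expect("}")
        return Block(statements)

    def parse_statement_list(self):
        # <statement-list> ::= /* empty */ | <statement-list> <statement>
        # The left recursion becomes a loop that fills one flat list.
        statements = []
        while self.peek() not in ("}", None):
            statements.append(self.parse_statement())
        return statements

    def parse_statement(self):
        # <statement> ::= <assignment> ';'  (assignment stands in for <whatever>)
        name = self.advance()
        if not name.isidentifier():
            raise SyntaxError(f"expected identifier, got {name!r}")
        self.expect("=")
        value = self.advance()
        if not (value.isidentifier() or value.isdigit()):
            raise SyntaxError(f"expected value, got {value!r}")
        self.expect(";")  # the semicolon belongs to the statement, not the list
        return Assignment(name, value)

def parse(src):
    parser = Parser(tokenize(src))
    block = parser.parse_block()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return block
```

With this sketch, `parse("{ x = 1; y = x; }")` produces a single `Block` whose `statements` list holds two `Assignment` nodes side by side, with no separate "tail" or "sequence" nodes in the tree. Statement forms that do not end in a semicolon, such as a nested block, could be added in `parse_statement` without touching the list rule.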