So I defined a grammar to parse an C style syntax language:
grammar mygrammar;
program
: (declaration)*
(statement)*
EOF
;
declaration
: INT ID '=' expression ';'
;
assignment
: ID '=' expression ';'
;
expression
: expression (op=('*'|'/') expression)*
| expression (op=('+'|'-') expression)*
| relation
| INT
| ID
| '(' expression ')'
;
relation
: expression (op=('<'|'>') expression)*
;
statement
: expression ';'
| ifstatement
| loopstatement
| printstatement
| assignment
;
ifstatement
: IF '(' expression ')' (statement)* FI ';'
;
loopstatement
: LOOP '(' expression ')' (statement)* POOL ';'
;
printstatement
: PRINT '(' expression ')' ';'
;
IF : 'if';
FI : 'fi';
LOOP : 'loop';
POOL : 'pool';
INT : 'int';
PRINT : 'print';
ID : [a-zA-Z][a-zA-Z0-9]*;
INTEGER : [0-9]+;
WS : [ \r\n\t] -> skip;
And I can parse a simple test as this:
int i = (2+3)*3/2*(3+36);
int j = i;
int k = 2*1+i*3;
if (k > 2)
k = k + 1;
i = i / 3;
j = j / 3;
fi;
loop (i < 10)
i = i + 1 * (i+k);
j = (j + 1) * (j-k);
k = i + j;
print(k);
pool;
However, when I want to generate ANTLR Recogonizers in intelliJ, I got this error:
sCalc.g4:19:0: left recursive rule expression contains a left recursive alternative which can be followed by the empty string
I wonder if this is caused by my ID
could be an empty string?
There are a couple of issues with your grammar:
INT
as an alternative inside expression
while you probably want INTEGER
insteadexpression (op=('+'|'-') expression)*
: this will do: expression op=('+'|'-') expression
relation
inside expression
Something like this ought to do it:
grammar mygrammar;
program
: (declaration)*
(statement)*
EOF
;
declaration
: INT ID '=' expression ';'
;
assignment
: ID '=' expression ';'
;
expression
: expression op=('*'|'/') expression
| expression op=('+'|'-') expression
| expression op=('<'|'>') expression
| INTEGER
| ID
| '(' expression ')'
;
statement
: expression ';'
| ifstatement
| loopstatement
| printstatement
| assignment
;
ifstatement
: IF '(' expression ')' (statement)* FI ';'
;
loopstatement
: LOOP '(' expression ')' (statement)* POOL ';'
;
printstatement
: PRINT '(' expression ')' ';'
;
IF : 'if';
FI : 'fi';
LOOP : 'loop';
POOL : 'pool';
INT : 'int';
PRINT : 'print';
ID : [a-zA-Z][a-zA-Z0-9]*;
INTEGER : [0-9]+;
WS : [ \r\n\t] -> skip;
Also not that this (statement)*
can simply be written as statement*