ANTLR4 doesn't parse the .g4 file the way I expected

We are a couple of guys doing a university project, which led us to play around with ANTLR4. We are figuring it out as we go, but we have stumbled upon an issue we can't seem to fix.

We are currently trying to get the "moveList" rule in the grammar working, and we have been testing it with a simple test.txt that contains: la = Path true ATTACK UP 5;

Currently we are working with this .g4 file:

grammar hess;

program: line* EOF;

defineBoard: 'BOARD' '(' BOARDPOSITION ')';

line: statement | ifBlock | whileBlock | defineBoard;

statement: (assignment | functionCall) ';';

ifBlock: 'if' expression block ('else' elseIfBlock);

elseIfBlock: block | ifBlock;

whileBlock: 'while' expression block ('else' elseIfBlock);

assignment: IDENTIFIER '=' moveList | IDENTIFIER '=' expression;

functionCall: IDENTIFIER '(' (expression (',' expression))? ')';

expression:
    constant                            # constantExpression
    | IDENTIFIER                        # identifierExpression
    | '(' expression ')'                # parenthesizedExpression
    | '!' expression                    # notExpression
    | expression multOp expression      # multiplicativeExpression
    | expression addOp expression       # additiveExpression
    | expression compareOp expression   # comparisonExpression
    | expression boolOp expression      # booleanExpression;

moveList: move | move moveTail;
moveTail: ',' move moveTail;
moveExtra:
    INTEGER
    | INTEGER direction INTEGER
    | direction INTEGER;
move: Movetype COLLISION Attacktype direction moveExtra;

IDENTIFIER: [a-zA-Z][a-zA-Z0-9];
constant:
    BOARDPOSITION
    | INTEGER
    | FLOAT
    | LETTER
    | STRING
    | BOOL
    | NULL;
BOARDPOSITION: LETTER INTEGER;
INTEGER: [1-9][0-9]* | [0];
FLOAT: [0-9]+ '.' [0-9]+;
STRING: ('"' ~'"'* '"') | ('\'' ~'\''* '\'');
LETTER: [a-zA-Z];

BOOL: 'true' | 'false';
COLLISION: BOOL;
NULL: 'null';

block: '{' line* '}';

WS: [ \t\r\n]+ -> skip;

multOp: '*' | '/' | '%';
addOp: '+' | '-';
boolOp: 'and' | 'or' | 'xor';
compareOp: '==' | '!=' | '>' | '<' | '>=' | '<=';
direction: 'UP' | 'LEFT' | 'RIGHT' | 'DOWN';
Movetype: 'Direct' | 'Path';
Attacktype: 'ATTACK' | 'MOVE' | 'ATKMOVE';

We expected the grammar to derive the input like so:

line -> statement -> assignment -> moveList

and then

"la" = IDENTIFIER
"Path" = Movetype
"ATTACK" = Attacktype
"UP" = direction
"5" = INTEGER

We get the following parse tree when we debug/run the .g4 file: [image: Parse Tree]

As you can see in the parse tree, it reports unexpected values for what, in our eyes, look like the expected values.

Solution

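The problem is that your COLLISION rule is a lexer rule, but the true in your input is already being tokenised as a BOOL token. Both rules match exactly the same text, and when two lexer rules match the same length of input, ANTLR picks the one defined first, so the lexer never emits a COLLISION token. If you rename the COLLISION rule in your grammar to collision, like this: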

move: Movetype collision Attacktype direction moveExtra;
collision: BOOL;

Then it becomes a parser rule that matches a BOOL token, and the parsing works for your test input:

(program (line (statement (assignment la = (moveList (move Path (collision true) ATTACK (direction UP) (moveExtra 5)))) ;)) <EOF>)
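
To see why the lexer never produced a COLLISION token, the tie-breaking can be sketched outside ANTLR. The Python below is only a toy model, not ANTLR's implementation: the function name token_type is hypothetical, the regular expressions are hand-translated from a few of the question's lexer rules, and it assumes the documented behaviour that the longest match wins and that ties go to the rule defined first.

```python
import re

# Toy model of the lexer's choice: longest match wins, and on a tie the
# rule listed first wins. The patterns are hand-translated from a subset
# of the question's lexer rules (an assumption of this sketch).
LEXER_RULES = [
    ("IDENTIFIER", r"[a-zA-Z][a-zA-Z0-9]"),  # exactly two characters, as written
    ("INTEGER", r"[1-9][0-9]*|0"),
    ("BOOL", r"true|false"),
    ("COLLISION", r"true|false"),            # same text as BOOL, defined later
]


def token_type(text, rules=LEXER_RULES):
    """Return (rule name, match length) for the token at the start of text."""
    best_name, best_len = None, 0
    for name, pattern in rules:
        match = re.match(pattern, text)
        # Strictly greater: a later rule with an equal-length match loses.
        if match and len(match.group()) > best_len:
            best_name, best_len = name, len(match.group())
    return best_name, best_len


if __name__ == "__main__":
    for lexeme in ["la", "true", "5"]:
        print(lexeme, token_type(lexeme))
```

Under this model, token_type("true") returns ("BOOL", 4): COLLISION matches the same four characters but is listed later, so it can never win. A parser rule such as collision: BOOL; avoids the problem because it consumes the BOOL token instead of competing with it in the lexer.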