validation antlr antlr4 rule-engine fix-protocol

ANTLR-based rule engine in Java

I am writing an ANTLR 4 grammar to implement a simple rule engine that parses FIX messages and specifies the action to be taken when a rule is violated.

This is where my grammar currently stands:

grammar RuleDefinition;

ruleset: rule+;

rule :  'tag(' INT ')' numberOp (INT | FLOAT| STRING) (ACTION_DIRECTOR action)?;

ID      :   [a-zA-Z]+ ;        // match identifiers
INT     :   [0-9]+;            // match integers
FLOAT   :   '0'..'9'+('.'('0'..'9')*)? ;            // match float
NEWLINE :'\r'? '\n' ;           // return newlines to parser (end-statement signal)
WS     : [ \t\n\r]+ -> skip ;   // toss out whitespace
NUMBER_OP      :   EQ|GR|GE|LS|LE|NE;
numberOp       :   EQ|GR|GE|LS|LE|NE;
EQ      :   '=';
GR: '>';
GE: '>=';
LS: '<';
LE: '<=';
NE: '!=';
ACTION_DIRECTOR : '->';
action: 'WARN' | 'ERROR';
STRING : '"' (' '..'~')* '"';

The problem is that the generated parser fails whenever a rule contains an ACTION_DIRECTOR (->). The error I get is "mismatched input 'ERROR' expecting ACTION".

Parsing is successful for:

tag(9)>0

Parsing fails for:

tag(9)>0 -> ERROR

Any pointers on how to correct the above are highly appreciated.

Solution

Look at these three lines:

WARN: 'WARN';
ERROR: 'ERROR';
ACTION: WARN|ERROR;

These are lexer rules (their names start with an upper-case character). The lexer is responsible for splitting your input into tokens, and each token has exactly one type. The input 'ERROR' can therefore have only one token type, and ANTLR chooses ERROR: both ERROR and ACTION match the same text, and ERROR is defined first. The parser, which expects an ACTION token, never sees one.
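The tie-break can be imitated in plain Java. This is only a toy model, not ANTLR's actual implementation: the class LexerTieBreak and the method tokenTypeOf are hypothetical names, and each lexer rule is approximated by a regular expression. It applies the two criteria ANTLR uses when several lexer rules match at the same position: the longest match wins, and on a tie the rule defined first wins.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LexerTieBreak {

    /**
     * Returns the name of the rule that wins at the start of the input,
     * or null if no rule matches. Rules are tried in definition order
     * (the map's iteration order).
     */
    public static String tokenTypeOf(Map<String, String> rules, String input) {
        String winner = null;
        int longest = 0;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            Matcher m = Pattern.compile(rule.getValue()).matcher(input);
            // lookingAt() anchors the match at the start of the input.
            // The strict '>' keeps the earlier rule when lengths are equal.
            if (m.lookingAt() && m.end() > longest) {
                longest = m.end();
                winner = rule.getKey();
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        // Same order as the three lexer rules quoted above.
        // Note: Java regex alternation is ordered rather than longest-match,
        // which makes no difference for these literal-only rules.
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("WARN", "WARN");
        rules.put("ERROR", "ERROR");
        rules.put("ACTION", "WARN|ERROR");

        // ERROR and ACTION both match all five characters; ERROR comes first.
        System.out.println(tokenTypeOf(rules, "ERROR")); // prints ERROR
    }
}
```

In this sketch, putting ACTION ahead of WARN and ERROR would make it win instead, but then WARN and ERROR could never be produced. Reordering only moves the problem around.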

To resolve this, turn some of the lexer rules into parser rules (their names start with a lower-case character):

rule :  'tag' '(' INT ')' numberOp (INT | FLOAT| STRING) (ACTION_DIRECTOR action)*;

....

action : WARN | ERROR;
numberOp :   EQ|GR|GE|LS|LE|NE;
stringOp :   EQ|NE;

...

Parser rules compose tokens instead of merging them into a new token type. The lexer still emits a WARN or an ERROR token, and the parser rule action accepts either one.