ANTLR 4.7.1 mismatch input error for date operation


I am trying to add date operations to an ANTLR grammar. I want to support the following form:

variable(date) after(date operation) date(value check against)

I have modified the grammar and tested it with the following string:

"01012010" AFTER date(01012009)

I am getting the following error:

line 1:5 mismatched input 'AFTER' expecting {<EOF>, AND, OR}

I am still learning to work with ANTLR and am not sure why this error appears. What is wrong in the grammar file?

Appreciate the help.

The grammar I am using:

grammar DateRule;

parse: expr EOF
    ;

expr    
 : expr binop expr                  #logicalExpression
 | lhs=VARIABLE compop rhs=VARIABLE #variableExpression
 | lhs=VARIABLE stringop rhs=STRING #stringExpression
 | lhs=VARIABLE compop rhs=NUMBER   #numberExpression
 | lhs=DATEVARIABLE dateop rhs=DATESTR  #dateExpression
 | TRUE                             #booleanTrue
 | FALSE                            #booleanFalse
 | VARIABLE                         #booleanVariable
 | LEFTPAREN expr RIGHTPAREN        #enclosedExpression
 ;

binop : AND | OR 
 ;

compop: EQUAL | LT | GT | LE | GE | NE 
      ;

stringop: CONT | STARTSWITH | EQUAL | ENDSWITH
      ;

dateop : AFTER | BEFORE ;

TRUE:       'true' | 'TRUE'  ;
FALSE:      'false' | 'FALSE';
STRING:     '"'   ~([\t\n\r]| '"')* '"'
     ;

LEFTPAREN:  '(';   
RIGHTPAREN: ')'; 
CONT  : 'CONTAINS' | 'contains';
STARTSWITH: 'STARTSWITH' | 'startswith' | 'sw' | 'SW';
ENDSWITH: 'ENDSWITH' | 'endswith' ;
AFTER:  'AFTER' | 'after';
BEFORE: 'BEFORE' | 'before';
BETWEEN: 'BETWEEN' | 'between';
DATESTR: 'date''('[0-3][1-9][1-12][2][0][0-9][0-9]')';
EQUAL     : '=' | 'EQ';
LT        : '<' | 'LT';
GT        : '>' | 'GT';
LE       : '<=' | 'LE';
GE       : '>=' | 'GE';
NE        : '!=' | 'NE';
AND       : 'AND' | '&' | 'and';
OR        : 'OR' | 'or' | '|';
VARIABLE  : [a-zA-Z]+[a-zA-Z0-9_.-]*;
NUMBER  : [0-9]+ ('.'[0-9]+)?;
DATEVARIABLE :'"'[0-3][1-9][1-12][2][0][0-9][0-9]'"' ;
SPACE     : [ \t\r\n] -> skip;

Solution

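  • "01012010" is tokenized as a STRING, not as a DATEVARIABLE. When two lexer rules match the same number of characters, ANTLR picks the rule that is defined first, and STRING is defined before DATEVARIABLE in the lexer.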

    You could place DATEVARIABLE above STRING:

    DATEVARIABLE :'"'[0-3][1-9][1-12][2][0][0-9][0-9]'"' ;
    STRING:     '"'   ~([\t\n\r]| '"')* '"';
    

    so that DATEVARIABLE gets precedence over STRING.

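    Because date-shaped input is then no longer a STRING token, the stringExpression alternative also has to accept DATEVARIABLE, for example through a small parser rule like this: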

    expr    
     : ...
     | lhs=VARIABLE stringop rhs=string #stringExpression
     | ...
     ;
    
    string
     : STRING
     | DATEVARIABLE
     ;