
ANTLR accepting special characters like . (dot) and , (comma) in identifier or expression


I'm trying to evaluate dynamic expressions against input data (a Map) using ANTLR. Using @Bart Kiers' answer from a Stack Overflow post, I was able to get this working.

I then tried to add IN, STARTSWITH and ENDSWITH conditions to the grammar and to implement the logic for them on the Java side. For the IN function, I'm trying to split a set of strings on a special character, preferably , (comma). But when I add commas to my expression, I get the following error:

line 1:64 token recognition error at: ','

Let me know how to allow special characters within the expression.

The grammar file I used:

grammar SimpleBoolean;

parse
 : expression EOF
 ;

expression
 : LPAREN expression RPAREN                       #parenExpression
 | NOT expression                                 #notExpression
 | left=expression op=comparator right=expression #comparatorExpression
 | left=expression op=binary right=expression     #binaryExpression
 | bool                                           #boolExpression
 | IDENTIFIER                                     #identifierExpression
 | DECIMAL                                        #decimalExpression
 ;

comparator
 : GT | GE | LT | LE | EQ | NE | IN | NOTIN | STARTSWITH | ENDSWITH | NULL | NOTNULL
 ;

binary
 : AND | OR
 ;

bool
 : TRUE | FALSE
 ;

AND        : 'AND' ;
OR         : 'OR' ;
NOT        : 'NOT';
TRUE       : 'TRUE' ;
FALSE      : 'FALSE' ;
GT         : '>' ;
GE         : '>=' ;
LT         : '<' ;
LE         : '<=' ;
EQ         : '=' ;
NE         : '!=' ;
IN         : 'IN' ;
NOTIN      : 'NOTIN' ;
STARTSWITH : 'STARTSWITH' ;
ENDSWITH   : 'ENDSWITH' ;
NULL       : 'NULL' ;
NOTNULL    : 'NOTNULL' ;
LPAREN     : '(' ;
RPAREN     : ')' ;
DECIMAL    : '-'? [0-9]+ ( '.' [0-9]+ )? ;
IDENTIFIER : [a-zA-Z_] [a-zA-Z_0-9]* ;
WS         : [ \r\t\u000C\n]+ -> skip;

Modified EvalVisitor.class

I added the lines below:

if (ctx.op.EQ() != null) {
    return this.visit(ctx.left).equals(this.visit(ctx.right));
}
else if (ctx.op.IN() != null) {
    // Split the right-hand value on commas and look for the left-hand value.
    String[] checkVal = this.visit(ctx.right).toString().split(",");
    String leftVal = this.visit(ctx.left).toString();

    for (String value : checkVal) {
        if (value.equals(leftVal)) {
            return true;
        }
    }
    return false;
}
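The other operators mentioned above could be evaluated along the same lines. The following is only a minimal sketch that works on already-evaluated values rather than on the ANTLR context objects: the class name ComparatorSketch and its methods are hypothetical, and the semantics chosen for STARTSWITH and ENDSWITH (the left value starts or ends with the right value) are an assumption, since the question does not define them.

```java
/** Hypothetical helper; not part of ANTLR or of the original EvalVisitor. */
final class ComparatorSketch {

    /** Applies the named operator to two already-evaluated operands. */
    static boolean evaluate(String op, Object left, Object right) {
        String l = String.valueOf(left);
        String r = String.valueOf(right);
        switch (op) {
            case "=":          return l.equals(r);
            case "!=":         return !l.equals(r);
            // Assumed semantics: the left value starts/ends with the right value.
            case "STARTSWITH": return l.startsWith(r);
            case "ENDSWITH":   return l.endsWith(r);
            case "IN":         return containsValue(r, l);
            case "NOTIN":      return !containsValue(r, l);
            default:
                throw new IllegalArgumentException("unsupported operator: " + op);
        }
    }

    /** True when value equals one of the comma-separated candidates (whitespace trimmed). */
    static boolean containsValue(String commaSeparated, String value) {
        for (String candidate : commaSeparated.split(",")) {
            if (candidate.trim().equals(value)) {
                return true;
            }
        }
        return false;
    }
}
```

In a visitor, each branch such as ctx.op.STARTSWITH() != null could delegate to a helper of this kind after visiting ctx.left and ctx.right.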

The expression I passed:

ID = ID AND NOT ( comments = comments AND system IN system,admin,developer )

Solution

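  • The error occurs because no lexer rule matches the comma. Add expression ( ',' expression )+ at the start of the expression rule; in a combined grammar, the literal ',' inside a parser rule implicitly defines a token for it: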

    expression
     : expression ( ',' expression )+ #multipleExpression
     | ...
     ;
    

    EDIT

    After adding the rule above:

    grammar SimpleBoolean;
    
    parse
     : expression EOF
     ;
    
    expression
     : expression ( ',' expression )+                 #multipleExpression
     | LPAREN expression RPAREN                       #parenExpression
     | NOT expression                                 #notExpression
     | left=expression op=comparator right=expression #comparatorExpression
     | left=expression op=binary right=expression     #binaryExpression
     | bool                                           #boolExpression
     | IDENTIFIER                                     #identifierExpression
     | DECIMAL                                        #decimalExpression
     ;
    
    comparator
     : GT | GE | LT | LE | EQ | NE | IN | NOTIN | STARTSWITH | ENDSWITH | NULL | NOTNULL
     ;
    
    binary
     : AND | OR
     ;
    
    bool
     : TRUE | FALSE
     ;
    
    AND        : 'AND' ;
    OR         : 'OR' ;
    NOT        : 'NOT';
    TRUE       : 'TRUE' ;
    FALSE      : 'FALSE' ;
    GT         : '>' ;
    GE         : '>=' ;
    LT         : '<' ;
    LE         : '<=' ;
    EQ         : '=' ;
    NE         : '!=' ;
    IN         : 'IN' ;
    NOTIN      : 'NOTIN' ;
    STARTSWITH : 'STARTSWITH' ;
    ENDSWITH   : 'ENDSWITH' ;
    NULL       : 'NULL' ;
    NOTNULL    : 'NOTNULL' ;
    LPAREN     : '(' ;
    RPAREN     : ')' ;
    DECIMAL    : '-'? [0-9]+ ( '.' [0-9]+ )? ;
    IDENTIFIER : [a-zA-Z_] [a-zA-Z_0-9]* ;
    WS         : [ \r\t\u000C\n]+ -> skip;
    

    Your example input 9355560 = 9355560 AND NOT ( comments = comments AND system IN system,admin,developer ) is then parsed without errors.

    [Image: parse tree for the input above]
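With the new alternative, the right-hand side of IN arrives as a list of sub-expressions instead of a single string, so the visitor no longer needs to split on commas. The sketch below shows one way the membership test could look once the operands have been evaluated. It is an assumption-laden illustration: the class InListSketch and its methods are hypothetical, it assumes a visitor method for #multipleExpression that returns a java.util.List of the visited children, and it compares values by their string form as the code in the question does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of ANTLR or of the original EvalVisitor. */
final class InListSketch {

    /** Flattens possibly nested lists of evaluated operands into one flat list. */
    static List<Object> flatten(Object value) {
        List<Object> out = new ArrayList<>();
        if (value instanceof List<?>) {
            for (Object item : (List<?>) value) {
                out.addAll(flatten(item));
            }
        } else {
            out.add(value);
        }
        return out;
    }

    /** IN: true when left equals any element of right, compared by string form. */
    static boolean in(Object left, Object right) {
        String needle = String.valueOf(left);
        for (Object candidate : flatten(right)) {
            if (needle.equals(String.valueOf(candidate))) {
                return true;
            }
        }
        return false;
    }
}
```

Flattening is included because the comma rule is left-recursive, so depending on how the tree is built, a list may contain another list as its first element. A single value on the right-hand side is treated as a one-element list.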