
ANTLR doesn't recognize the wrong keyword


I'm new to ANTLR. I'm building a compiler for a simple language, but I don't understand why the parser doesn't report the right error when I write a wrong keyword.

Here is my grammar:

            grammar Exercise;

            block  : '{' statement* '}';

            statement : assignment ';' 
                      | deletion ';' 
                      | print ';'
                      | declaration* ';'
                      | ifStat
                      | functionDecl
                      | exp
                      | block+
                      ;


            assignment : ID '=' exp;

            type  : 'int'
                  | 'boolean'
                  ;

            typeF  : 'void' ;


            declaration : type ID ;

            deletion : 'delete' ID;

            print  : 'print' exp;

            bool  : 'true' 
                  | 'false' 
                  ;

            exp : '(' exp ')'   
                | ID '(' expList? ')'
                | NUMBER 
                | bool
                | ID
                ;

            expIF   : ID EQ ID
                    | ID EQ bool
                    | ID GT ID
                    | ID LT ID
                    | ID NEQ ID 
                    | ID GTEQ ID
                    | ID LTEQ ID
                    | NOT ID
                    | ID 
                    ;

            ifStat  : 'if' '('expIF')' 'then' block ('else' block)? ;

            formalParameter  : declaration 
                             | rif declaration
                             ;

            rif : 'var';

            formalParameters    : formalParameter (',' formalParameter)* ; 

            functionDecl    : typeF ID LPAR formalParameters? RPAR block ; 

            expList : ID (',' ID )* ; 

            //IDs
            fragment CHAR  : 'a'..'z' |'A'..'Z' ;
            ID              : (CHAR)+ | (DIGIT)+ ;

            //Numbers
            fragment DIGIT : '0'..'9'; 
            NUMBER          : DIGIT+;

            OR : '||';
            AND : '&&';
            NOT : '!';
            EQ : '==';
            NEQ : '!=';
            GT : '>';
            LT : '<';
            GTEQ : '>=';
            LTEQ : '<=';
            LPAR : '(';
            RPAR : ')';

            //ESCAPE SEQUENCES
            WS              : (' '|'\t'|'\n'|'\r')-> skip;
            LINECOMMENTS  : '//' (~('\n'|'\r'))* -> skip;
            BLOCKCOMMENTS    : '/*'( ~('/'|'*')|'/'~'*'|'*'~'/'|BLOCKCOMMENTS)* '*/' -> skip;
            ERR: . -> channel(HIDDEN);

Here is my main:

    public static void main(String[] args) {

        // create the lexer over the test input
        ExerciseLexer lexer = new ExerciseLexer(new ANTLRInputStream("{ double a ;  boolean d; a = 4 ; {boolean d ; int a} int s;}") );

        // create the parser over the lexer's token stream
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExerciseParser parser = new ExerciseParser(tokens);

        // tell the parser to build the parse tree
        parser.setBuildParseTree(true);

        // parse starting from the block rule, then walk the tree with the custom visitor
        ExerciseVisitorImpl visitor = new ExerciseVisitorImpl();
        ParseTree pt = parser.block();
        visitor.visit(pt);
    }

For example, in this case, I should get an error for the "double" keyword, but I get "line 1:51 extraneous input '}' expecting {'boolean', ';', 'int'}". What is the problem? Thank you so much!


Solution

  • In your grammar, a statement is an exp. You probably meant exp ';'.

    As written, a block is statement* and that can match exp exp. Since ID is an exp and double and a are both IDs, double a is recognised as two consecutive statements.

    Also, your grammar recognises declaration* ';' as a statement. Since declaration* includes the case of zero declarations -- that is, the empty string -- a lone ; matches that production. I don't know if that is really what you want, but I strongly suspect that you did not want to match two consecutive declarations without a semicolon separating them.
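
As a rough sketch of the two changes suggested above, the statement rule might look like the following. It assumes you want exactly one declaration per statement and that an expression statement must end with a semicolon; the rest of the grammar is left as posted.

```antlr
statement : assignment ';'
          | deletion ';'
          | print ';'
          | declaration ';'   // exactly one declaration, then ';' (was declaration* ';')
          | ifStat
          | functionDecl
          | exp ';'           // an expression statement now needs its own ';' (was exp)
          | block+
          ;
```

With this rule, double a ; can no longer be read as the two statements double and a, so the parser should complain at a instead of accepting the input silently. Note that the message quoted in the question, at line 1:51, points at the } that follows int a in the test input, where a ; is missing; that error is separate from the double problem and will still be reported.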