
Catching (and keeping) all comments with ANTLR


I'm writing a grammar in ANTLR that parses Java source files into ASTs for later analysis. Unlike other parsers (like JavaDoc), I'm trying to keep all of the comments. This is difficult because comments can appear literally anywhere in the code. If a comment shows up at a point where the grammar doesn't expect one, ANTLR can't finish parsing the file.

Is there a way to make ANTLR automatically add any comments it finds to the AST? I know the lexer can simply ignore all of the comments using either {skip();} or by sending the text to the hidden channel. With either of those options set, ANTLR parses the file without any problems at all.

Any ideas are welcome.


Solution

  • Is there a way to make ANTLR automatically add any comments it finds to the AST?

    No. You'll have to sprinkle references to an extra comments rule throughout your grammar, one at every place where a comment can validly occur:

    ...
    
    if_stat
     : 'if' comments '(' comments expr comments ')' comments ...
     ;
    
    ...
    
    comments
     : (SingleLineComment | MultiLineComment)*
     ;
    
    SingleLineComment
     : '//' ~('\r' | '\n')*
     ;
    
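    // ANTLR 3 treats '.*' as non-greedy by default, so this stops at the first '*/'.
    // ANTLR 4 needs the explicit non-greedy form '.*?' instead.
    MultiLineComment
     : '/*' .* '*/'
     ;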
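
To see what the two lexer rules above actually match, here is a minimal sketch in plain Java that mimics them with a regular expression. It is not ANTLR code and not part of the answer's grammar; the class name `CommentScanner` and the method `findComments` are made up for illustration. It also assumes the input has no string literals containing `//` or `/*`, which a real lexer would handle and this regex does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: approximates the two comment
// lexer rules with a regex instead of a generated ANTLR lexer.
public class CommentScanner {

    // First alternative mirrors SingleLineComment: '//' ~('\r' | '\n')*
    // Second alternative mirrors MultiLineComment: '/*' .* '*/' with a
    // non-greedy body; DOTALL lets '.' match line breaks inside the comment.
    private static final Pattern COMMENT =
            Pattern.compile("//[^\\r\\n]*|/\\*.*?\\*/", Pattern.DOTALL);

    // Returns every comment found in the source text, in order of appearance.
    // Caveat (assumption): comment markers inside string literals are not
    // excluded, so this is only a rough approximation of a lexer.
    public static List<String> findComments(String source) {
        List<String> comments = new ArrayList<>();
        Matcher m = COMMENT.matcher(source);
        while (m.find()) {
            comments.add(m.group());
        }
        return comments;
    }

    public static void main(String[] args) {
        String src = "if /* a */ (x) // b\n{ }";
        for (String c : findComments(src)) {
            System.out.println(c);
        }
    }
}
```

The sample input puts one comment between `if` and `(` and another after `)`, which are exactly the kinds of positions the `comments` rule references in `if_stat` are there to cover.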