java netbeans antlr antlr4 netbeans-platform

NetBeans lexer throws exception recognizing trailing whitespace


I have a simple grammar written in ANTLR4 that includes (among others) a whitespace rule:

WhiteSpace : [ \t\r\n]+ -> skip;

This is integrated into a NetBeans platform application using org.netbeans.spi.lexer.Lexer. When the input has trailing whitespace (before EOF), I get the following exception:

java.lang.IllegalStateException: Lexer ExpressionLexer@2cdea2eb
  returned null token but lexerInput.readLength()=1
  lexer-state: null
  tokenStartOffset=20, readOffset=21, lookaheadOffset=22
  Chars: "\n" - these characters need to be tokenized.
Fix the lexer to not return null token in this state.

How can I make this trailing whitespace not cause an error?

Edit: This works correctly, without error, when using only the ANTLR lexer and parser code. The error occurs only when integrating with the NetBeans lexer (and possibly other integrations).


Solution

  • Change the WhiteSpace rule to send the token to a hidden channel rather than skipping altogether.

    WhiteSpace : [ \t\r\n]+ -> channel(HIDDEN);
    

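    The parser still won't see the whitespace, because it reads only the default channel, but the NetBeans lexer gets a valid token for every character of the input. With -> skip, ANTLR emits no token for the trailing whitespace, so those characters are read but never tokenized, which is what the exception reports.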
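
To make the difference concrete, here is a minimal toy model in plain Java. It uses neither the ANTLR nor the NetBeans API; the class `TokenCoverageDemo`, the `Span` type, and both methods are hypothetical names made up for illustration. It only mimics the rule the exception message describes: every character that was read must belong to some token.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT the NetBeans or ANTLR API. All names are hypothetical.
class TokenCoverageDemo {

    // A token reduced to its kind and a half-open [start, end) range of the input.
    static final class Span {
        final String kind;
        final int start;
        final int end;

        Span(String kind, int start, int end) {
            this.kind = kind;
            this.start = start;
            this.end = end;
        }
    }

    // Splits the input into runs of whitespace (WS) and non-whitespace (TEXT).
    // With skipWhitespace = true the WS runs are consumed but no token is
    // emitted, which imitates the effect of "-> skip".
    static List<Span> tokenize(String input, boolean skipWhitespace) {
        List<Span> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            boolean ws = Character.isWhitespace(input.charAt(i));
            while (i < input.length() && Character.isWhitespace(input.charAt(i)) == ws) {
                i++;
            }
            if (ws && skipWhitespace) {
                continue; // characters consumed, nothing emitted
            }
            tokens.add(new Span(ws ? "WS" : "TEXT", start, i));
        }
        return tokens;
    }

    // Returns the offset of the first character not covered by any token,
    // or -1 if the tokens cover the whole input without gaps.
    static int firstUncoveredOffset(String input, List<Span> tokens) {
        int expected = 0;
        for (Span t : tokens) {
            if (t.start != expected) {
                return expected;
            }
            expected = t.end;
        }
        return expected == input.length() ? -1 : expected;
    }

    public static void main(String[] args) {
        String input = "a+b\n"; // trailing newline, as in the question
        System.out.println(firstUncoveredOffset(input, tokenize(input, true)));
        System.out.println(firstUncoveredOffset(input, tokenize(input, false)));
    }
}
```

In this model the skipping variant leaves the trailing newline at offset 3 without a token, while the variant that emits whitespace tokens covers the whole input. How a real wrapper around the ANTLR lexer handles whitespace in the middle of the input depends on how that wrapper is written, so this sketch covers only the trailing case.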