
Lexer tokens are not identified when using antlr4


I'm trying to create a simple parser using ANTLR4, and I have an issue with recognizing lexer tokens. Even though there is a ':' after the word SAYS, the parser rule does not match it. Likewise, for mentions, '@michael' is not recognized.

The input text is: john SAYS: hello @michael this will not work

//Lexer Rule

grammar ChatLexer;

/*
 * Lexer Rules
 */
fragment A          : ('A'|'a') ;
fragment S          : ('S'|'s') ;
fragment Y          : ('Y'|'y') ;
fragment H          : ('H'|'h') ;
fragment O          : ('O'|'o') ;
fragment U          : ('U'|'u') ;
fragment T          : ('T'|'t') ;
fragment LOWERCASE  : [a-z] ;
fragment UPPERCASE  : [A-Z] ;
SAYS                : S A Y S ;
SHOUTS              : S H O U T S;
WORD                : (LOWERCASE | UPPERCASE | '_')+ ;
WHITESPACE          : (' ' | '\t') ;
NEWLINE             : ('\r'? '\n' | '\r')+ ;
TEXT                : ('['|'(') ~[\])]+ (']'|')') ;

fragment COLON          : ':';
fragment DASH           : '-';
fragment LEFTBRACKET    : '(';
fragment RIGHTBRACKET   : ')';
fragment LEFTSQRBRACKET : '[';
fragment RIGHTSQRBRACKET: ']';
fragment AT             : '@';
fragment SLASH          : '/';

//Parser Rules

parser grammar ChatParser;

/*
 * Parser Rules
 */
chat                : line+ EOF ;
line                : name command message NEWLINE;
message             : (emoticon | link | color | mention | WORD | WHITESPACE)+ ;
name                : WORD WHITESPACE;
command             : (SAYS | SHOUTS) COLON WHITESPACE ;

emoticon            : COLON DASH? RIGHTBRACKET
                    | COLON DASH? LEFTBRACKET
                    ;
link                : TEXT TEXT ;
color               : SLASH WORD SLASH message SLASH;
mention             : AT WORD ;

This is the parse tree I get from the chat rule.

I don't understand why ':' and '@' are not recognised.


Solution

  • A fragment can only be used by other lexer rules, never in parser rules. Remove the fragment keyword from the COLON and AT rules.
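
    As a minimal sketch of that change, assuming the rest of the lexer stays as posted, COLON and AT would become ordinary token rules. The parser rules emoticon and color also reference DASH, LEFTBRACKET, RIGHTBRACKET and SLASH, so those would presumably need the same treatment (this goes beyond the two rules named above and is an assumption based on the posted grammar):

    ```antlr
    // Without the fragment keyword, these rules produce tokens of their own,
    // so parser rules such as command and mention can match them.
    COLON           : ':' ;
    AT              : '@' ;

    // Also referenced from parser rules (emoticon, color), so they likely
    // need to be real tokens as well.
    DASH            : '-' ;
    LEFTBRACKET     : '(' ;
    RIGHTBRACKET    : ')' ;
    SLASH           : '/' ;
    ```

    The letter rules (A, S, Y, ...) and LOWERCASE/UPPERCASE can stay fragments, since only other lexer rules use them.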

    Some background information w.r.t. lexers and parsers: