
ANTLR Grammar for Recognizing Exponential Notation Numbers and Identifiers


I'm designing an ANTLR grammar that needs to at least recognize numbers in exponential notation, variable identifiers, and strings. I'm running into an issue where the grammar recognizes 1E4 as if E starts an identifier, instead of recognizing it as a number in exponential notation. I'm testing this on ANTLR Lab (lab.antlr.org).

Here's the test grammar I'm using:

grammar prueba;

// Parser rules
program : (numeric_constant | identifier)* EOF;

numeric_constant   : sign? number (EXPONENT sign? integer)? ;
number             : integer ('.' integer?)? ;
integer            : DIGIT+ ;
sign               : PLUS_SIGN | MINUS_SIGN ;
identifier         : LETTER (LETTER | DIGIT)* ;

// Lexer rules
EXPONENT           : 'E' ;
PLUS_SIGN          : '+' ;
MINUS_SIGN         : '-' ;
LETTER             : [A-Za-z] ;
DIGIT              : [0-9] ;

When I test it with the input 1E4, I get the following error:

1:0 mismatched input '1' expecting {<EOF>, '+', '-', ',', ';'}

Any insights on why it doesn't recognize 1E4 as a number and how to fix it?

What I tried:

I designed the grammar with the goal of identifying numbers in exponential notation like 1E4. I expected the grammar to recognize 1E4 as a numeric_constant, but instead, it seems to be treating E as the start of an identifier.

What I expected:

Given the input 1E4, I expected it to be parsed as a numeric_constant. However, the parser appears to be treating it as the beginning of an identifier, which isn't what I intended.


Solution

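  • In ANTLR Lab, click the "Lexer" tab in the top-left and delete its contents.

    Your grammar is a combined lexer/parser grammar, so it already defines its own tokens, but the separate lexer specification left in the "Lexer" tab is interfering with it. The error message mentions ',' and ';', which appear nowhere in your grammar, and that suggests the tokens are coming from that other lexer.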
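
    Separately from the Lab fix, one thing to be aware of in the grammar as posted: EXPONENT is listed before LETTER, so a lone E is always tokenized as EXPONENT, and an identifier containing an uppercase E can never match the identifier rule. One common way around this is to recognize whole numbers and identifiers in the lexer instead of the parser. The following is only a sketch under that assumption; the grammar name prueba_lexer_tokens and the NUMBER, IDENTIFIER, and WS rules are illustrative names, not part of the original grammar:

```antlr
grammar prueba_lexer_tokens;   // hypothetical name for this sketch

// Parser rules
program          : (numeric_constant | IDENTIFIER)* EOF ;
numeric_constant : sign? NUMBER ;
sign             : PLUS_SIGN | MINUS_SIGN ;

// Lexer rules
// NUMBER covers the same shapes as the original parser rules:
// digits, an optional fractional part, and an optional uppercase-E exponent.
NUMBER     : DIGIT+ ('.' DIGIT*)? ('E' [+-]? DIGIT+)? ;
IDENTIFIER : LETTER (LETTER | DIGIT)* ;
PLUS_SIGN  : '+' ;
MINUS_SIGN : '-' ;

// Assumption: whitespace separates tokens and can be discarded.
WS         : [ \t\r\n]+ -> skip ;

// Fragments are building blocks only; they never become tokens themselves.
fragment LETTER : [A-Za-z] ;
fragment DIGIT  : [0-9] ;
```

    With this layout the lexer's longest-match behavior turns 1E4 into a single NUMBER token, while E4 on its own becomes an IDENTIFIER.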