ANTLR: not match if a certain character follows


The following code is completely valid in the V programming language:

fn main() {
    a := 1.
    b := .1
    println("$a $b")
    
    for i in 0..10 {
        println(i)
    }
}

I want to write a lexer for syntax coloring of such files. 1. and .1 should be matched by the FloatNumber fragment, while the .. in the for loop should be matched by a punctuation rule. The problem is that my FloatNumber implementation already matches 0. and .10 in 0..10, and I have no idea how to tell it not to match if a . follows (or precedes it). Slightly simplified (leaving possible underscores aside), my grammar looks like this:

fragment FloatNumber
    : ( Digit+   ('.' Digit*)?  ([eE]  [+-]?  Digit+)?
      | Digit*    '.' Digit+    ([eE]  [+-]?  Digit+)?
      )
    ;

fragment Digit
    : [0-9]
    ;

Solution

  • You will have to introduce a semantic predicate that checks that no . follows when matching a float of the form 1. (digits followed by a dot).

    The following rules:

    Plus
     : '+'
     ;
    
    FloatLiteral
     : Digit+ '.' {_input.LA(1) != '.'}?
     | Digit* '.' Digit+
     ;
    
    Int
     : Digit+
     ;
    
    Range
     : '..'
     ;
    

    given the input "1.2 .3 4. 5 6..7 8.+9" (and assuming whitespace is skipped), will produce the following tokens:

    FloatLiteral              `1.2`
    FloatLiteral              `.3`
    FloatLiteral              `4.`
    Int                       `5`
    Int                       `6`
    Range                     `..`
    Int                       `7`
    FloatLiteral              `8.`
    Plus                      `+`
    Int                       `9`
    

    Code inside a predicate is target-specific. The predicate above ({_input.LA(1) != '.'}?) works with the Java target; other targets need the equivalent expression in their own language.
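
To sanity-check the lookahead idea without generating a parser, the same rules can be approximated with a regular expression that uses a negative lookahead in place of the predicate. This is only a rough sketch and not ANTLR itself: the names TOKEN_RE and tokenize are made up for this illustration, whitespace is assumed to be skipped, and the alternatives are ordered by hand because Python's alternation takes the first match while ANTLR picks the longest one.

```python
import re

# Hypothetical stand-in for the lexer rules above (not generated by ANTLR).
# Python alternation is first-match, so the longer float form is listed first.
TOKEN_RE = re.compile(r"""
    (?P<FloatLiteral> \d*\.\d+         # digits optional, dot, digits: 1.2 and .3
                    | \d+\.(?!\.) )    # digits, dot, but only if no second dot follows
  | (?P<Int>   \d+ )
  | (?P<Range> \.\. )
  | (?P<Plus>  \+ )
  | (?P<WS>    \s+ )                   # assumed to be skipped
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (rule name, matched text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for name, value in tokenize("1.2 .3 4. 5 6..7 8.+9"):
        print(f"{name:<14} {value}")
```

The `(?!\.)` lookahead plays the role of `_input.LA(1) != '.'`: it rejects the `6.` in `6..7` without consuming the second dot, so the range rule still sees both dots.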