Jison Lex without white spaces


I have this Jison lexer and parser:

%lex
%%

\s+              /* skip whitespace */
'D01'            return 'D01'
[xX][+-]?[0-9]+  return 'COORD'
<<EOF>>          return 'EOF'
.                return 'INVALID'

/lex

%start source
%%

source
: command EOF;

command
: D01 COORD;

It will tokenize and parse D01 X45 but not D01X45. What am I missing?


Solution

  • Unlike (f)lex, and indeed the vast majority of scanner generators, jison scanners do not implement the longest-match rule. Instead, the first matching pattern wins.

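    To make this work for keywords, jison scanners also restrict simple literal strings, such as 'D01', so that they match only if they end on a word boundary. In D01X45 the 1 is followed directly by X, so there is no word boundary there: the 'D01' rule fails, COORD cannot match at D, and the input falls through to the catch-all INVALID rule.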

    The workaround is to enclose the literal string pattern in redundant parentheses:

    ("D01")       { return 'D01'; }
    

    This is documented in the jison wiki.
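
To see why the parentheses matter, here is a minimal sketch in plain JavaScript (the language jison actions are written in). It is not jison's actual code: `compilePattern`, `makeRules`, and `tokenize` are hypothetical helpers. It assumes that jison anchors each pattern at the current input position and appends `\b` to any pattern whose source ends in a word character.

```javascript
// Hypothetical stand-in for how a jison lexer pattern becomes a regex.
// Assumption: a pattern whose source ends in a word character gets "\b" appended.
function compilePattern(source) {
  const endsInWordChar = /\w$/.test(source);
  return new RegExp("^(?:" + source + (endsInWordChar ? "\\b" : "") + ")");
}

// Builds the question's rule list; d01Source is the only part that varies.
function makeRules(d01Source) {
  return [
    [compilePattern("\\s+"), null],               // null means: skip, emit nothing
    [compilePattern(d01Source), "D01"],
    [compilePattern("[xX][+-]?[0-9]+"), "COORD"],
    [compilePattern("."), "INVALID"],
  ];
}

// First matching rule wins (no longest-match comparison between rules).
function tokenize(input, rules) {
  const tokens = [];
  let rest = input;
  while (rest.length > 0) {
    const rule = rules.find(([regex]) => regex.test(rest));
    if (!rule) throw new Error("no rule matches: " + rest);
    const [regex, name] = rule;
    const matched = regex.exec(rest)[0];
    if (name !== null) tokens.push(name);
    rest = rest.slice(matched.length);
  }
  return tokens;
}

const literalRules = makeRules("D01");   // compiled as /^(?:D01\b)/
const groupedRules = makeRules("(D01)"); // compiled as /^(?:(D01))/

console.log(tokenize("D01 X45", literalRules)); // [ 'D01', 'COORD' ]
console.log(tokenize("D01X45", literalRules));  // [ 'INVALID', 'INVALID', 'INVALID', 'COORD' ]
console.log(tokenize("D01X45", groupedRules));  // [ 'D01', 'COORD' ]
```

In the sketch, the trailing `)` of the grouped pattern is not a word character, so no `\b` is appended and D01 matches even when X follows immediately. A real jison parser would stop with a parse error at the first INVALID token rather than continue scanning.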