Search code examples
regexunicodef#fslex

How to specify unicode characters in patterns for fslex


What is the correct way to specify Unicode characters in pattern for FSharp Lexer. Following code is not compiled with the FsLex.exe utility:

let lexeme lexbuf = LexBuffer<char>.LexemeString lexbuf
...
rule tokenize = parse   

| ['a'-'z' 'A'-'Z'] { TOKEN1 }  
| [\u0100\u0101]    { TOKEN2 } 
| [\u0102-\u01FF]   { TOKEN3 }  
...
| [eof]             { EOF }

What I'm doing wrong?

P.S: I'm using fslex.exe with --unicode option

Thanks, Vitaliy


Solution

  • I think you have to put the unicode characters in single quotes, just like in normal F# code.

    At least that seems to work for a small example that I tested.