I'm trying to write a small DSL parser using fslex and fsyacc. The input is composed of interleaved chunks of two different languages which require different lexing rules. How do I write my fslex file to support that?
(I guess a similar case would be how to define an fslex file for the C language with support for inline assembly, which requires different lexing rules?)
What I have currently is something like this:
rule tokenize = parse
| "core" { core lexbuf }
...
and core = parse
| ...
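To make that a bit more concrete, the full file has roughly this shape (the token names here are invented, just to show the structure):

rule tokenize = parse
| "core"            { core lexbuf }
| ['a'-'z']+        { ID (LexBuffer<_>.LexemeString lexbuf) (* made-up DSL token *) }
| [' ' '\t' '\n']+  { tokenize lexbuf (* skip whitespace, stay in this rule *) }
| eof               { EOF }

and core = parse
| ['0'-'9']+        { NUM (int (LexBuffer<_>.LexemeString lexbuf)) (* made-up core token *) }
| [' ' '\t' '\n']+  { core lexbuf }
| eof               { EOF }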
The thing is, once a token is returned by the core rule, the next part of the input gets passed to tokenize instead. However, I want to stay (as it were) in the core state. How do I do that?
Thanks!
I actually managed to find a solution on my own. I defined my own tokenizer function which decides, based on the BufferLocalStore state, which lexer rule to call.
let mytokenizer (lexbuf : LexBuffer<char>) =
    // Dispatch to the lexer rule recorded in the buffer's local store
    if lexbuf.BufferLocalStore.["state"].Equals("core") then FCLexer.core lexbuf
    else FCLexer.tokenize lexbuf

let aString (x : string) =
    let lexbuf = LexBuffer<_>.FromString x
    // Start out in the default ("fc") lexing state
    lexbuf.BufferLocalStore.["state"] <- "fc"
    let y = try (FCParser.PROG mytokenizer) lexbuf
    ...
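For what it's worth, BufferLocalStore is just a string-keyed dictionary of objects attached to the LexBuffer, so the state travels with the buffer itself instead of living in some global mutable, which keeps things tidy if you ever lex more than one buffer at a time.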
And I modified my fslex input file slightly:
rule tokenize = parse
| "core" { lexbuf.BufferLocalStore.["state"] <- "core"; core lexbuf }
...
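The switch back works the same way; a sketch only, assuming the core chunk is closed by some terminator (the endcore keyword here is made up):

and core = parse
| "endcore" { lexbuf.BufferLocalStore.["state"] <- "fc"; tokenize lexbuf }
...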
Amazing how simply asking the question can lead you to the solution, and I hope this helps someone besides me :)