fslex - How to switch between two token sets?


I'm trying to write a small DSL parser using fslex and fsyacc. The input is composed of interleaved chunks of two different languages, each of which requires its own lexing rules. How do I write my fslex file to support that?

(I guess a similar case would be defining an fslex file for the C language with support for inline assembly, which requires different lexing rules.)

What I have currently is something like this:

rule tokenize = parse
    | "core"        { core lexbuf }
    ...

and core = parse
    | ...

The problem is that once the core rule returns a token, the next part of the input gets passed to tokenize again. However, I want to stay (as it were) in the core state. How do I do that?

Thanks!


Solution

  • I actually managed to find a solution on my own. I defined my own tokenizer function, which decides which lexer rule to call based on a "state" entry in the lexbuf's BufferLocalStore.

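    // Called by the parser for every token; picks the lexer rule from the stored state.
    let mytokenizer (lexbuf : LexBuffer<char>) =
        if lexbuf.BufferLocalStore.["state"].Equals("core") then FCLexer.core lexbuf
        else FCLexer.tokenize lexbuf
    
    let aString (x : string) = 
        let lexbuf = LexBuffer<_>.FromString x
        // Start in the default ("fc") state so that tokenize runs first.
        lexbuf.BufferLocalStore.["state"] <- "fc"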
        let y = try (FCParser.PROG mytokenizer) lexbuf
    ...
    

    I also modified my fslex input file slightly, so that the rule records the switch into the core state:

    rule tokenize = parse
        | "core"        { lexbuf.BufferLocalStore.["state"] <- "core"; core lexbuf }
    ...
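
    The dispatch idea can be tried in isolation with a rough sketch that does not depend on fslex at all. Everything here is made up for illustration: WordBuffer stands in for LexBuffer, Token for the fsyacc token type, and the "end" keyword for whatever would switch back to the outer language (the setup above does not say how that happens). Only the shape of mytokenizer and the "state" key mirror the real code.

    ```fsharp
    open System.Collections.Generic

    // Hypothetical token type; a real project gets its tokens from fsyacc.
    type Token =
        | Outer of string
        | Core of string
        | Eof

    // Stand-in for LexBuffer: a list of words plus a per-buffer state store.
    type WordBuffer(words: string list) =
        let mutable rest = words
        member val Store = Dictionary<string, obj>() with get
        member _.Next() =
            match rest with
            | [] -> None
            | w :: tail ->
                rest <- tail
                Some w

    // Two mutually recursive "rules", like `rule tokenize ... and core ...`.
    let rec tokenize (buf: WordBuffer) =
        match buf.Next() with
        | None -> Eof
        | Some "core" ->
            buf.Store.["state"] <- box "core"   // remember the switch
            core buf
        | Some w -> Outer w

    and core (buf: WordBuffer) =
        match buf.Next() with
        | None -> Eof
        | Some "end" ->                         // assumed switch-back keyword
            buf.Store.["state"] <- box "fc"
            tokenize buf
        | Some w -> Core w

    // Same shape as mytokenizer above: pick the rule from the stored state.
    let mytokenizer (buf: WordBuffer) =
        if buf.Store.["state"].Equals("core") then core buf
        else tokenize buf

    // Pull tokens until Eof, the way a parser would call the tokenizer.
    let lexAll (input: string) =
        let buf = WordBuffer(input.Split(' ') |> List.ofArray)
        buf.Store.["state"] <- box "fc"
        let rec loop acc =
            match mytokenizer buf with
            | Eof -> List.rev acc
            | t -> loop (t :: acc)
        loop []

    let tokens = lexAll "a core x y end b"
    // tokens = [Outer "a"; Core "x"; Core "y"; Outer "b"]
    ```

    Without the state check, "y" would go back through tokenize and come out as Outer "y", which is exactly the behaviour described in the question.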
    

    Amazing how simply asking the question can lead you to the solution, and I hope this helps someone besides me :)