Menhir: --external-tokens can't seem to find Tokens module


I have a tokens.ml file containing a type token definition, and a tokens.mli with the same definition. My parser.mly uses the tokens from tokens.ml. I want to keep my tokens in tokens.ml/mli and my parser in parser.mly.

So, I tried compiling my parser using the command

menhir parser.mly --table --explain --external-tokens Tokens

This gives me an error saying one of my tokens does not exist. Specifically,

File "parser.mly", line 173, characters 4-12:
Error: OPERATOR is undefined.

So it seems Menhir is not finding the Tokens module, and I don't know how to make it visible to Menhir. I tried building a tokens.cma library, but I still get the same error.

Menhir doesn't seem to care if the module doesn't exist, because if I run the command

menhir parser.mly --table --explain --external-tokens SomeNonExistentModule

it still gives the same error about OPERATOR being undefined.

How do I get Menhir to find my tokens module? I would prefer not to use ocamlbuild. If you suggest an ocamlbuild solution, please at least explain the intermediate manual steps I could do instead. I want to understand what Menhir expects.


Solution

  • As discussed in the comments, the error arises because you do not have the

    %token OPERATOR
    

    declaration in your .mly file.

    Menhir's --external-tokens T option makes the generated parser module use T.token instead of generating the token type from the declarations. The %token declarations are still required inside the .mly file, though: Menhir does not read the module T itself, which is also why naming a nonexistent module gives the same error.
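
    To make the effect of the option more concrete, here is a rough OCaml sketch of what the generated interface looks like when --external-tokens Tokens is used. It is an illustration only: the stand-in Tokens module, the EOF constructor, and the start symbol name main (with result type unit) are assumptions, not taken from the question's grammar.

    ```ocaml
    (* Stand-in for the hand-written tokens.ml. OPERATOR comes from the error
       message in the question; EOF is a hypothetical second constructor. *)
    module Tokens = struct
      type token = OPERATOR | EOF
    end

    (* Approximate shape of parser.mli when Menhir is run with
       --external-tokens Tokens: no token type is defined here, and the
       entry point takes a lexer that produces Tokens.token.
       The name main and the result type unit are assumptions. *)
    module type PARSER = sig
      val main : (Lexing.lexbuf -> Tokens.token) -> Lexing.lexbuf -> unit
    end
    ```

    The module Tokens only has to exist when the generated parser.ml and parser.mli are compiled with the OCaml compiler, not when Menhir itself runs.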

    As a side note, you can keep your tokens in a separate .mly file (e.g. tokens.mly), which would look like this:

    tokens.mly:

    %token <int> INT
    %token EOF
    %%
    

    parser.mly:

    %start <int> f
    %%
    
    f : n = INT; EOF { n }
    

    and then you can run the following commands:

    menhir tokens.mly --only-tokens
    menhir parser.mly tokens.mly --external-tokens Tokens --base parser
    

    This can be useful if you want to reuse the same tokens across several parsers.

    (You can also skip the --only-tokens step and write tokens.ml by hand, as long as it stays consistent with tokens.mly.)
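
    For the two tokens declared in tokens.mly above, a hand-written tokens.ml could be as small as the following sketch. It is written by hand here rather than copied from Menhir's output, so treat it as an approximation of what --only-tokens would produce:

    ```ocaml
    (* tokens.ml, written by hand to mirror the declarations in tokens.mly *)
    type token =
      | INT of int  (* corresponds to: %token <int> INT *)
      | EOF         (* corresponds to: %token EOF *)
    ```

    If the constructors or their argument types drift away from the %token declarations, the mismatch should show up as a type error when the generated parser.ml is compiled against Tokens.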