ocaml pretty-print lexer ocamlyacc ocamllex

External definitions for ocamllex regular expressions


I have implemented the usual lexer/parser/pretty-printer combination for reading in and printing a type in my code. There is redundancy between the lexer and the pretty-printer when it comes to plain-string regular expressions, which are typically used for symbols, punctuation or separators.

For example, I now have

rule token = parse
  | "|-" { TURNSTILE }

in my lexer.mll file, and a function like:

let pp fmt (l,r) = 
  Format.fprintf fmt "@[%a |-@ %a@]" Form.pp l Form.pp r

for pretty-printing. If I decide to change the string for TURNSTILE, I have to edit two places in the code, which I find less than ideal.

ocamllex lets you name a regular expression with let and then refer to it by that name within the .mll file. So lexer.mll could be written as

let symb_turnstile = "|-"

rule token = parse
  | symb_turnstile { TURNSTILE }

But this does not let me access symb_turnstile externally, say from my pretty-printing functions. In fact, after running ocamllex, there are no occurrences of symb_turnstile in lexer.ml. I cannot even refer to these identifiers in the OCaml epilogue (the trailer section) of lexer.mll.

Is there any way of achieving this?


Solution

  • In the end, I went for the following style, which I stole from the sources of ocamllex itself (so I am guessing it's standard practice). A map from strings to tokens (here an association list) is defined in the preamble (the header section) of lexer.mll:

    let symbols =
      [ 
        ...
        (Symb.turnstile, TURNSTILE); 
        ...
      ]
    

    where Symb is a module defining turnstile as a string. Then, the lexing part of lexer.mll is purposely overly general:

    rule token = parse
      ...
      | punctuation
        {
          try 
            List.assoc (Lexing.lexeme lexbuf) symbols
          with Not_found -> lex_error lexbuf  
        }
      ...
    

    where punctuation is a regular expression matching a sequence of symbols.

    The pretty-printer can now be written like this.

    let pp fmt (l,r) = 
      Format.fprintf fmt "@[%a %s@ %a@]" Form.pp l Symb.turnstile Form.pp r
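
As a rough, self-contained sketch of how the pieces above fit together (without ocamllex itself): the Symb and Form modules, the token type, the comma entry, and the token_of_lexeme and show helpers are stand-ins assumed for illustration, not the code from my project. List.assoc_opt needs OCaml 4.05 or later.

```ocaml
(* Hypothetical stand-in for the Symb module: one definition per symbol. *)
module Symb = struct
  let turnstile = "|-"
  let comma = ","  (* illustrative extra symbol, not in the original *)
end

(* Hypothetical token type; in a real project it comes from the parser. *)
type token = TURNSTILE | COMMA

(* The association list that would live in the header of lexer.mll. *)
let symbols = [ (Symb.turnstile, TURNSTILE); (Symb.comma, COMMA) ]

(* What the lexer action does with the matched lexeme.
   Returns None where the real action would call lex_error. *)
let token_of_lexeme (s : string) : token option = List.assoc_opt s symbols

(* Stand-in for Form.pp: formulas are plain strings here. *)
module Form = struct
  let pp fmt (s : string) = Format.pp_print_string fmt s
end

(* The pretty-printer reuses the same string the lexer looks up. *)
let pp fmt (l, r) =
  Format.fprintf fmt "@[%a %s@ %a@]" Form.pp l Symb.turnstile Form.pp r

(* Convenience helper: render a sequent to a string. *)
let show seq = Format.asprintf "%a" pp seq
```

With these definitions, changing Symb.turnstile changes both what the lexer accepts and what the pretty-printer emits. One thing to watch: since punctuation matches a whole run of symbol characters, two adjacent symbols with no whitespace between them may be looked up as a single lexeme and rejected.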