
Case-insensitive tokens in ocamllex


Is there a way to define case-insensitive tokens in an ocamllex specification? I have already tried matching each letter in both cases, like this:

rule token = parse
    ...
   | ['C''c']['A''a']['S''s']['E''e'] { CASE }
    ...

but I am looking for another approach, if one exists.


Solution

  • Use an ordinary identifier rule that accepts both lower- and upper-case letters, then look up the lowercased lexeme in a keyword table:

    {
    type token = Case | Test | Ident of string
    
    let keyword_tbl = Hashtbl.create 64
    
    let _ = List.iter (fun (name, keyword) ->
        Hashtbl.add keyword_tbl name keyword) [
        "case", Case;
        "test", Test;
      ]
    }
    
    let ident_char = ['a'-'z' 'A'-'Z' '_']
    
    rule next_token = parse
      | ident_char+ as s {
          let canon = String.lowercase_ascii s in
          try Hashtbl.find keyword_tbl canon
          with Not_found ->
            (* `Ident canon` if you want case-insensitive vars as well
             * as keywords *)
            Ident s
        }
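
To try the lookup logic without running ocamllex, the same idea can be sketched in plain OCaml. This is a minimal sketch, not part of the original answer: the function name `classify` is hypothetical, and it assumes OCaml 4.05 or later for `Hashtbl.find_opt`. It takes a lexeme that the `ident_char+` rule would have matched and returns the corresponding token.

```ocaml
(* Standalone sketch of the case-insensitive keyword lookup.
   The token type and table mirror the lexer header above. *)
type token = Case | Test | Ident of string

let keyword_tbl : (string, token) Hashtbl.t = Hashtbl.create 64

let () =
  List.iter
    (fun (name, keyword) -> Hashtbl.add keyword_tbl name keyword)
    [ "case", Case; "test", Test ]

(* [classify] is a hypothetical helper name used only for illustration.
   Keywords are stored in lower case, so the lexeme is lowercased
   before the lookup; identifiers keep their original spelling. *)
let classify (s : string) : token =
  match Hashtbl.find_opt keyword_tbl (String.lowercase_ascii s) with
  | Some keyword -> keyword
  | None -> Ident s

let () =
  assert (classify "CASE" = Case);
  assert (classify "cAsE" = Case);
  assert (classify "Test" = Test);
  assert (classify "Foo" = Ident "Foo")
```

Because `String.lowercase_ascii` only folds ASCII letters, this approach matches the `ident_char` class above but would need a different normalization step for non-ASCII identifiers.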