ocaml ppx

OCaml specify path to ppx executable


I'm trying to figure out how to pass in the location of an executable to be run as a ppx filter for the OCaml compilers ocamlc/ocamlopt.

My questions are, basically

  • What format is a ppx filter expected to take as input?
  • What is it expected to produce?
  • In the case of cppo specifically, how do you configure it to accept the required format and emit the required format?
  • Why doesn't cat work as the "identity filter" and produce the same result using no filter at all?

For instance, here's a simple OCaml program.

(* foo.ml *)
let hi = ();;

Printf.printf "hi there\n"

Using an utterly trivial filter cat, we can see what sort of thing the filter is expected to handle as input.

$ ocamlopt -ppx cat foo.ml | cat -v
Caml1999M019M-^DM-^U??^@^@^@^G^@^@^@^A^@^@^@^C^@^@^@^B&foo.m
...
File "foo.ml", line 1:
Error: External preprocessor does not produce a valid file

It looks like some kind of binary format, maybe an AST representation?

cppo, with no options to configure it, processes textual OCaml source files when invoked with explicit files. I'm a little confused why it emits #line directives when invoked this way ... I believe these directives have meaning to a C compiler but not an OCaml compiler.

For example:

$ cppo foo.ml
# 1 "foo.ml"
let hi = ();;

Printf.printf "hi there\n"

And it works when invoked as a filter, supplying a file name of <stdin> to line directives.

$ cat foo.ml | cppo
# 1 "<stdin>"
let hi = ();;

Printf.printf "hi there\n"

Reading the --help for cppo, there's no mention of input and output formats or an argument like -ppx, which is somewhat disappointing.

What are you supposed to do to stitch all the pieces together?


Solution

  • The OCaml compiler supports two distinct families of preprocessors: textual preprocessors invoked with the -pp option, and (binary) AST preprocessors invoked with the -ppx option.

    A textual preprocessor takes the name of a textual source file as input and writes either an OCaml source file or an OCaml binary AST to stdout. For these, cat does implement the identity map: ocamlc -pp cat behaves like plain ocamlc. An important limitation of these preprocessors is that they cannot be chained together.

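    In contrast, a -ppx AST preprocessor takes two arguments: the name of the input binary AST file and the name of the output binary AST file. In this case the identity map can be implemented with cp: ocamlc -ppx cp behaves like plain ocamlc. cat, on the other hand, dumps the AST to stdout and never writes the output file, which is why the compiler reports that the preprocessor did not produce a valid file. Since both the input and the output of these preprocessors are binary ASTs, multiple ppx preprocessors can be chained together.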

    The cppo preprocessor belongs to the first family and needs to be invoked with -pp, for example ocamlopt -pp cppo foo.ml.
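
As a rough illustration of the -ppx calling convention described above, here is a minimal shell sketch of an identity filter. It assumes the compiler calls the filter with exactly two arguments (the input AST file, then the output AST file), as stated in the answer; the function name identity_ppx is made up for this example.

```shell
#!/bin/sh
# Sketch of a do-nothing -ppx filter (hypothetical helper, not part of OCaml).
# Assumed calling convention: FILTER <input-ast-file> <output-ast-file>

identity_ppx() {
    # Refuse anything other than the two expected file names.
    if [ "$#" -ne 2 ]; then
        echo "usage: identity_ppx INPUT_AST OUTPUT_AST" >&2
        return 2
    fi
    # Copy the binary AST through unchanged, exactly like cp would.
    cp "$1" "$2"
}

# When run as a script with arguments, act as the filter itself.
if [ "$#" -gt 0 ]; then
    identity_ppx "$@"
fi
```

Saved as an executable script, say ./identity_ppx.sh (a hypothetical name), it could be passed as ocamlopt -ppx ./identity_ppx.sh foo.ml and should then behave like ocamlopt -ppx cp foo.ml. A real ppx would read and rewrite the AST between the two files instead of copying it.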