Search code examples
haskellocamlocaml-core

Breakpoints in the argument-passing scheme of OCaml


Today, I was going through the source code of Jane Street's Core_kernel module and I came across the compose function:

(* The typical use case for these functions is to pass in functional arguments
   and get functions as a result. For this reason, we tell the compiler where
   to insert breakpoints in the argument-passing scheme. *)
let compose f g = (); fun x -> f (g x)

I would have defined the compose function as:

let compose f g x = f (g x)

The reason they give for defining compose the way they did is “because compose is a function which takes functions f and g as arguments and returns the function fun x -> f (g x) as a result, they defined compose the way they did to tell the compiler to insert a breakpoint after f and g but before x in the argument-passing scheme.”

So I have two questions:

  1. Why do we need breakpoints in the argument-passing scheme?
  2. What difference would it make if we defined compose the normal way?

Coming from Haskell, this convention doesn't make any sense to me.


Solution

  • This is an efficiency hack to avoid the cost of a partial application in the expected use case indicated in the comment.

    OCaml compiles curried functions into fixed-arity constructs, using a closure to partially apply them where necessary. This means that calls of that arity are efficient - there's no closure construction, just a function call.

    There will be a closure construction within compose for fun x -> f (g x), but this will be more efficient than the partial application. Closures generated by partial application go through a wrapper caml_curryN which exists to ensure that effects occur at the correct time (if that closure is itself partially applied).

    The fixed arity that the compiler chooses is based on a simple syntactic analysis - essentially, how many arguments are taken in a row without anything in between. The Jane St. programmers have used this to select the arity that they desire by injecting () "in between" arguments.

    In short, let compose f g x = f (g x) is a less desirable definition because it would result in the common two-argument case of compose f g being a more expensive partial application.

    Semantically, of course, there is no difference at all.