Search code examples
cocamlprogramming-languagesmutual-recursion

Why aren't there function headers in OCaml?


In certain programming languages, most notably in C, there are header files with function declarations. These function "headers" come before the code, and are required for situations with mutual recursion. When the function headers are placed in a header file, they also help with linking situations where multiple C files are compiled together.

My understanding of function headers within a C file is that they help with compilation because they define the typing of the function if it is called before it is defined. If this is wrong, I would be happy to stand corrected and better informed, but this is my understanding of it.

So what I don't understand, is why other languages - in this case I'm singling out OCaml - don't have function headers. In OCaml, there modules with signatures, but a signature does not allow your instantiations of functions to be mutually recursive, even though the typings are given. In order to have mutual recursion in OCaml, you need to use either the "and" keyword, or defined one function locally within the other function (to the best of my knowledge).

(* version 1 *)
let rec firstfun = function
  | 0 -> 1
  | x -> secondfun (x - 1)
and secondfun = function
  | 0 -> 1
  | x -> firstfun x

(* version 2 *)
let rec firstfun = function
  | 0 -> 1
  | x -> 
       let rec secondfun = function
         |0 -> 1
         | x -> firstfun x in
       secondfun (x - 1)

So why is this the case? Is it related to polymorphism? Is there a compilation bottleneck I'm not considering?


Solution

  • In fact, this is not about headers, as they are just pieces of code, that are copy-pasted into files using the preprocessor. The root of your question is why we need a full definition of an external value, in order to compile a unit. For example, in C, you can freely use any external function, given that you provided the prototype, and in older versions of C you don't even need the prototype.

    The reason why it works is because a compilation unit, produced by a C compiler, will contain unresolved symbols (e.g., functions or data values defined somewhere else), and it is the job of a linker to connect those symbols to their respective implementations.

    OCaml, underneath the hood, uses a C-like format for compilation units and is actually relying on a system C linker to produce an executable. Thus there are no practical or theoretical limitations so far.

    So what is the problem? As long, as you can provide a type of an external symbol, you can easily use it. And here is the main problem -- if there is a mutual dependency (on a type level) between two compilation units (i.e., files), the OCaml typechecker can't infer the type. This limitation is mostly technical, as the typechecker works on a compilation unit level. The reason for this limitation is mostly historical, as at the time when OCaml compiler was written the separate compilation was trendy, and thus a decision was made to apply the typechecker on a per file basis.

    To summarize, if you're able to provide types that OCaml type checker will accept, then you can easily induce a mutual recursion between your compilation units. To move the typechecker from your way, you can use different mechanisms, e.g., the Obj module, or external definitions. Of course, in that case, you're taking the whole responsibility on the soundness of your type definitions. Basically, the same as in C language.

    Since you can always produce a whole-program compilation system from a compiler, that supports the separate compilation, by just concatenating all modules in one big module, the alternative solution would be to use some sort of a preprocessor that will collect all the definitions, and copy them into a single file (i.e., one compilation unit), and then apply the compiler to it.