module ocaml circular-dependency modularity modular-design

Cyclic dependencies and modular design

I’ve been dealing with a design problem for quite a while now, where cyclic dependencies are the fundamental problem, and I’m having some problems resolving it elegantly. I’m coming from C, where cyclic dependencies are both possible and quite easily resolvable.

The following is a very simplified image of the files in the project which are of interest:

ast.ml (doesn’t actually have an interface, I’m not too keen on copying the whole type)

type loc = string * (int * int) * (int * int)
and id = string * loc
and decl = 
  | Decl_Func of decl_func
and decl_func = {
  df_Name: id;
  mutable df_SymTab: sym_tab option;
}
(* goes on for about 100 more types *)

symtab.mli

type t
type symbol =
  | Sym_Func of Ast.decl_func

val lookup_by_id: Ast.id -> symbol

(there are more files to be added in the future)

In C I'd simply make the symbol table a pointer, and forward declare it. Problem solved. This, unfortunately, isn't possible in OCaml.

Each of the implementations is quite large. Which means I absolutely do not want to make everything recursive modules, since that would mean the implementation file will be 10kloc or even more, with a ton of code which is not really related (beyond the big recursive type).

How would I solve this, while still maintaining a somewhat modular design?

Solution

You're not the first to have that problem and there are numerous different solutions depending on workflow, taste and needs.

Here is a good way to think about it.

1. Isolate the leaves of your AST

By leaves, I mean the types like loc or id that do not depend on any other type. They don't need to be in your recursive type definition and therefore shouldn't be.

Moreover, you'll probably have specific functions to handle locations and identifiers and having those function close to the type definition is good practice. So, you can create a ast_loc.ml and a ast_id.ml file with the appropriate definitions and basic functions.

This may seem like little, but it will actually help make your code clearer with the added bonus of lightening up ast.ml.

2. If need be, parameterize your types

Now, I do not recommend you use that extensively, as it tends to make code harder to read, as it has more indirections. Check this out:

type 't v = Thing of 't

(* potentially in a different later file *)
type t = Stuff of t v

By using a type parameter, you can delay the usage of recursivity in your type definition. Note that I do not recommend you use it for your whole AST as it will make maintaining a pain but if you have some middle nodes that behave quite independently of the rest, this may help.

These for instance, can be often used:

type 'a named = { id : id; v : 'a; }
type 'a located = { loc : loc; v: 'a; }

This method is particularly useful if it helps factorize your type definition. But, as I have already stated: don't abuse it! It is easy to do, but hard to maintain.

3. At some point, a big fat recursive definition is what you need

As of today, the Parsetree file of the OCaml compiler has 958 lines. That's what it's supposed to have. It is a complex tree structure and that should be visible.

Note that the file is just a type definition. Subsequent files contain the code to manipulate that definition (and usually don't introduce new types that are necessary outside their module).

In a way, I am a bit contradicting the point I made about loc and id arguing that you should separate type definition and code, but this is a different case: loc and id are simple types that can be manipulated independently. symbol only makes sense within your AST definition. Also, nothing keeps you from creating a symbol.ml file that manipulates that part of the AST without containing the type definition (comments are your friends, Merlin is a must).

Also, recursive functors is not something I'd advise unless you really need them.