How to decide whether to parameterize on the type-level or the module-level when designing modules?

I'm working towards a deep understanding of ML-style modules: I think the concept is important and I love the kind of thinking they encourage. I am just now discovering the tension that can arise between parametric types and parametric modules. I am seeking tools for thinking about the matter that will help me make smart design decisions as I build up my programs.

Fist I will try to describe my question in general. Then I will provide a concrete example from a learning project I am working on. Finally, I will revisit the general question in order to draw it to a point.

(I'm sorry that I don't yet know enough to pose this question more succinctly.)

In general terms, the tension I've discovered is this: functions are most flexible, and open to the widest reuse, when we provide them with parametric type signatures (where appropriate). However, modules are most flexible and open to the widest reuse when we seal the parameterization of functions inside the module, and instead parameterize the entire module on a given type.

A ready example of this difference can be found in comparing modules that implement the LIST signature with those that implement ORD_SET. A module List:LIST provides a bunch of useful functions, parameterized on any type. Once we've defined or loaded a List module, we can readily apply any of the functions it provides to construct, manipulate, or examine lists of any type. E.g., if we are working with both strings and integers, we can use one and the same module to construct and manipulate values of both types:

val strList = List.@ (["a","b"], ["c","d"])
val intList = List.@ ([1,2,3,4], [5,6,7,8])

On the other hand, if we want to deal with ordered sets, matters are different: ordered sets require that an ordering relation hold over all of their elements, and there can be no single concrete function compare : 'a * 'a -> order producing that relation for every type. Consequently, we require a different module satisfying the ORD_SET signature for each type we wish to put into ordered sets. Thus, in order to construct or manipulate ordered sets of strings and integers, we must implement different modules for each type[1]:

structure IntOrdSet = BinarySetFn ( type ord_key = int
                                    val compare = Int.compare )
structure StrOrdSet = BinarySetFn ( type ord_key = string
                                    val compare = String.compare )

And we must then use the fitting function from the appropriate module when we wish to operate on a given type:

val strSet = StrOrdSet.fromList ["a","b","c"]
val intSet = IntOrdSet.fromList [1,2,3,4,5,6]

There is a pretty straightforward tradeoff here: LIST modules provide functions that range over any type you please, but they cannot take advantage of any relations that hold between the values of any particular type; ORD_SET modules provide functions that are necessarily constrained to the type supplied in the functors parameter, but through that same parameterization, they are capable of incorporating specific information about the internal structure and relations of their target types.

It is easy to imagine cases where we would want to design an alternative family of list modules, using functors to parameterize types and other values to provide list-like data types with a more complicated structure: e.g., to specify a data type for ordered list, or to represent lists using self-balancing binary search trees.

When creating a module, I think it is also fairly easy to recognize when it will be able to provided polymorphic functions and when it will need to be parameterized on some type(s). What seems more difficult, to me, is figuring out which kind of modules you should depend on when working on something further down stream.

In general terms, my question is this: when I am designing a system of variously related modules, how can I figure out whether to design around modules providing polymorphic functions or modules generated using functors parameterized on types and values?

I hope to illustrate the dilemma and why it matters with the following example, taken from a toy project I am working on.

I have a functor PostFix (ST:STACK) : CALCULATOR_SYNTAX. This takes an implementation of a stack data structure and produces a parser that reads concrete postfix ("reverse polish") notation into an abstract syntax (to be evaluated by a calculator module down stream), and vice-versa. Now, I had been using a standard stack interface that provides a polymorphic stack type and number of functions to operate on it:

signature STACK =
sig
    type 'a stack
    exception EmptyStack

    val empty : 'a stack
    val isEmpty : 'a stack -> bool

    val push : ('a * 'a stack) -> 'a stack
    val pop  : 'a stack -> 'a stack
    val top  : 'a stack -> 'a
    val popTop : 'a stack -> 'a stack * 'a
end

This works fine, and gives me some flexibility, as I can use a list-based stack or vector-based stack, or whatever. But, say I want to add a simple logging function to the stack module, so that every time an element is pushed to, or popped from, the stack, it prints out the current state of the stack. Now I will need a fun toString : 'a -> string for the type collected by the stack, and this, as I understand, cannot be incorporated into the STACK module. Now I need to seal the type into the module, and parameterize the module over the type collected in the stack and a toString function that will let me produce a printable representation of the collected type. So I need something like

functor StackFn (type t
                 val toString: t -> string ) =
struct
   ...
end

and this will not produce a module matching the STACK signature, since it doesn't provide a polymorphic type. Thus, I must change the signature required for the PostFix functor. If I have a lot of other modules, I have to change all of them as well. That could be inconvenient, but the real problem is that I can no longer use my simple list-based, or vector-based STACK modules in the PostFix functor when I don't want logging. Now, it seems, I have to go back and rewrite those modules to have a sealed type as well.

So to return to, expand upon, and (mercifully) finish my question:

Is there some way of specifying the signature of the modules produced by StackFn so that they'll end up as "special cases" of STACK?
Alternatively, is there a way of writing a signature for the PostFix module that would allow for both the modules produced by StackFn and those that satisfy STACK?
Generally speaking, is there a way of thinking about the relation between modules that would help me catch/anticipate this kind of thing in the future?

(If you've read this far. Thank you very much!)

Solution

As you discovered, there is a tension between parametric polymorphism and functors/module in SML and OCaml. This is mostly due to the "two language" nature of modules and to the lack of ad hoc polymorphism. 1ML and modular implicits both provide different solutions to that problem. The first by unifying back the two kind of parametrism, the later by allowing to sparkle some ad-hoc polymorphism when needed.

Back to practical considerations. With functors, it's fairly easy (but verbose/annoying) to monomorphise a given datastructure. Here is an example (in OCaml). With this, you can still write generic implementations and specialize them later on (by providing a printing function).

module type POLYSTACK = sig
  type 'a stack
  exception EmptyStack

  val empty : 'a stack
  val isEmpty : 'a stack -> bool

  val push : ('a * 'a stack) -> 'a stack
  val pop  : 'a stack -> 'a stack
  val top  : 'a stack -> 'a
  val popTop : 'a stack -> 'a stack * 'a
  val toString : ('a -> string) -> 'a stack -> string
end

module type STACK = sig
  type elt
  type t
  exception EmptyStack

  val empty : t
  val isEmpty : t -> bool

  val push : (elt * t) -> t
  val pop  : t -> t
  val top  : t -> elt
  val popTop : t -> t * elt
  val toString : t -> string
end

module type PRINTABLE = sig
  type t
  val toString : t -> string
end

module Make (E : PRINTABLE) (S : POLYSTACK)
  : STACK with type elt = E.t and type t = E.t S.stack
= struct
  type elt = E.t
  type t = E.t S.stack
  include S
  let toString = S.toString E.toString
end

module AddLogging (S : STACK)
  : STACK with type elt = S.elt and type t = S.t
= struct
  include S
  let push (x, s) =
    let s' = S.push (x, s) in print_string (toString s') ; s'
end