I'm working towards a deep understanding of ML-style modules: I think the concept is important and I love the kind of thinking they encourage. I am just now discovering the tension that can arise between parametric types and parametric modules. I am seeking tools for thinking about the matter that will help me make smart design decisions as I build up my programs.
Fist I will try to describe my question in general. Then I will provide a concrete example from a learning project I am working on. Finally, I will revisit the general question in order to draw it to a point.
(I'm sorry that I don't yet know enough to pose this question more succinctly.)
In general terms, the tension I've discovered is this: functions are most flexible, and open to the widest reuse, when we provide them with parametric type signatures (where appropriate). However, modules are most flexible and open to the widest reuse when we seal the parameterization of functions inside the module, and instead parameterize the entire module on a given type.
A ready example of this difference can be found in comparing modules that
implement the LIST
signature with those that implement
ORD_SET
. A module List:LIST
provides a bunch of useful functions,
parameterized on any type. Once we've defined or loaded a List
module, we can
readily apply any of the functions it provides to construct, manipulate, or
examine lists of any type. E.g., if we are working with both strings and
integers, we can use one and the same module to construct and manipulate
values of both types:
val strList = List.@ (["a","b"], ["c","d"])
val intList = List.@ ([1,2,3,4], [5,6,7,8])
On the other hand, if we want to deal with ordered sets, matters are different:
ordered sets require that an ordering relation hold over all of their elements,
and there can be no single concrete function compare : 'a * 'a -> order
producing that relation for every type. Consequently, we require a different
module satisfying the ORD_SET
signature for each type we wish to put into
ordered sets. Thus, in order to construct or manipulate ordered sets of strings
and integers, we must implement different modules for each type[1]:
structure IntOrdSet = BinarySetFn ( type ord_key = int
val compare = Int.compare )
structure StrOrdSet = BinarySetFn ( type ord_key = string
val compare = String.compare )
And we must then use the fitting function from the appropriate module when we wish to operate on a given type:
val strSet = StrOrdSet.fromList ["a","b","c"]
val intSet = IntOrdSet.fromList [1,2,3,4,5,6]
There is a pretty straightforward tradeoff here: LIST
modules provide functions
that range over any type you please, but they cannot take advantage of any relations
that hold between the values of any particular type; ORD_SET
modules provide
functions that are necessarily constrained to the type supplied in the functors
parameter, but through that same parameterization, they are capable of
incorporating specific information about the internal structure and relations of
their target types.
It is easy to imagine cases where we would want to design an alternative family of list modules, using functors to parameterize types and other values to provide list-like data types with a more complicated structure: e.g., to specify a data type for ordered list, or to represent lists using self-balancing binary search trees.
When creating a module, I think it is also fairly easy to recognize when it will be able to provided polymorphic functions and when it will need to be parameterized on some type(s). What seems more difficult, to me, is figuring out which kind of modules you should depend on when working on something further down stream.
In general terms, my question is this: when I am designing a system of variously related modules, how can I figure out whether to design around modules providing polymorphic functions or modules generated using functors parameterized on types and values?
I hope to illustrate the dilemma and why it matters with the following example, taken from a toy project I am working on.
I have a functor PostFix (ST:STACK) : CALCULATOR_SYNTAX
. This takes an
implementation of a stack data structure and produces a parser that reads
concrete postfix ("reverse polish") notation into an abstract syntax (to be
evaluated by a calculator module down stream), and vice-versa. Now, I had been
using a standard stack interface that provides a polymorphic stack type and
number of functions to operate on it:
signature STACK =
sig
type 'a stack
exception EmptyStack
val empty : 'a stack
val isEmpty : 'a stack -> bool
val push : ('a * 'a stack) -> 'a stack
val pop : 'a stack -> 'a stack
val top : 'a stack -> 'a
val popTop : 'a stack -> 'a stack * 'a
end
This works fine, and gives me some flexibility, as I can use a list-based stack
or vector-based stack, or whatever. But, say I want to add a simple logging
function to the stack module, so that every time an element is pushed to, or
popped from, the stack, it prints out the current state of the stack. Now I will
need a fun toString : 'a -> string
for the type collected by the stack, and
this, as I understand, cannot be incorporated into the STACK
module. Now I
need to seal the type into the module, and parameterize the module over the type
collected in the stack and a toString
function that will let me produce a
printable representation of the collected type. So I need something like
functor StackFn (type t
val toString: t -> string ) =
struct
...
end
and this will not produce a module matching the STACK
signature, since it
doesn't provide a polymorphic type. Thus, I must change the signature required
for the PostFix
functor. If I have a lot of other modules, I have to change
all of them as well. That could be inconvenient, but the real problem is that I
can no longer use my simple list-based, or vector-based STACK
modules in the
PostFix
functor when I don't want logging. Now, it seems, I have to go back
and rewrite those modules to have a sealed type as well.
So to return to, expand upon, and (mercifully) finish my question:
StackFn
so that they'll end up as "special cases" of STACK
?PostFix
module
that would allow for both the modules produced by StackFn
and those that
satisfy STACK
?(If you've read this far. Thank you very much!)
As you discovered, there is a tension between parametric polymorphism and functors/module in SML and OCaml. This is mostly due to the "two language" nature of modules and to the lack of ad hoc polymorphism. 1ML and modular implicits both provide different solutions to that problem. The first by unifying back the two kind of parametrism, the later by allowing to sparkle some ad-hoc polymorphism when needed.
Back to practical considerations. With functors, it's fairly easy (but verbose/annoying) to monomorphise a given datastructure. Here is an example (in OCaml). With this, you can still write generic implementations and specialize them later on (by providing a printing function).
module type POLYSTACK = sig
type 'a stack
exception EmptyStack
val empty : 'a stack
val isEmpty : 'a stack -> bool
val push : ('a * 'a stack) -> 'a stack
val pop : 'a stack -> 'a stack
val top : 'a stack -> 'a
val popTop : 'a stack -> 'a stack * 'a
val toString : ('a -> string) -> 'a stack -> string
end
module type STACK = sig
type elt
type t
exception EmptyStack
val empty : t
val isEmpty : t -> bool
val push : (elt * t) -> t
val pop : t -> t
val top : t -> elt
val popTop : t -> t * elt
val toString : t -> string
end
module type PRINTABLE = sig
type t
val toString : t -> string
end
module Make (E : PRINTABLE) (S : POLYSTACK)
: STACK with type elt = E.t and type t = E.t S.stack
= struct
type elt = E.t
type t = E.t S.stack
include S
let toString = S.toString E.toString
end
module AddLogging (S : STACK)
: STACK with type elt = S.elt and type t = S.t
= struct
include S
let push (x, s) =
let s' = S.push (x, s) in print_string (toString s') ; s'
end