What is the type of the empty list in ML?

I was wondering what exactly the type of the empty list is in SML (I'm using PolyML)? When I type [] into the interpreter, I get out:

val it = []: 'a list

which is what I would expect. But then, if I were to type in say:

fun f a = if a = 0 then [] else [[]];

then I get:

val f = fn: int -> 'a list list

which would seem to imply that [] has the type 'a list list also. How exactly does this work?

Solution

In ML terminology, a value has a type scheme, not a type. A type scheme is a collection of types. It's a type with wildcards, and the wildcards can be replaced by any type. Replacing the wildcards by types is called instantiating the type scheme.

In 'a list, 'a is a type variable, i.e. a wildcard. The empty list [] can be used at any type that is an instance of the type scheme 'a list. That is, for any type T, the empty list has the type T list. For example, the empty list has the type int list and the type bool list and the type (int * int * int) list and the type int list list and so on. All of these types are instances of the type scheme 'a list.

When a variable occurs multiple times, it must be replaced consistently. For example fn x => x (the identity function) has the type scheme 'a -> 'a; it has the type int -> int and the type bool -> bool but not the type int -> bool. A value with the type scheme 'a -> 'b would have the type int -> bool as well.

A value can have multiple types — this is a common phenomenon in programming languages. It is a property of the core ML language that the set of types of a well-typed value is exactly the set of instances of a type scheme. This property is called principality: the type system of ML is principal. Concrete implementations of ML tend to have exceptions to principality; for example SML has an exception due to the overloading of operators (fn x => x + x has the two types int -> int and float -> float, and the overloading resolution rules cause the compiler to decide on int -> int if the context does not impose float).

A value whose principal type scheme contains at least one variable is said to be polymorphic. A value whose principal type scheme does not contain variables is said to be monomorphic.

A wildcard can be replaced by a type scheme. This yields another type scheme which is a subset of the original collection, i.e. the type scheme is a smaller type scheme (a refinement) of the original. For example the value [[]] has the type scheme 'a list list: it has the type T list list for any type T. [[]] has the type int list list and the type bool list list and the type (int * int * int) list list etc.

In literature about ML, “type” is often used interchangeably with “type scheme”, so it's usual to say that “the empty list has the type 'a list”. In literature about type systems, “type” often means what ML terminology calls “type scheme”, and other terms such as “ground type” are used for a type without variables.