Search code examples
haskelltype-systemscategory-theorycoinduction

Are codatatypes really terminal algebras?


(Disclaimer: I'm not 100% sure how codatatype works, especially when not referring to terminal algebras).

Consider the "category of types", something like Hask but with whatever adjustment that fits the discussion. Within such a category, it is said that (1) the initial algebras define datatypes, and (2) terminal algebras define codatatypes.

I'm struggling to convince myself of (2).

Consider the functor T(t) = 1 + a * t. I agree that the initial T-algebra is well-defined and indeed defines [a], the list of a. By definition, the initial T-algebra is a type X together with a function f :: 1+a*X -> X, such that for any other type Y and function g :: 1+a*Y -> Y, there is exactly one function m :: X -> Y such that m . f = g . T(m) (where . denotes the function combination operator as in Haskell). With f interpreted as the list constructor(s), g the initial value and the step function, and T(m) the recursion operation, the equation essentially asserts the unique existance of the function m given any initial value and any step function defined in g, which necessitates an underlying well-behaved fold together with the underlying type, the list of a.

For example, g :: Unit + (a, Nat) -> Nat could be () -> 0 | (_,n) -> n+1, in which case m defines the length function, or g could be () -> 0 | (_,n) -> 0, then m defines a constant zero function. An important fact here is that, for whatever g, m can always be uniquely defined, just as fold does not impose any contraint on its arguments and always produce a unique well-defined result.

This does not seem to hold for terminal algebras.

Consider the same functor T defined above. The definition of the terminal T-algebra is the same as the initial one, except that m is now of type X -> Y and the equation now becomes m . g = f . T(m). It is said that this should define a potentially infinite list.

I agree that this is sometimes true. For example, when g :: Unit + (Unit, Int) -> Int is defined as () -> 0 | (_,n) -> n+1 like before, m then behaves such that m(0) = () and m(n+1) = Cons () m(n). For non-negative n, m(n) should be a finite list of units. For any negative n, m(n) should be of infinite length. It can be verified that the equation above holds for such g and m.

With any of the two following modified definition of g, however, I don't see any well-defined m anymore.

First, when g is again () -> 0 | (_,n) -> n+1 but is of type g :: Unit + (Bool, Int) -> Int, m must satisfy that m(g((b,i))) = Cons b m(g(i)), which means that the result depends on b. But this is impossible, because m(g((b,i))) is really just m(i+1) which has no mentioning of b whatsoever, so the equation is not well-defined.

Second, when g is again of type g :: Unit + (Unit, Int) -> Int but is defined as the constant zero function g _ = 0, m must satisfy that m(g(())) = Nil and m(g(((),i))) = Cons () m(g(i)), which are contradictory because their left hand sides are the same, both being m(0), while the right hand sides are never the same.

In summary, there are T-algebras that have no morphism into the supposed terminal T-algebra, which implies that the terminal T-algebra does not exist. The theoretical modeling of the codatatype Stream (or infinite list), if any, cannot be based on the nonexistant terminal algebra of the functor T(t) = 1 + a * t.

Many thanks to any hint of any flaw in the story above.


Solution

  • (2) terminal algebras define codatatypes.

    This is not right, codatatypes are terminal coalgebras. For your T functor, a coalgebra is a type x together with f :: x -> T x. A T-coalgebra morphism between (x1, f1) and (x2, f2) is a g :: x1 -> x2 such that fmap g . f1 = f2 . g. Using this definition, the terminal T-algebra defines the possibly infinite lists (so-called "colists"), and the terminality is witnessed by the unfold function:

    unfold :: (x -> Unit + (a, x)) -> x -> Colist a
    

    Note though that a terminal T-algebra does exist: it is simply the Unit type together with the constant function T Unit -> Unit (and this works as a terminal algebra for any T). But this is not very interesting for writing programs.