Search code examples
haskellfunctional-programmingiohigher-order-functionspurely-functional

How to view higher-order functions and IO-actions from a mathematical perspective?


I am trying to understand functional programming from first principles, yet I am stuck on the interface between the pure functional world and the impure real world that has state and side effects. From a mathematical perspective,

  • what is a function that returns a function?
  • what is a function that returns an IO action (like Haskell's IO type)?

To elaborate: In my understanding, a pure function is a map from domain to co-domain. Ultimately, it is a map from some values in computer memory to some other values in memory. In a functional language, functions are defined declaratively; i.e., they describe the mapping but not the actual computation that needs to be performed on a specific input value; the latter is up to the compiler to derive. In a simplified setting with memory to spare, there would be no computation at runtime; instead, the compiler could create a lookup table for each function already at compile time. Executing a pure program would amount to table lookup. Composing functions thus amounts to building higher-dimensional lookup tables. Of course, the entire point in having computers is to devise ways to specify functions without the need for point-wise table lookup - but I find the mental model helpful to distinguish between pure functions and effects. However, I have difficulty adapting this mental model for higher-order functions:

  • For a function that takes another function as argument, what is the resulting first-order function that maps values to values? Is there a mathematical description for it (I'm sure there is, but I am neither a mathematician nor a computer scientist).
  • How about a function that returns a function? How can I mentally "flatten" this construct to again get a first-order function that maps values to values?

Now to the nasty real world. Interaction with it is not pure, yet without it, there are no sensible programs. In my simplified mental model above, separating pure and impure parts of a program means that the basis of each functional program is a layer of imperative statements that get data from the real world, apply a pure function to it (do table lookup), and then write the result back to the real world (to disk, to the screen, the network, etc.).

In Haskell, this imperative interaction with the real world is abstracted as IO actions, which the compiler sequences according to their data dependency. However, we do not write a program directly as a sequence of imperative IO actions. Instead, there are functions that return IO actions (functions of type :: IO a). But to my understanding, these cannot be real functions. What are they? How to best think about them in terms of the mental model outlined above?


Solution

  • Mathematically, there's no problem at all with functions that take or return other functions. The standard set-theory definition of a function from set S to set T is just:

    f ∈ S → T means that f ⊂ S ✕ T and two conditions hold:

    1. If s ∈ S, then (s, t) ∈ f for some t, and
    2. if both (s, t) ∈ f and (s, t') ∈ f, then t = t'.

    We write f(s) = t as a convenient notational shorthand for (s, t) ∈ f.

    So writing S → T just denotes a specific set, and therefore (A → B) → C and A → (B → C) are again just specific sets.

    Of course, for efficiency, we do not represent functions internally in memory as the set of input-output pairs like this, but this is a decent first approximation that you can use if you want a mathematical intuition. (The second approximation takes a lot more work to set up properly, because it uses structures you probably haven't already experienced very much to deal with laziness and recursion in a careful, principled way.)

    IO actions are a bit trickier. How you want to think of them may depend a bit on your particular mathematical bent.

    One persuasion of mathematician might like to define IO actions as an inductive set, something like:

    • If x :: a, then pure x :: IO a.
    • If f :: a -> b, then fmap f :: IO a -> IO b.
    • If x :: IO a and f :: a -> IO b, then x >>= f :: IO b.
    • putStrLn :: String -> IO ()
    • forkIO :: IO a -> IO ThreadId
    • ...and a thousand other base cases.
    • We quotient over a few equalities:
      • fmap id = id
      • fmap f . fmap g = fmap (f . g)
      • pure x >>= f = f x
      • x >>= pure . f = fmap f x
      • (and a slightly complicated-to-read one that just says that >>= is associative)

    In terms of defining the meaning of a program, that's enough to specify what "values" the IO family of types can hold. You might recognize this style of definition from the standard way of defining natural numbers:

    • Zero is a natural number.
    • If n is a natural number, then Succ(n) is a natural number.

    Of course, there are some things about this way of defining things that aren't super satisfying. Like: what does any particular IO action mean? Nothing in this definition says anything about that. (Though see "Tackling the Awkward Squad" for an elucidation of how you could say what an IO action means even if you take this kind of inductive definition of the type.)

    Another kind of mathematician might like this kind of definition better:

    An IO action is isomorphic to a stateful function on a phantom token representing the current state of the universe:

    IO a ~= RealWorld -> (RealWorld, a)
    

    There are attractions to this kind of definition, too; though, notably, it gets a lot harder to say what the heck forkIO does with that kind of definition.

    ...or you could take the GHC definition, in which case an IO a is secretly an a if you dig under the covers enough. But, shhh!!, don't tell the inexperienced programmers who just want to escape IO and write an IO a -> a function because they don't understand how to program using the IO interface yet!