haskell, exception, io-monad, alternative-functor

Why is there a difference between throw and throwIO?


I am trying to get a firm grasp of exceptions, so that I can improve my conditional loop implementation. To this end, I am staging various experiments, throwing stuff and seeing what gets caught.

This one surprises me to no end:

% cat X.hs
module Main where

import Control.Exception
import Control.Applicative

main = do
    throw (userError "I am an IO error.") <|> print "Odd error ignored."
% ghc X.hs && ./X
...
X: user error (I am an IO error.)
% cat Y.hs
module Main where

import Control.Exception
import Control.Applicative

main = do
    throwIO (userError "I am an IO error.") <|> print "Odd error ignored."
% ghc Y.hs && ./Y
...
"Odd error ignored."

I thought that the Alternative instance for IO should ignore exactly IO errors. (Not sure where I got this idea from, but I certainly could not offer a non-IO exception that would be ignored in an Alternative chain.) So I figured I could handcraft and deliver an IO error. It turns out that whether it gets ignored depends on the packaging as much as the contents: if I use throw on an IO error, it is somehow no longer an IO error.

I am completely lost. Why does it work this way? Is it intended? The definitions lead deep into the GHC internal modules; while I can more or less understand the meaning of disparate fragments of code by themselves, I am having a hard time seeing the whole picture.

Should one even use this Alternative instance if it is so difficult to predict? Would it not be better if it silenced any synchronous exception, not just some small subset of exceptions that are defined in a specific way and thrown in a specific way?


Solution

  • Say you have

    x :: Integer
    

    That means that x should be an integer, of course.

    x = throw _whatever
    

    What does that mean? It means that there was supposed to be an Integer, but instead there’s just a mistake.

    Now consider

    x :: IO ()
    

    That means x should be an I/O-performing program that returns no useful value. Remember, IO values are just values. They are values that just happen to represent imperative programs. So now consider

    x = throw _whatever
    

    That means that there was supposed to be an I/O-performing program there, but there is instead just a mistake. x is not a program that throws an error—there is no program. Regardless of whether you’ve used an IOError, x isn’t a valid IO program. When you try to execute the program

    x <|> _whatever
    

    You have to execute x to see whether it throws an error. But, you can’t execute x, because it’s not a program—it’s a mistake. Instead, everything explodes.

    This differs significantly from

    x = throwIO _whatever
    

    Now x is a valid program. It is a valid program that always happens to throw an error, but it’s still a valid program that can actually be executed. When you try to execute

    x <|> _whatever
    

    now, x is executed, the error produced is discarded, and _whatever is executed in its place. You can also think of there being a difference between computing a program/figuring out what to execute and actually executing it. throw throws the error while computing the program to execute (it is a "pure exception"), while throwIO throws it during execution (it is an "impure exception"). This also explains their types: throw returns any type because all types can be "computed", but throwIO is restricted to IO because only programs can be executed.
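The difference can be observed without crashing the process. The following is a minimal sketch, assuming the behaviour reported in the question; the names recovered, notRecovered and observe are made up for illustration, and try is used only to turn the outcome into a value.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (IOException, throw, throwIO, try)

-- Hypothetical name: the left operand is a valid program that raises an
-- IOError when executed, so <|> discards the error and runs the fallback.
recovered :: IO String
recovered = throwIO (userError "boom") <|> pure "fallback"

-- Hypothetical name: the left operand is a pure exception rather than a
-- program, so the error escapes <|> instead of being discarded.
notRecovered :: IO String
notRecovered = throw (userError "boom") <|> pure "fallback"

-- Hypothetical helper: try reports the outcome as an Either value.
observe :: IO String -> IO (Either IOException String)
observe = try
```

In GHCi, observe recovered should yield Right "fallback", while observe notRecovered should yield a Left carrying the user error, mirroring X.hs and Y.hs from the question.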

    This is further complicated by the fact that you can catch the pure exceptions that occur while executing IO programs. I believe this is a design compromise. From a theoretical perspective, you shouldn't be able to catch pure exceptions, because their presence should always be taken to indicate programmer error, but that can be rather embarrassing, because then you can only handle external errors, while programmer errors cause everything to blow up. If we were perfect programmers, that would be fine, but we aren't. Therefore, you are allowed to catch pure exceptions.

    {-# LANGUAGE ScopedTypeVariables #-}  -- needed for the (_ :: SomeException) pattern

    import Control.Applicative ((<|>))
    import Control.Exception (SomeException, catch)

    is :: [Int]
    is = []

    -- Fails, because the print causes a pure exception (not an IOError),
    -- so <|> does not recover from it.
    -- It was a programmer error to call head on is without checking that it,
    -- in fact, had a head in the first place.
    main1 :: IO ()
    main1 = print (head is) <|> putStrLn "Oops"
    -- throws exception

    -- catch creates a program that computes and executes the program
    -- print (head is) and catches both impure and pure exceptions.
    -- The program on the left is invalid, but wrapping it with catch
    -- makes it valid again.
    -- Really, that shouldn't happen, but this behavior is useful.
    main2 :: IO ()
    main2 = print (head is) `catch` (\(_ :: SomeException) -> putStrLn "Oops")
    -- prints "Oops"
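Another way to catch a pure exception is to force the value at a known point during execution. This is a minimal sketch using try and evaluate from Control.Exception; tryHead is a hypothetical helper, not something from the question.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical helper: evaluate forces head xs while the IO program is being
-- executed, so try can report the pure exception as a Left value.
tryHead :: [Int] -> IO (Either SomeException Int)
tryHead xs = try (evaluate (head xs))
```

Here tryHead [] should produce a Left, and tryHead [1, 2, 3] should produce Right 1. Matching on SomeException is deliberately broad for the sake of the demonstration; real code would usually name a narrower exception type.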