Search code examples
haskellghctypechecking

What cases do the GHC occurs check identify?


The GHC occurs check prevents you from constructing infinite types. Is its purpose to prevent common errors in code or to prevent the typechecker from looping indefinitely, or both?

What cases does it identify and is it possible for a malicious user to trick it (as in a Safe Haskell context) into looping? If the type system is Turing-complete (is it?) I don't understand how GHC can guarantee that the computation will halt.


Solution

  • Think of type inference as solving a system of equations. Let's look at an example:

    f x = (x,2)
    

    How can we deduce the type of f? We see that it is a function:

    f :: a -> b
    

    Additionally, from the structure of f we can see that the following equations hold simulatenously:

    b = (c,d)
    d = Int
    c = a
    

    By solving this system of equations we can see that the type of f is a -> (a, Int). Now let's look at the following (erroneous) function:

    f x = x : x
    

    The type of (:) is a -> [a] -> [a], so this generates the following system of equations (simplified):

    a = a
    a = [a]
    

    So we get an equation a = [a], from which we can conclude that this system of equations doesn't have a solution, and therefore the code is not well-typed. If we didn't reject the equation a = [a], we would indeed go in an infinite loop adding equations a = [a], a = [[a]], a = [[[a]]], etc to our system (alternatively, as Daniel notes in his answer, we could allow infinite types in our type system, but that would make erroneous programs such as f x = x : x to typecheck).

    You can also test this in ghci:

    > let f x = x : x
    
    <interactive>:2:15:
        Occurs check: cannot construct the infinite type: a0 = [a0]
        In the second argument of `(:)', namely `x'
        In the expression: x : x
        In an equation for `f': f x = x : x
    

    As to your other questions: GHC Haskell's type system is not Turing-complete and the typechecker is guaranteed to halt - unless you enable UndecidableInstances, in which case it theoretically can go in an infinite loop. GHC, however, ensures termination by having a fixed-depth recursion stack, so it's not possible to construct a program on which it never halts (edit: according to C.A.McCann's comment, it is possible after all - there's an analogue of tail recursion on the type level that allows to loop without increasing the stack depth).

    It is, however, possible to make compilation take arbitrarily long time, since the complexity of Hindley-Milner type inference is exponential in the worst (but not the average!) case:

    dup x = (x,x)
    
    bad = dup . dup . dup . dup . dup . dup . dup . dup
          . dup . dup . dup . dup . dup . dup . dup . dup
          . dup . dup . dup . dup . dup . dup . dup . dup
          . dup . dup . dup . dup . dup . dup . dup . dup
    

    Safe Haskell won't protect you from this - take a look at mueval if you want to allow potentially malicious users compile Haskell programs on your system.