Search code examples
haskelltypesdependent-typetheorem-proving

How to use a refutation to direct the type checker in Haskell?


Does filling the hole in the following program necessarily require non-constructive means? If yes, is it still the case if x :~: y decidable?

More generally, how do I use a refutation to guide the type checker?

(I am aware that I can work around the problem by defining Choose as a GADT, I'm asking specifically for type families)

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

module PropositionalDisequality where

import Data.Type.Equality
import Data.Void

type family Choose x y where
  Choose x x = 1
  Choose x _ = 2

lem :: (x :~: y -> Void) -> Choose x y :~: 2
lem refutation = _

Solution

  • If you try hard enough to implement a function, you can convince yourself that it is not possible. If you're not convinced, the argument can be made more formal: we enumerate programs exhaustively to find that none is possible. It turns out there's only half a dozen of meaningful cases to consider.

    I wonder why this argument is not made more often.

    Totally not accurate summary:

    • Act I: proof search is easy.
    • Act II: dependent types too.
    • Act III: Haskell is still fine for writing dependently typed programs.

    I. The proof search game

    First we define the search space.

    We can reduce any Haskell definition to one of the form

    lem = (exp)
    

    for some expression (exp). Now we only need to find a single expression.

    Look at all possible ways of making an expression in Haskell: https://www.haskell.org/onlinereport/haskell2010/haskellch3.html#x8-220003 (this doesn't account for extensions, exercise for the reader).

    It fits a page in a single column, so it's not that big to start with. Moreover, most of them are sugar for some form of function application or pattern-matching; we can also desugar away type classes with dictionary passing, so we're left with a ridiculously small lambda calculus:

    • lambdas, \x -> ...
    • pattern-matching, case ... of ...
    • function application, f x
    • constructors, C (including integer literals)
    • constants, c (for primitives that cannot be written in terms of the constructs above, so various built-ins (seq) and maybe FFI if that counts)
    • variables (bound by lambdas and cases)

    We can exclude every constant on the grounds that I think the question is really about pure lambda calculus (or the reader can enumerate the constants, to exclude black-magic constants like undefined, unsafeCoerce, unsafePerformIO that make everything collapse (any type is inhabited and, for some of those, the type system is unsound), and to be left with white-magic constants to which the present theoretical argument can be generalized via a well-funded thesis).

    We can also reasonably assume that we want a solution with no recursion involved (to get rid of noise like lem = lem, and fix if you felt like you couldn't part with it before), and which actually has a normal form, or preferably, a canonical form with respect to βη-equivalence. In other words, we refine and examine the set of possible solutions as follows.

    • lem :: _ -> _ has a function type, so we can assume WLOG that its definition starts with a lambda:

      -- Any solution
      lem = (exp)
      
      -- is η-equivalent to a lambda
      lem = \refutation -> (exp) refutation
      
      -- so let's assume we do have a lambda
      lem = \refutation -> _hole
      

    Now enumerate what could be under the lambda.

    • It could be a constructor, which then has to be Refl, but there is no proof that Choose x y ~ 2 in the context (here we could formalize and enumerate the type equalities the typechecker knows about and can derive, or make the syntax of coercions (proofs of equalities) explicit and keep playing this proof search game with them), so this doesn't type check:

      lem = \refutation -> Refl
      

      Maybe there is some way of constructing that equality proof, but then the expression would start with something else, which is going to be another case of the proof.

    • It could be some application of a constructor C x1 x2 ..., or the variable refutation (applied or not); but there's no possible way that's well-typed, it has to somehow produce a (:~:), and Refl is really the only way.

    • Or it could be a case. WLOG, there is no nested case on the left, nor any constructor, because the expression could be simplified in both cases:

      -- Any left-nested case expression
      case (case (e) of { C x1 x2 -> (f) }) { D y1 y2 -> (g) }
      
      -- is equivalent to a right-nested case
      case (e) of { C x1 x2 -> case (f) of { D y1 y2 -> (g) } }
      
      
      
      -- Any case expression with a nested constructor
      case (C u v) of { C x1 x2 -> f x1 x2 }
      
      -- reduces to
      f u v
      

      So the last subcase is the variable case:

      lem = \refutation -> case refutation (_hole :: x :~: y) of {}
      

      and we have to construct a x :~: y. We enumerate ways of filling the _hole again. It's either Refl, but no proof is available, or (skipping some steps) case refutation (_anotherHole :: x :~: y) of {}, and we have an infinite descent on our hands, which is also absurd. A different possible argument here is that we can pull out the case from the application, to remove this case from consideration WLOG.

      -- Any application to a case
      f (case e of C x1 x2 -> g x1 x2)
      
      -- is equivalent to a case with the application inside
      case e of C x1 x2 -> f (g x1 x2)
      

    There are no more cases. The search is complete, and we didn't find an implementation of (x :~: y -> Void) -> Choose x y :~: 2. QED.

    To read more on this topic, I guess a course/book about lambda calculus up until the normalization proof of the simply-typed lambda calculus should give you the basic tools to start with. The following thesis contains an introduction on the subject in its first part, but admittedly I'm a poor judge of the difficulty of such material: Which types have a unique inhabitant? Focusing on pure program equivalence, by Gabriel Scherer.

    Feel free to suggest more adequate resources and literature.


    II. Fixing the proposition and proving it with dependent types

    Your initial intuition that this should encode a valid proposition is definitely valid. How might we fix it to make it provable?

    Technically, the type we are looking at is quantified with forall:

    forall x y. (x :~: y -> Void) -> Choose x y :~: 2
    

    An important feature of forall is that it is an irrelevant quantifier. The variables it introduces cannot be used "directly" in a term of this type. Although that aspect becomes more prominent in the presence of dependent types, it still pervades Haskell today, providing another intuition for why this (and many other examples) is not "provable" in Haskell: if you think about why you think that proposition is valid, you will naturally start with a case split about whether x is equal to y, but to even do such a case split you need a way to decide which side you're on, which will of course have to look at x and y, so they cannot be irrelevant. forall in Haskell is not at all like what most people mean with "for all".

    Some discussion on the matter of relevance can be found in the thesis Dependent Types in Haskell, by Richard Eisenberg (in particular, Section 3.1.1.5 for an initial example, Section 4.3 for relevance in Dependent Haskell and Section 8.7 for comparison with other languages with dependent types).

    Dependent Haskell will need a relevant quantifier to complement forall, and which would get us closer to proving this:

    foreach x y. (x :~: y -> Void) -> Choose x y :~: 2
    

    Then we could probably write this:

    lem :: foreach x y. (x :~: y -> Void) -> Choose x y :~: 2
    lem x y p = case x ==? u of
      Left r -> absurd (p r)  -- x ~ y, that's absurd!
      Right Irrefl -> Refl    -- x /~ y, so Choose x y = 2
    

    That also assumes a first-class notion of disequality /~, complementing ~, to help Choose reduce when it is in the context and a decision function (==?) :: foreach x y. Either (x :~: y) (x :/~: y). Actually, that machinery isn't necessary, that just makes for a shorter answer.

    At this point I'm making stuff up because Dependent Haskell does not exist yet, but that is easily doable in related dependently typed languages (Coq, Agda, Idris, Lean), modulo an adequate replacement of the type family Choose (type families are in some sense too powerful to be translated as mere functions, so may be cheating, but I digress).

    Here is a comparable program in Coq, showing also that lem applied to 1 and 2 and a suitable proof does reduce to a proof by reflexivity of choose 1 2 = 2.

    https://gist.github.com/Lysxia/5a9b6996a3aae741688e7bf83903887b


    III. Without dependent types

    A critical source of difficulty here is that Choose is a closed type family with overlapping instances. It is problematic because there is no proper way to express the fact that x and y are not equal in Haskell, to know that the first clause Choose x x does not apply.

    A more fruitful avenue if you're into Pseudo-Dependent Haskell is to use a boolean type equality:

    -- In the base library
    import Data.Type.Bool (If)
    import Data.Type.Equality (type (==))
    
    type Choose x y = If (x == y) 1 2
    

    An alternative encoding of equality constraints becomes useful for this style:

    type x ~~ y = ((x == y) ~ 'True)
    type x /~ y = ((x == y) ~ 'False)
    

    with that, we can get another version of the type-proposition above, expressible in current Haskell (where SBool is the singleton type of Bool), which essentially can be read as adding the assumption that the equality of x and y is decidable. This does not contradict the earlier claim about "irrelevance" of forall, the function is inspecting a boolean (or rather an SBool), which postpones the inspection of x and y to whoever calls lem.

    lem :: forall x y. SBool (x == y) -> ((x ~~ y) => Void) -> Choose x y :~: 2
    lem decideEq p = case decideEq of
      STrue -> absurd p
      SFalse -> Refl