Search code examples
haskelltypestype-inferencealgebraic-data-typesgadt

Automatically deriving type class constraints using GADTs


I'm writing a library to deal with infinite sequences using lazy evaluation. For brevity I'm using Generalized Algebraic Data Types (GADTs) to assert an Ord constraint on the index of each term in the sequence. Hence the following typechecks:

{-# LANGUAGE GADTs #-}

data Term ix cff where
  Term :: (Ord ix) => ix -> cff -> Term ix cff

data Sequence ix cff where
  Seq :: [Term ix cff] -> Sequence ix cff

f (Seq (Term i x:_)) (Seq (Term j y:_)) = i < j

{-
add :: Sequence ix cff -> Sequence ix cff -> Sequence ix cff
add (Seq tms1) (Seq tms2) = Seq (addAlong (<) tms1 tms2)
    where addAlong :: (ix -> ix -> Bool) ->
                      [Term ix cff] -> [Term ix cff] -> [Term ix cff]
          addAlong ord (Term i x : tms1) (Term j y : tms2) = []
-}

As expected, GHCi tells me that f :: Sequence t t1 -> Sequence t t2 -> Bool. Doing the comparison i < j requires an Ord instance, but that is taken care of by the constraint (Ord ix) in the definition of Term.

However, when the lower block is uncommented, the add function fails to typecheck with the error No instance for (Ord ix) arising from the use of ``<''. Shouldn't it be able to figure out (Ord ix) from the Term ix cff that appears in the definition of Sequence?


Solution

  • Terms bound to a function argument are by default monomorphic (aka rank-0 polymorphic), because Haskell98 only supports rank-1 polymorphism, and a polymorphic argument makes a function rank-2 polymorphic.

    Hence, in Seq (addAlong (<) tms1 tms2), the compiler can only consider < as a monomorphic comparison on the rigid type ix. To consider < as a monomorphic function the compiler needs to resolve the Ord instance. However, at that point the Ord ix instance is not available, since it can only be matched out of a Term!

    The solution that's closest to your original code is to explicitly make addAlong rank-2 polymorphic:

    {-# LANGUAGE RankNTypes, UnicodeSyntax #-}
    
    add :: Sequence ix cff -> Sequence ix cff -> Sequence ix cff
    add (Seq tms1) (Seq tms2) = Seq (addAlong (<) tms1 tms2)
        where addAlong :: (∀ ix' . Ord ix' -> ix' -> Bool) ->
                          [Term ix cff] -> [Term ix cff] -> [Term ix cff]
              addAlong ord (Term i x : tms1) (Term j y : tms2) = []
    

    This way, < is simply passed as it is (as a polymorphic Ord => ... method), hence the compiler doesn't need to have the instance available at Seq (addAlong (<) tms1 tms2) but can resolve it later when you have Terms available.

    You should of course consider whether you really need that. Keeping an Ord dictionary in each Term seems rather wasteful to me – the problem wouldn't turn up if you kept the constraint in Seq instead:

    data Term ix cff where Term :: ix -> cff -> Term ix cff
    
    data Sequence ix cff where
      Seq :: Ord ix => [Term ix cff] -> Sequence ix cff
    
    add :: Sequence ix cff -> Sequence ix cff -> Sequence ix cff
    add (Seq tms1) (Seq tms2) = Seq (addAlong (<) tms1 tms2)
        where addAlong :: (ix -> ix -> Bool) ->
                          [Term ix cff] -> [Term ix cff] -> [Term ix cff]
              addAlong ord (Term i x : tms1) (Term j y : tms2) = []