Search code examples
haskelltypesmetaprogrammingtype-systemshindley-milner

Programmatic type annotations in Haskell


When metaprogramming, it may be useful (or necessary) to pass along to Haskell's type system information about types that's known to your program but not inferable in Hindley-Milner. Is there a library (or language extension, etc) that provides facilities for doing this—that is, programmatic type annotations—in Haskell?

Consider a situation where you're working with a heterogenous list (implemented using the Data.Dynamic library or existential quantification, say) and you want to filter the list down to a bog-standard, homogeneously typed Haskell list. You can write a function like

import Data.Dynamic
import Data.Typeable

dynListToList :: (Typeable a) => [Dynamic] -> [a]
dynListToList = (map fromJust) . (filter isJust) . (map fromDynamic)

and call it with a manual type annotation. For example,

foo :: [Int]
foo = dynListToList [ toDyn (1 :: Int)
                    , toDyn (2 :: Int)
                    , toDyn ("foo" :: String) ]

Here foo is the list [1, 2] :: [Int]; that works fine and you're back on solid ground where Haskell's type system can do its thing.

Now imagine you want to do much the same thing but (a) at the time you write the code you don't know what the type of the list produced by a call to dynListToList needs to be, yet (b) your program does contain the information necessary to figure this out, only (c) it's not in a form accessible to the type system.

For example, say you've randomly selected an item from your heterogenous list and you want to filter the list down by that type. Using the type-checking facilities supplied by Data.Typeable, your program has all the information it needs to do this, but as far as I can tell—this is the essence of the question—there's no way to pass it along to the type system. Here's some pseudo-Haskell that shows what I mean:

import Data.Dynamic
import Data.Typeable

randList :: (Typeable a) => [Dynamic] -> IO [a]
randList dl = do
    tr <- randItem $ map dynTypeRep dl
    return (dynListToList dl :: [<tr>])  -- This thing should have the type
                                         -- represented by `tr`

(Assume randItem selects a random item from a list.)

Without a type annotation on the argument of return, the compiler will tell you that it has an "ambiguous type" and ask you to provide one. But you can't provide a manual type annotation because the type is not known at write-time (and can vary); the type is known at run-time, however—albeit in a form the type system can't use (here, the type needed is represented by the value tr, a TypeRep—see Data.Typeable for details).

The pseudo-code :: [<tr>] is the magic I want to happen. Is there any way to provide the type system with type information programatically; that is, with type information contained in a value in your program?

Basically I'm looking for a function with (pseudo-) type ??? -> TypeRep -> a that takes a value of a type unknown to Haskell's type system and a TypeRep and says, "Trust me, compiler, I know what I'm doing. This thing has the value represented by this TypeRep." (Note that this is not what unsafeCoerce does.)

Or is there something completely different that gets me the same place? For example, I can imagine a language extension that permits assignment to type variables, like a souped-up version of the extension enabling scoped type variables.

(If this is impossible or highly impractical,—e.g., it requires packing a complete GHCi-like interpreter into the executable—please try to explain why.)


Solution

  • No, you can't do this. The long and short of it is that you're trying to write a dependently-typed function, and Haskell isn't a dependently typed language; you can't lift your TypeRep value to a true type, and so there's no way to write down the type of your desired function. To explain this in a little more detail, I'm first going to show why the way you've phrased the type of randList doesn't really make sense. Then, I'm going to explain why you can't do what you want. Finally, I'll briefly mention a couple thoughts on what to actually do.

    Existentials

    Your type signature for randList can't mean what you want it to mean. Remembering that all type variables in Haskell are universally quantified, it reads

    randList :: forall a. Typeable a => [Dynamic] -> IO [a]
    

    Thus, I'm entitled to call it as, say, randList dyns :: IO [Int] anywhere I want; I must be able to provide a return value for all a, not simply for some a. Thinking of this as a game, it's one where the caller can pick a, not the function itself. What you want to say (this isn't valid Haskell syntax, although you can translate it into valid Haskell by using an existential data type1) is something more like

    randList :: [Dynamic] -> (exists a. Typeable a => IO [a])
    

    This promises that the elements of the list are of some type a, which is an instance of Typeable, but not necessarily any such type. But even with this, you'll have two problems. First, even if you could construct such a list, what could you do with it? And second, it turns out that you can't even construct it in the first place.

    Since all that you know about the elements of the existential list is that they're instances of Typeable, what can you do with them? Looking at the documentation, we see that there are only two functions2 which take instances of Typeable:

    Thus, all that you know about the type of the elements in the list is that you can call typeOf and cast on them. Since we'll never be able to usefully do anything else with them, our existential might just as well be (again, not valid Haskell)

    randList :: [Dynamic] -> IO [(TypeRep, forall b. Typeable b => Maybe b)]
    

    This is what we get if we apply typeOf and cast to every element of our list, store the results, and throw away the now-useless existentially typed original value. Clearly, the TypeRep part of this list isn't useful. And the second half of the list isn't either. Since we're back to a universally-quantified type, the caller of randList is once again entitled to request that they get a Maybe Int, a Maybe Bool, or a Maybe b for any (typeable) b of their choosing. (In fact, they have slightly more power than before, since they can instantiate different elements of the list to different types.) But they can't figure out what type they're converting from unless they already know it—you've still lost the type information you were trying to keep.

    And even setting aside the fact that they're not useful, you simply can't construct the desired existential type here. The error arises when you try to return the existentially-typed list (return $ dynListToList dl). At what specific type are you calling dynListToList? Recall that dynListToList :: forall a. Typeable a => [Dynamic] -> [a]; thus, randList is responsible for picking which a dynListToList is going to use. But it doesn't know which a to pick; again, that's the source of the question! So the type that you're trying to return is underspecified, and thus ambiguous.3

    Dependent types

    OK, so what would make this existential useful (and possible)? Well, we actually have slightly more information: not only do we know there's some a, we have its TypeRep. So maybe we can package that up:

    randList :: [Dynamic] -> (exists a. Typeable a => IO (TypeRep,[a]))
    

    This isn't quite good enough, though; the TypeRep and the [a] aren't linked at all. And that's exactly what you're trying to express: some way to link the TypeRep and the a.

    Basically, your goal is to write something like

    toType :: TypeRep -> *
    

    Here, * is the kind of all types; if you haven't seen kinds before, they are to types what types are to values. * classifies types, * -> * classifies one-argument type constructors, etc. (For instance, Int :: *, Maybe :: * -> *, Either :: * -> * -> *, and Maybe Int :: *.)

    With this, you could write (once again, this code isn't valid Haskell; in fact, it really bears only a passing resemblance to Haskell, as there's no way you could write it or anything like it within Haskell's type system):

    randList :: [Dynamic] -> (exists (tr :: TypeRep).
                               Typeable (toType tr) => IO (tr, [toType tr]))
    randList dl = do
      tr <- randItem $ map dynTypeRep dl
      return (tr, dynListToList dl :: [toType tr])
        -- In fact, in an ideal world, the `:: [toType tr]` signature would be
        -- inferable.
    

    Now, you're promising the right thing: not that there exists some type which classifies the elements of the list, but that there exists some TypeRep such that its corresponding type classifies the elements of the list. If only you could do this, you would be set. But writing toType :: TypeRep -> * is completely impossible in Haskell: doing this requires a dependently-typed language, since toType tr is a type which depends on a value.

    What does this mean? In Haskell, it's perfectly acceptable for values to depend on other values; this is what a function is. The value head "abc", for instance, depends on the value "abc". Similarly, we have type constructors, so it's acceptable for types to depend on other types; consider Maybe Int, and how it depends on Int. We can even have values which depend on types! Consider id :: a -> a. This is really a family of functions: id_Int :: Int -> Int, id_Bool :: Bool -> Bool, etc. Which one we have depends on the type of a. (So really, id = \(a :: *) (x :: a) -> x; although we can't write this in Haskell, there are languages where we can.)

    Crucially, however, we can never have a type that depends on a value. We might want such a thing: imagine Vec 7 Int, the type of length-7 lists of integers. Here, Vec :: Nat -> * -> *: a type whose first argument must be a value of type Nat. But we can't write this sort of thing in Haskell.4 Languages which support this are called dependently-typed (and will let us write id as we did above); examples include Coq and Agda. (Such languages often double as proof assistants, and are generally used for research work as opposed to writing actual code. Dependent types are hard, and making them useful for everyday programming is an active area of research.)

    Thus, in Haskell, we can check everything about our types first, throw away all that information, and then compile something that refers only to values. In fact, this is exactly what GHC does; since we can never check types at run-time in Haskell, GHC erases all the types at compile-time without changing the program's run-time behavior. This is why unsafeCoerce is easy to implement (operationally) and completely unsafe: at run-time, it's a no-op, but it lies to the type system. Consequently, something like toType is completely impossible to implement in the Haskell type system.

    In fact, as you noticed, you can't even write down the desired type and use unsafeCoerce. For some problems, you can get away with this; we can write down the type for the function, but only implement it with by cheating. That's exactly how fromDynamic works. But as we saw above, there's not even a good type to give to this problem from within Haskell. The imaginary toType function allows you to give the program a type, but you can't even write down toType's type!

    What now?

    So, you can't do this. What should you do? My guess is that your overall architecture isn't ideal for Haskell, although I haven't seen it; Typeable and Dynamic don't actually show up that much in Haskell programs. (Perhaps you're "speaking Haskell with a Python accent", as they say.) If you only have a finite set of data types to deal with, you might be able to bundle things into a plain old algebraic data type instead:

    data MyType = MTInt Int | MTBool Bool | MTString String
    

    Then you can write isMTInt, and just use filter isMTInt, or filter (isSameMTAs randomMT).

    Although I don't know what it is, there's probably a way you could unsafeCoerce your way through this problem. But frankly, that's not a good idea unless you really, really, really, really, really, really know what you're doing. And even then, it's probably not. If you need unsafeCoerce, you'll know, it won't just be a convenience thing.

    I really agree with Daniel Wagner's comment: you're probably going to want to rethink your approach from scratch. Again, though, since I haven't seen your architecture, I can't say what that will mean. Maybe there's another Stack Overflow question in there, if you can distill out a concrete difficulty.


    1 That looks like the following:

    {-# LANGUAGE ExistentialQuantification #-}
    data TypeableList = forall a. Typeable a => TypeableList [a]
    randList :: [Dynamic] -> IO TypeableList
    

    However, since none of this code compiles anyway, I think writing it out with exists is clearer.

    2 Technically, there are some other functions which look relevant, such as toDyn :: Typeable a => a -> Dynamic and fromDyn :: Typeable a => Dynamic -> a -> a. However, Dynamic is more or less an existential wrapper around Typeables, relying on typeOf and TypeReps to know when to unsafeCoerce (GHC uses some implementation-specific types and unsafeCoerce, but you could do it this way, with the possible exception of dynApply/dynApp), so toDyn doesn't do anything new. And fromDyn doesn't really expect its argument of type a; it's just a wrapper around cast. These functions, and the other similar ones, don't provide any extra power that isn't available with just typeOf and cast. (For instance, going back to a Dynamic isn't very useful for your problem!)

    3 To see the error in action, you can try to compile the following complete Haskell program:

    {-# LANGUAGE ExistentialQuantification #-}
    import Data.Dynamic
    import Data.Typeable
    import Data.Maybe
    
    randItem :: [a] -> IO a
    randItem = return . head -- Good enough for a short and non-compiling example
    
    dynListToList :: Typeable a => [Dynamic] -> [a]
    dynListToList = mapMaybe fromDynamic
    
    data TypeableList = forall a. Typeable a => TypeableList [a]
    
    randList :: [Dynamic] -> IO TypeableList
    randList dl = do
      tr <- randItem $ map dynTypeRep dl
      return . TypeableList $ dynListToList dl -- Error!  Ambiguous type variable.
    

    Sure enough, if you try to compile this, you get the error:

    SO12273982.hs:17:27:
        Ambiguous type variable `a0' in the constraint:
          (Typeable a0) arising from a use of `dynListToList'
        Probable fix: add a type signature that fixes these type variable(s)
        In the second argument of `($)', namely `dynListToList dl'
        In a stmt of a 'do' block: return . TypeableList $ dynListToList dl
        In the expression:
          do { tr <- randItem $ map dynTypeRep dl;
               return . TypeableList $ dynListToList dl }
    

    But as is the entire point of the question, you can't "add a type signature that fixes these type variable(s)", because you don't know what type you want.

    4 Mostly. GHC 7.4 has support for lifting types to kinds and for kind polymorphism; see section 7.8, "Kind polymorphism and promotion", in the GHC 7.4 user manual. This doesn't make Haskell dependently typed—something like TypeRep -> * example is still out5—but you will be able to write Vec by using very expressive types that look like values.

    5 Technically, you could now write down something which looks like it has the desired type: type family ToType :: TypeRep -> *. However, this takes a type of the promoted kind TypeRep, and not a value of the type TypeRep; and besides, you still wouldn't be able to implement it. (At least I don't think so, and I can't see how you would—but I am not an expert in this.) But at this point, we're pretty far afield.