Search code examples
haskellfunctoralgebraic-data-types

Functor instance for a simple algebraic data type


I would like to use heterogeneous lists of lists. To this effect, I define a simple algebraic data type:

data T = LInt [Int]
       | LChar [Char]
       deriving (Eq, Ord, Show)

so I can now have something like this:

x = [LInt [1, 2, 3], LChar "test"]

My question: can this type be made an instance of Functor? This would be useful; for example to select elements of a list in x, as in

fmap (!!2) LChar "test"  => 's'

It seems to me that this is not possible. Aside from the motivation for the question, I believe the answer would help me understand Algebraic Data Types better.


Solution

  • Short Answer:

    No, it can't be made into a Functor.

    Long Answer:

    The first reason this can not be made into a function is that functors must have the kind * -> *. Kinds are like the types of types, you can even check them in GHCi using :kind <type>. For example:

    > :kind Int
    Int :: *
    > :kind []
    [] :: * -> *
    > :kind Num
    Num :: * -> Constraint
    > :kind Maybe
    Maybe :: * -> *
    > :kind Either
    Either :: * -> * -> *
    > :kind Functor
    Functor :: (* -> *) -> Constraint
    

    The * basically means a fully applied type, such as Int, Char, [String], etc, and something like * -> * means the type takes a single type of kind * to return a new type of kind *. Constraints also have kinds, namely that they return the kind Constraint when fully applied.

    Your type has kind *, which does not match * -> * required as the argument to Functor. In order for it to become a Functor it would need to accept a type variable. Adding a type variable here doesn't make much sense, but you could have

    data T a = LInt [a] | LChar [a]
    

    But this isn't very useful, we now can't enforce that LInt only contains Ints and LChar only contains Chars. Worse still, looking at the type of fmap we have

    class Functor f where
        fmap :: (a -> b) -> (f a -> f b)
    

    But what you're wanting to do is something like

    myfmap :: (a -> b) -> (f a -> b)
    

    Notice how the return type is b instead of f b. The fmap function only transforms values inside a container, it doesn't extract values from said container.

    It would be possible to write a parameterized type that is constrained using -XGADTs, though, you could write

    data T a where
        LInt :: [Int] -> T Int
        LChar :: [Char] -> T Char
    

    And this would guarantee the types are sane, but it's still not possible to make this into an instance of Functor (that satisfies the functor laws, that is) and it would prevent you from making a heterogeneous list (T Int /~ T Char).

    So it really looks like the Functor option is just right out. You might find it tempting to write a function like

    tmap :: ([a] -> b) -> T -> b
    tmap f (LInt x) = f x
    tmap f (LChar x) = f x
    

    But this won't work either. The type system sees that you're trying to say that f :: [Int] -> b and f :: [Char] -> b, which can't be unified. You can do this by enabling -XRankNTypes though:

    tmap :: (forall a. [a] -> b) -> T -> b
    tmap f (LInt x) = f x
    tmap f (LChar x) = f x
    

    And this does allow you to do something like

    > tmap length (LInt [1, 2, 3])
    3
    > tmap length (LChar "test")
    4
    

    But it won't let you do

    > tmap (!! 2) (LChar "test")
    Couldn't match type 'b' with 'a'
      because type variable 'a' would escape its scope
    This (rigid, skolem) type variable is bound by
      a type expected by the context: [a] -> b
    Expected type: [a] -> b
      Actual type: [a] -> a
    ...
    

    What this means is that the type a can't appear anywhere in the output type b, since the f passed in has to work for all a, it can't work for just any a.

    In conclusion, without having to dip even further into type system madness your type can't be made to do what you want it to. You're going to have to write specialized functions to handle each case individually, which is pretty much the point of ADTs. The compiler can ensure that you do actually handle each case, and so long as you stay away from functions that return undefined or call error, then your program will be safe. It may not be as flexible as you'd like, but it'll be rock solid in terms of safety.