Search code examples
haskellgadt

Union in GADT type variable


I have a GADT constructor representing the same idea, but I need two of them because the type is flexible depending on context. See this contrived example:

data A
data B
data C

data Thing a where
  AFoo :: String -> Thing A
  Bar  :: Float -> Thing A
  BFoo :: String -> Thing B
  Baz  :: Int -> Thing B
  Bah  :: Char -> Thing C

AFoo and BFoo represent the same underlying data concept: some data with a string. But in this context, it's more than that. Ideally AFoo and BFoo would be merged into Foo, because they represent exactly the same thing, but I need one that types like Thing A and one that types like Thing B. As mentioned before, some contexts in which a Foo can be used require a Thing A and some require a Thing B, so to satisfy the type system one constructor needs to exist for each.

I could of course write a function that "casts" to the desired type by switching the constructor if needed:

fooAsThingA :: Thing a -> Maybe (Thing A)
fooAsThingA t@(AFoo _) = Just t
fooAsThingA (BFoo s) = Just $ AFoo s
fooAsThingA _ = Nothing

(And similarly for fooAsThingB)

But this is ugly, because it requires a Maybe that I have to propagate because not all Thing Bs can become Thing As (in fact, only BFoo can).

Ideally, I'd like to write:

data A
data B
data C

data Thing a where
  Foo :: String -> Thing ??
  Bar :: Float -> Thing A
  Baz :: Int -> Thing B
  Bah :: Char -> Thing C

But I'm unclear on what I'd put in place of the ??. Presumably it's some way of representing the union of A and B (and perhaps this requires changing the types of the other constructors for this GADT, but that's fine).

Just to be clear, if the above were valid, I'd expect given the following functions:

processThingA :: Thing A -> String
processThingB :: Thing B -> Int
processThingC :: Thing C -> Float

Then, I'd be able to do all of the following:

processThingA $ Foo "Hello"
processThingB $ Foo "World"

processThingA $ Bar 3.14
processThingB $ Baz 42

processThingC $ Bah '\n'

tl;dr In the first snippet, is it possible to merge AFoo and BFoo into a Foo that types as both Thing A and Thing B?

edit: Note: there are other EmptyDataDecls than A and B that I use for Thing a. I added C as an example.


Solution

  • A possible solution is as follows, even I do not consider it elegant.

    We first define a GADT and a type class for being "A or B":

    {-# LANGUAGE DataKinds, KindSignatures, GADTs, EmptyCase,
                 ScopedTypeVariables #-}
    {-# OPTIONS -Wall #-}
    
    data Sort = A | B | C | D
    
    data AorB (s :: Sort) where
      IsA :: AorB 'A
      IsB :: AorB 'B
    
    class IsAorB (s :: Sort) where
      aorb :: AorB s
    
    instance IsAorB 'A where aorb = IsA
    instance IsAorB 'B where aorb = IsB
    

    Then we exploit our type class in our type. This will cause Foo to require some more space, since it needs to store the typeclass dictionary at runtime. It is a small amount of overhead, but it is still unfortunate. On the other hand, this will also make it possible to discern, at runtime, whether the s used in Foo is actually A or B.

    data Thing (s :: Sort) where
      Foo :: IsAorB s => String -> Thing s 
      Bar :: Float -> Thing 'A
      Baz :: Int -> Thing 'B
      Bah :: Char -> Thing 'C
    

    Some tests:

    processThingA :: Thing 'A -> String
    processThingA (Foo x) = x
    processThingA (Bar x) = show x
    
    processThingB :: Thing 'B -> Int
    processThingB (Foo _) = 0
    processThingB (Baz i) = i
    

    A main downside is that we need to convince the exhaustiveness checker that our pattern matching is OK.

    -- This unfortunately will give a spurious warning
    processThingC :: Thing 'C -> Float
    processThingC (Bah _) = 42.2
    -- To silence the warning we need to prove that this is indeed impossible
    processThingC (Foo _) = case aorb :: AorB 'C of {}
    
    processThingAorB :: forall s. IsAorB s => Thing s -> String
    processThingAorB (Foo x) = x
    processThingAorB (Bar x) = "Bar " ++ show x
    processThingAorB (Baz x) = "Baz " ++ show x
    -- Again, to silence the warnings
    processThingAorB (Bah _) = case aorb :: AorB 'C of {}
    
    test :: ()
    test = ()
      where
      _ = processThingA $ Foo "Hello"
      _ = processThingB $ Foo "World"
      _ = processThingA $ Bar 3.14
      _ = processThingB $ Baz 42
      _ = processThingC $ Bah '\n'
    

    This technique does not scale very well. We need a custom GADT and typeclass for any constructor of Thing which can generate any tag in some "union". This can still be OK. To check exhaustiveness, we would need to exploit all these classes.

    I think this should require a linear amount of boilerplate, but it still significant.

    Perhaps using singletons is simpler, in the general case, to store the value of s in the constructor Foo, and avoid all the custom GADTs and typeclasses.