Search code examples
haskellreflectionunionsgadtdata-kinds

Type-safe union in Haskell?


Can I have a type-safe union (As in C's union) in Haskell? This is the best I tried, here Variant, named after C++'s std::variant:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

module Emulation.CPlusPlus.Variant (
    Variant, singleton
) where

import Data.Type.Bool
import Data.Type.Equality
import Type.Reflection

data Variant :: [*] -> * where
    Singleton :: a -> Variant (a ': as)
    Arbitrary :: Variant as -> Variant (a ': as)

singleton :: (Not (bs == '[]) || a == b) ~ 'True => forall a b. a -> Variant (b ': bs)
singleton x = case eqTypeRep (typeRep :: TypeRep a) (typeRep :: TypeRep b) of
    Nothing    -> Arbitrary (singleton x)
    Just HRefl -> Singleton x

This produces error message as follows:

Prelude> :load Variant.hs
[1 of 1] Compiling Emulation.CPlusPlus.Variant ( Variant.hs, interpreted )

Variant.hs:19:14: error:
    • Could not deduce: (Not (bs == '[]) || (a0 == b0)) ~ 'True
      from the context: (Not (bs == '[]) || (a == b)) ~ 'True
        bound by the type signature for:
                   singleton :: forall (bs :: [*]) a b.
                                ((Not (bs == '[]) || (a == b)) ~ 'True) =>
                                forall a1 b1. a1 -> Variant (b1 : bs)
        at Variant.hs:19:14-85
      The type variables ‘a0’, ‘b0’ are ambiguous
    • In the ambiguity check for ‘singleton’
      To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
      In the type signature:
        singleton :: (Not (bs == '[]) || a == b) ~ True =>
                     forall a b. a -> Variant (b : bs)
   |
19 | singleton :: (Not (bs == '[]) || a == b) ~ True => forall a b. a -> Variant (b ': bs)
   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Failed, no modules loaded.

I don't get how such ambiguity emerges.


Solution

  • The conventional names for the constructors are Inl and Inr:

    import Data.Kind
    
    data Sum :: [Type] -> Type where
      Inl :: a -> Sum (a : as) -- INject Left
      Inr :: !(Sum as) -> Sum (a : as) -- INject Right
    

    The extra strictness in Inr is optional. Consider Either a b. This type has the values undefined, Left undefined, and Right undefined, along with all the other ones. Consider your Variant [a, b]. This has undefined, Singleton undefined, Variant undefined, and Variant (Singleton undefined). There is one extra partially-undefined value that does not arise with Either. Inr's strictness squashes Inr undefined and undefined together. This means you cannot have a value with only a "partially known" variant. All strictness annotations in the following are for "correctness." They squash out bottoms in places where you probably don't want bottoms.

    Now, the signature of singleton, as pointed out by @rampion, has a use-before-definition error. It "ought" to be:

    singleton :: forall a b.
                 (Not (bs == '[]) || a == b) ~ True =>
                 a -> Variant (b ': bs)
    

    But that isn't quite right. If a ~ b, great, this works. If not, there is no way for the compiler to ensure that a is in bs, because you haven't constrained for that. This new signature still fails. For the most power, especially for future definitions, you'll want something like

    -- Elem x xs has the structure of a Nat, but doubles as a proof that x is in xs
    -- or: Elem x xs is the type of numbers n such that the nth element of xs is x
    data Elem (x :: k) (xs :: [k]) where
      Here  :: Elem x (x : xs)
      There :: !(Elem x xs) -> Elem x (y : xs) -- strictness optional
    -- boilerplate; use singletons or similar to dodge this mechanical tedium
    -- EDIT: singletons doesn't support GADTs just yet, so this must be handwritten
    -- See https://github.com/goldfirere/singletons/issues/150
    data SElem x xs (e :: Elem x xs) where
      SHere  :: SElem x (x : xs) Here
      SThere :: SElem x xs e -> SElem x (y : xs) (There e)
    class KElem x xs (e :: Elem x xs) | e -> x xs where kElem :: SElem x xs e
    instance KElem x (x : xs) Here where kElem = SHere
    instance KElem x xs e => KElem x (y : xs) (There e) where kElem = SThere kElem
    demoteElem :: SElem x xs e -> Elem x xs
    demoteElem SHere = Here
    demoteElem (SThere e) = There (demoteElem e)
    
    -- inj puts a value into a Sum at the given index
    inj :: Elem t ts -> t -> Sum ts
    inj Here x = Inl x
    inj (There e) x = Inr $ inj e x
    
    -- try to find the first index where x occurs in xs
    type family FirstIndexOf (x :: k) (xs :: [k]) :: Elem x xs where
      FirstIndexOf x (x:xs) = Here
      FirstIndexOf x (y:xs) = There (FirstIndexOf x xs)
    -- INJect First
    -- calculate the target index as a type
    -- require it as an implicit value
    -- defer to inj
    injF :: forall as a.
            KElem a as (FirstIndexOf a as) =>
            a -> Sum as
    injF = inj (demoteElem $ kElem @a @as @(FirstIndexOf a as))
    -- or injF = inj (kElem :: SElem a as (FirstIndexOf a as))
    

    You can also just stick an Elem inside Sum:

    data Sum :: [Type] -> Type where
      Sum :: !(Elem t ts) -> t -> Sum ts -- strictness optional
    

    You may recover Inl and Inr as pattern synonyms

    pattern Inl :: forall ts. () =>
                   forall t ts'. (ts ~ (t : ts')) =>
                   t -> Sum ts
    pattern Inl x = Sum Here x
    
    data Inr' ts = forall t ts'. (ts ~ (t : ts')) => Inr' (Sum ts')
    _Inr :: Sum ts -> Maybe (Inr' ts)
    _Inr (Sum Here _) = Nothing
    _Inr (Sum (There tag) x) = Just $ Inr' $ Sum tag x
    pattern Inr :: forall ts. () =>
                   forall t ts'. (ts ~ (t : ts')) =>
                   Sum ts' -> Sum ts
    pattern Inr x <- (_Inr -> Just (Inr' x))
      where Inr (Sum tag x) = Sum (There tag) x
    

    If you try some more, you can use huge amounts of unsafeCoerce Refl (to create "bogus" type equalities) to create an API like this:

    import Numeric.Natural
    -- ...
    type role SElem nominal nominal nominal
    -- SElem is a GMP integer
    -- Elem is a nice GADT
    -- Elem gives a nice experience at the type level
    -- this allows functions like FirstIndexOf
    -- SElem avoids using unary numbers at the value level
    newtype SElem x xs e = SElem Natural
    pattern SHere :: forall t ts e. () =>
                     forall ts'. (ts ~ (t:ts'), e ~ (Here :: Elem t (t:ts'))) =>
                     SElem t ts e
    pattern SThere :: forall t ts e. () =>
                      forall t' ts' e'. (ts ~ (t':ts'), e ~ (There e' :: Elem t (t':ts'))) =>
                      SElem t ts' e' ->
                      SElem t ts e
    -- implementations are evil and kinda long
    -- you'll probably need this:
    -- type family Stuck (k :: Type) :: k where {- no equations -}
    -- pattern synonyms are incredibly restrictive, so they aren't very straightforward
    -- currently GHC does not allow INLINEs on pattern synonyms, so this won't
    -- compile down to simple integer expressions just yet, either :(
    

    And then write

    data Sum :: [Type] -> Type where
      Sum :: forall t ts (e :: Elem t ts). !(SElem t ts e) -> t -> Sum ts
    

    which is close to a struct of an integer tag and a union, except said tag is a bit oversized.