Search code examples
haskelloverlapping-instances

Defining incremental environments without running into overlapping instances


How to define an environment to which we can add "capabilities" without running into overlapping instances?

Suppose we have the following data-types and type-classes:

type Name = String

data Fruit = Orange | Pear | Apple

data Vegetable = Cucumber | Carrot | Spinach

data Legume = Lentils | Chickpeas | BlackEyedPeas

class HasFruit e where
    getFruit :: e -> Name -> Maybe Fruit

class HasVegetable e where
    getVegetable :: e -> Name -> Maybe Vegetable

class HasLegume e where
    getLegume :: e -> Name -> Maybe Legume

Now we would like to define a couple of functions that require certain ingredients from the environment:

data Smootie

mkSmoothie :: (HasFruit e, HasVegetable e) => e -> Smootie
mkSmoothie = undefined

data Salad

mkSalad :: (HasVegetable e, HasLegume e) => e -> Salad
mkSalad = undefined

And we define some instances for Has*:

instance HasFruit [Fruit] where
    getFruit = undefined

instance HasVegetable [Vegetable] where
    getVegetable = undefined

instance HasLegume [Legume] where
    getLegume = undefined

And finally, we would like to define a function that prepares a smoothie and a salad:

cook :: (Smootie, Salad)
cook = let ingredients = undefined in
    (mkSmoothie ingredients, mkSalad ingredients)

Now the first question is, what to pass as ingredients to that the instances defined above can be used? My first solution to this was to use tuples:

instance HasFruit e0 => HasFruit (e0, e1, e2) where
    getFruit (e0, _, _) = getFruit e0

instance HasVegetable e1 => HasVegetable (e0, e1, e2) where
    getVegetable (_, e1, _) = getVegetable e1

instance HasLegume e2 => HasLegume (e0, e1, e2) where
    getLegume (_, _, e2) = getLegume e2

cook :: (Smootie, Salad)
cook = let ingredients = ([Orange], [Cucumber], [BlackEyedPeas]) in
    (mkSmoothie ingredients, mkSalad ingredients)

This, even though cumbersome, works. But now assume that we decide to add a mkStew, which requires some HasMeat instance. Then we'd have to change all the instances above. Furthermore, if we would like to use mkSmothie in isolation, we cannot just pass ([Orange], [Cucumber]) since there is no instance defined for it.

I could define:

data Sum a b = Sum a b

and instances like:

instance HasFruit e0 => HasFruit (Sum e0 e1) where
    getFruit (Sum e0 _) = getFruit e0

instance HasVegetable e1 => HasVegetable (Sum e0 e1) where
    getVegetable (Sum _ e1) = getVegetable e1

instance HasLegume e1 => HasLegume (Sum e0 e1) where
    getLegume (Sum _ e1) = getLegume e1

But the following won't work (No instance for HasVegetable [Legume]):

cook1 :: (Smootie, Salad)
cook1 = let ingredients = Sum [Orange] (Sum [Cucumber] [BlackEyedPeas]) in
    (mkSmoothie ingredients, mkSalad ingredients)

And This instance will overlap!

instance HasVegetable e0 => HasVegetable (Sum e0 e1) where
    getVegetable (Sum e0 e1) = getVegetable e0

Is there a way to solve this problem in an elegant way?


Solution

  • The problem with the present Sum instances is that we don't know whether the object we are looking for is to the left or to the right.

    Here's the plan: each component of the environment should declare what capabilities it offers so that we can then search for it.

    Gist of this answer.

    Declaring capabilities

    As environments will be composed, we will need a (type-level) data structure to carry the capabilities from their different parts. We will use a binary tree, so we can preserve the structure of components.

    -- Tree of capabilities (ingredient categories)
    data Tree a = Leaf a | Node (Tree a) (Tree a)
    

    Capabilities associated with an environment are declared via this type family.

    type family Contents basket :: Tree *
    
    type instance Contents [Fruit] = 'Leaf Fruit
    type instance Contents [Vegetable] = 'Leaf Vegetable
    type instance Contents [Legume] = 'Leaf Legume
    
    -- Pair of environments
    data a :& b = a :& b  -- "Sum" was confusing
    
    -- The capabilities of a pair are the pair of their capabilities.
    type instance Contents (a :& b) = 'Node (Contents a) (Contents b)
    
    -- e.g., Contents ([Fruit] :& [Vegetable]) = 'Node ('Leaf Fruit) ('Leaf Vegetable)
    

    Looking up a capability

    As mentioned at the beginning, when encountering a pair :&, we will need to tell whether to find the capability in the left or right component. Thus we start with a function (at the type level) that returns True if the capability can be found in the tree.

    type family In (x :: *) (ys :: Tree *) :: Bool where
      In x (Leaf y) = x == y
      In x (Node l r) = In x l || In x r
    
    type family x == y :: Bool where
      x == x = 'True
      x == y = 'False
    

    The Has class

    This class now has a superclass constraint: that the capability we are looking for is indeed available.

    class (In item (Contents basket) ~ 'True)
          => Has item basket where
      get :: basket -> Name -> Maybe item
    

    It may seem superfluous, because instance resolution would fail anyway if the capability is not found, but a precise superclass constraint has benefits:

    • preventing mistakes: the compiler will complain earlier if something is missing;

    • a form of documentation, informing us of when an instance may exist.

    Leaf instances

    instance Has Fruit [Fruit] where
      get = (...)
    
    instance Has Vegetable [Vegetable] where
      get = (...)
    
    instance Has Legume [Legume] where
      get = (...)
    

    We don't need to write dubious instances like Has Fruit [Vegetable]; we actually can't: they would contradict the superclass constraint.

    Instance for (:&)

    We need to defer to a new class, PairHas that will discriminate on the result of the In predicate on both sides to determine which part of the environment to zoom in.

    instance PairHas item a b (In item (Contents a)) (In item (Contents b))
             => Has item (a :& b) where
      get = getPair
    

    Again, we make the superclass constraints extra precise about the intent of PairHas. inA and inB can only be instantiated with In item (Contents a) and In item (Contents b) respectively, and their disjunction should be True, meaning that item can be found in at least one of them.

    class ( In item (Contents a) ~ inA
          , In item (Contents b) ~ inB
          , (inA || inB) ~ 'True)
          => PairHas item a b inA inB where
      getPair :: (a :& b) -> Name -> Maybe item
    

    Of course we have two instances to go left and right respectively, using recursive Has constraints (note that Has provides one equality via its own superclass constraint).

    instance ( Has item a
             , In item (Contents b) ~ 'False)
             => PairHas item a b 'True 'False where
      getPair (a :& _) = get a
    
    instance ( In item (Contents a) ~ 'False
             , Has item b)
             => PairHas item a b 'False 'True where
      getPair (_ :& b) = get b
    

    What if both sides have the same capability? We shall consider that an error and require the user to explicitly hide one of the duplicate capabilities via other mechanisms. We can use TypeError to print a custom error message at compile-time. We could also pick either side by default.

    instance (TypeError (Text "Duplicate contents")  -- can be more descriptive
             , In item (Contents a) ~ 'True
             , In item (Contents b) ~ 'True)
             => PairHas item a b 'True 'True where
      getPair = undefined
    

    We can also write a custom error message for the case when both sides are false. It is a bit surprising because that contradicts the superclass constraint (inA || inB) ~ 'True, but the message does get printed so we won't complain.

    instance ( TypeError (Text "Not found")  -- can be more descriptive
             , In item (Contents a) ~ 'False
             , In item (Contents b) ~ 'False
             , 'False ~ 'True)
             => PairHas item a b 'False 'False where
      getPair = undefined
    

    Let's cook

    Now we can safely write cook:

    cook :: (Smootie, Salad)
    cook = let ingredients = [Orange] :& [Cucumber] :& [BlackEyedPeas] in
      (mkSmootie ingredients, mkSalad ingredients)
    

    You can also see what happens if you duplicate or forget some ingredients

    cook :: (Smootie, Salad)
    cook = let ingredients = [Orange] :& [Cucumber] :& [BlackEyedPeas] :& [Pear] in
      (mkSmootie ingredients, mkSalad ingredients)
    
    -- error: Duplicate contents
    

    cook :: (Smootie, Salad)
    cook = let ingredients = [Orange] :& [Cucumber] in
      (mkSmootie ingredients, mkSalad ingredients)
    
    -- error: Not found