I want to implement a simple one-to-many data structure in Haskell. Each key of type k
associated with set of elements of type v
.
{-# LANGUAGE AllowAmbiguousTypes #-}
module SetMap where
import Data.Map
import Data.Set
class SetMap k v m where
add :: k -> v -> m -> m -- add value to corresponding set
delete :: k -> m -> m -- delete all values associated with this key
get :: k -> m -> [v] -- get values associated with this key
type SetMapImpl k v = Map k (Set v)
instance SetMap k v (SetMapImpl k v) where -- duplication here!
...
m
here is a type of implementation. But it is also parametrized with k
and v
.
Any ways how can I declare it? Or is it OK to do as shown above?
I expect something like this but it does not compile.
class SetMap k v m where
add :: k -> v -> m k v -> m k v-- add value to corresponding set
delete :: k -> m k v -> m k v -- delete all values associated with this key
get :: k -> m k v -> [v] -- get values associated with this key
type SetMapImpl k v = Map k (Set v)
instance SetMap k v SetMapImpl where
...
You have several options:
First consider why you need a class at all. Why not just implement those methods right away as concrete functions?
type SetMapImpl k v = Map k (Set v) -- In fact I would tend to call this
-- simply `SetMap` in this version.
add :: k -> v -> SetMapImpl k v -> SetMapImpl k v
add = ...
delete :: k -> SetMapImpl k v -> SetMapImpl k v
delete = ...
...
Notice that you don't need a newtype in this case, because SetMapImpl
is only ever mentioned with both arguments fully applied. But it might still be a good idea to make it a newtype, for clearer encapsulation, error messages etc..
Obviously, this approach means you can't write code that's polymorphic over different implementations, but unless you have a good reason to write such code it is best not to worry about this. Keep it simple. Thanks to Haskell's strong type system it is still easy to generalize your code later on, should that become necessary.
If you want the methods to look like add :: k -> v -> m k v -> m k v
, then k
and v
are parameters of the type and of the methods, but not of the abstraction that's expressed by the class. Hence this should rather look like this:
class SetMap m where
add :: k -> v -> m k v -> m k v
In this version, you must make the SetMapImpl
a newtype
or data
, because m
is mentioned without parameters in the class head. That's not much of a problem though. What is a problem is that the add
method would now have to work with any types k
and v
, which it can't: for Set
you need the Ord
constraint. There are a couple of ways this can be achieved:
class SetMap m where
add :: (Ord k, Ord v) => k -> v -> m k v -> m k v
newtype SetMapImpl k v = ...
instance SetMap SetMapImpl where
add = ...
Simple, but inflexible. In particular, a main motivation for having a class at all is that you could have other implementations using other constraints like Hashable
; this approach does not support that.
{-# LANGUAGE TypeFamilies, ConstraintKinds #-}
import Data.Kind (Constraint)
class SetMap m where
type KeyConstraint m k :: Constraint
type ValueConstraint m v :: Constraint
add :: (KeyConstraint m k, ValueConstraint m v)
=> k -> v -> m k v -> m k v
newtype SetMapImpl k v = ...
instance SetMap SetMapImpl where
type KeyConstraint SetMapImpl k = Ord k
type ValueConstraint SetMapImpl v = Ord v
add = ...
I rather like this approach, because it expresses that the SetMap
-implementations are parametric, but still allows imposing any required constraint on the contained types. It is a bit complicated to get this right, though, and easy to get confused with the different abstract constraints.
Since all your methods take one already existing set-map as the argument, and that set-map would already need the knowledge that the keys and values are Ord
(or Hashable
) to be constructed in the first place. So you can in fact get away without any constraints on the method. But you do need to wrap the constraints in the type itself.
{-# LANGUAGE GADTs #-}
class SetMap m where
add :: k -> v -> m k v -> m k v
data SetMapImpl k v where
SetMapImpl :: (Ord k, Ord v) => Map k (Set v) -> SetMapImpl k v
instance SetMap SetMapImpl where
add k v (SetMapImpl m) = ...
And now in the ...
you will have the Ord k
and Ord v
constraints available though the class knows nothing about them.
I would rather not recommend this approach though. It tends to become awkward having to pass the constraints at the value level, and for generically creating new set-maps you'll need them on the type level anyway.
See chi's answer. This is what I would probably go with. Although it is arguably less elegant that the class doesn't have m
in parameterised form at all, this is not really a restriction (since the instance can still be polymorphic over all appropriate key- and value types, and you can simply state the constraints right there). This approach is more explicit, and in my experience it tends to be a lot clearer what you're doing when the key- and value types are actually called Key m
and Value m
. This does come with some verbosity penalty, but it's probably worth it.
Another advantage is that this can easily deal with implementations that really are not parametric at all; e.g. IntMap
only allows keys of type Int
.