Search code examples
haskelltypeclassparametric-polymorphism

How to elegantly avoid "Ambiguous type variable" when using Haskell type classes


I want to write a simple framework that deals with persisting entities. The idea is to have an Entity type class and to provide generic persistence operations like

storeEntity    :: (Entity a) => a -> IO () 
retrieveEntity :: (Entity a) => Integer -> IO a
publishEntity  :: (Entity a) => a -> IO () 

Actual data types are instance of that Entity type class.

Even though the persistence operations are generic and don't need any information about the concrete data types You have to provide a type annotation at the call site to make GHC happy, like in:

main = do
    let user1 = User 1 "Thomas" "Meier" "tm@meier.com"
    storeEntity user1
    user2 <- retrieveEntity 1 :: IO User -- how to avoid this type annotation?
    publishEntity user2

Is there any way to avoid this kind of call site annotations?

I know that I don't need these annotations if the compiler can deduce the actual type from the context of the usage. So for example the following code works fine:

main = do
    let user1 = User 1 "Thomas" "Meier" "tm@meier.com"
    storeEntity user1
    user2 <- retrieveEntity 1
    if user1 == user2
        then publishEntity user2
        else fail "retrieve of data failed"

But I would like to be able to chain the polymorphic actions like so:

main = do
    let user1 = User 1 "Heinz" "Meier" "hm@meier.com"
    storeEntity user1
    -- unfortunately the next line does not compile
    retrieveEntity 1 >>= publishEntity

    -- but with a type annotation it works:
    (retrieveEntity 1 :: IO User) >>= publishEntity

But having a type annotation here breaks the elegance of the polymorphism...

For completeness sake I've included the full source code:

{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}
module Example where
import GHC.Generics
import Data.Aeson

-- | Entity type class
class (ToJSON e, FromJSON e, Eq e, Show e) => Entity e where 
    getId :: e -> Integer

-- | a user entity    
data User = User {
      userId    :: Integer
    , firstName :: String
    , lastName  :: String
    , email     :: String
} deriving (Show, Eq, Generic, ToJSON, FromJSON)

instance Entity User where
    getId = userId 


-- | load persistent entity of type a and identified by id
retrieveEntity :: (Entity a) => Integer -> IO a
retrieveEntity id = do
    -- compute file path based on id
    let jsonFileName = getPath id
    -- parse entity from JSON file
    eitherEntity <- eitherDecodeFileStrict jsonFileName
    case eitherEntity of
        Left msg -> fail msg
        Right e  -> return e

-- | store persistent entity of type a to a json file
storeEntity :: (Entity a) => a -> IO ()
storeEntity entity = do
    -- compute file path based on entity id
    let jsonFileName = getPath (getId entity)
    -- serialize entity as JSON and write to file
    encodeFile jsonFileName entity

-- | compute path of data file based on id
getPath :: Integer -> String
getPath id = ".stack-work/" ++ show id ++ ".json"

publishEntity :: (Entity a) => a -> IO ()   
publishEntity = print

main = do
    let user1 = User 1 "Thomas" "Meier" "tm@meier.com"
    storeEntity user1
    user2 <- retrieveEntity 1 :: IO User
    print user2

Solution

  • You can tie the types of storeEntity and retrieveEntity together by adding a type-level tag to your entity's identifier Integer. I think your API design also has a small infelicity that isn't critical, but I'll fix it along the way anyway. Namely: Users shouldn't store their identifier. Instead have a single top-level type wrapper for identified things. This lets you write code once and for all that munges identifiers -- e.g. a function that takes an entity that doesn't yet have an ID (how would you even represent this with your definition of User?) and allocates a fresh ID for it -- without going back and modifying your Entity class and all its implementations. Also storing first and last names separately is wrong. So:

    import Data.Tagged
    
    data User = User
        { name :: String
        , email :: String
        } deriving (Eq, Ord, Read, Show)
    
    type Identifier a = Tagged a Integer
    data Identified a = Identified
        { ident :: Identifier a
        , val :: a
        } deriving (Eq, Ord, Read, Show)
    

    Here my Identified User corresponds to your User, and my User doesn't have an analog in your version. The Entity class might look like this:

    class Entity a where
        store :: Identified a -> IO ()
        retrieve :: Identifier a -> IO a
        publish :: a -> IO () -- or maybe Identified a -> IO ()?
    
    instance Entity User -- stub
    

    As an example of the "write it once and for all" principle above, you may find it convenient for retrieve to actually associate the entity it returns with its identifier. This can be done uniformly for all entities now:

    retrieveIDd :: Entity a => Identifier a -> IO (Identified a)
    retrieveIDd id = Identified id <$> retrieve id
    

    Now we can write an action that ties together the types of its store and retrieve actions:

    storeRetrievePublish :: Entity a => Identified a -> IO ()
    storeRetrievePublish e = do
        store e
        e' <- retrieve (ident e)
        publish e'
    

    Here ident e has rich enough type information that we know that e' must be an a, even though we don't have an explicit type signature for it. (The signature on storeRetrievePublish is also optional; the one given here is the one inferred by GHC.) Finishing touches:

    main :: IO ()
    main = storeRetrievePublish (Identified 1 (User "Thomas Meier" "tm@meier.com"))
    

    If you don't want to define storeRetrievePublish explicitly, you can get away with this:

    main :: IO ()
    main = do
        let user = Identified 1 (User "Thomas Meier" "tm@meier.com")
        store user
        user' <- retrieve (ident user)
        publish user'
    

    ...but you can't unfold definitions much further: if you reduce ident user to just 1, you will have lost the tie between the type tag on the identifier used for store and for retrieve, and be back to your ambiguous type situation.