haskell phantom-types

Motivation behind Phantom Types?


Don Stewart's Haskell in the Large presentation mentioned phantom types:

data Ratio n = Ratio Double
1.234 :: Ratio D3

data Ask ccy = Ask Double
Ask 1.5123 :: Ask GBP

I read over his bullet points about them, but I did not understand them. I also read the Haskell Wiki on the topic, yet I am still missing their point.

What's the motivation to use a phantom type?


Solution

  • To answer "what's the motivation to use a phantom type?", there are two points:

    • To make invalid states unrepresentable, which is well explained in Aadit's answer
    • To carry some of the information at the type level

    For example, you could have distances tagged by the length unit:

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    
    newtype Distance a = Distance Double
      deriving (Num, Show)
    
    data Kilometer
    data Mile
    
    marathonDistance :: Distance Kilometer
    marathonDistance = Distance 42.195
    
    distanceKmToMiles :: Distance Kilometer -> Distance Mile
    distanceKmToMiles (Distance km) = Distance (0.621371 * km)
    
    marathonDistanceInMiles :: Distance Mile
    marathonDistanceInMiles = distanceKmToMiles marathonDistance
    

    And you can avoid a disaster like the Mars Climate Orbiter's, because mixing units is now a type error:

    >>> marathonDistanceInMiles
    Distance 26.218749345
    
    >>> marathonDistanceInMiles + marathonDistance
    
    <interactive>:10:27:
        Couldn't match type ‘Kilometer’ with ‘Mile’
        Expected type: Distance Mile
          Actual type: Distance Kilometer
        In the second argument of ‘(+)’, namely ‘marathonDistance’
        In the expression: marathonDistanceInMiles + marathonDistance
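
    The phantom parameter also lets a function stay generic in the unit while still keeping units apart. The following is a minimal sketch reusing the definitions above; scaleDistance, longer and halfMarathon are hypothetical names introduced only for illustration:

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Same definitions as in the example above.
newtype Distance a = Distance Double
  deriving (Num, Show)

data Kilometer
data Mile

-- Hypothetical helper: works for any unit, and the result keeps that unit.
scaleDistance :: Double -> Distance a -> Distance a
scaleDistance k (Distance d) = Distance (k * d)

-- Hypothetical helper: both arguments must carry the same unit tag,
-- so comparing a Distance Mile with a Distance Kilometer is rejected.
longer :: Distance a -> Distance a -> Distance a
longer (Distance x) (Distance y) = Distance (max x y)

-- The unit of the result is fixed by the signature.
halfMarathon :: Distance Kilometer
halfMarathon = scaleDistance 0.5 (Distance 42.195)
```

    The type variable a is never inspected at runtime; it only forces the input and output units to agree.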
    

    There are slight variations on this "pattern". You can use DataKinds to have a closed set of units:

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    {-# LANGUAGE KindSignatures #-}
    {-# LANGUAGE DataKinds #-}
    
    data LengthUnit = Kilometer | Mile
    
    newtype Distance (a :: LengthUnit) = Distance Double
      deriving (Num, Show)
    
    marathonDistance :: Distance 'Kilometer
    marathonDistance = Distance 42.195
    
    distanceKmToMiles :: Distance 'Kilometer -> Distance 'Mile
    distanceKmToMiles (Distance km) = Distance (0.621371 * km)
    
    marathonDistanceInMiles :: Distance 'Mile
    marathonDistanceInMiles = distanceKmToMiles marathonDistance
    

    And it will work similarly:

    >>> marathonDistanceInMiles
    Distance 26.218749345
    
    >>> marathonDistance + marathonDistance
    Distance 84.39
    
    >>> marathonDistanceInMiles + marathonDistance
    
    <interactive>:28:27:
        Couldn't match type ‘'Kilometer’ with ‘'Mile’
        Expected type: Distance 'Mile
          Actual type: Distance 'Kilometer
        In the second argument of ‘(+)’, namely ‘marathonDistance’
        In the expression: marathonDistanceInMiles + marathonDistance
    

    But now a Distance can only be in kilometers or miles; we can't add more units later without changing LengthUnit. That might be useful in some use cases.
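
    One thing a closed set of units makes possible is a per-unit instance for every member of the kind. The sketch below is only an illustration under that assumption; the UnitName class and the render function are hypothetical and not part of the original example:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Proxy (Proxy (..))

data LengthUnit = Kilometer | Mile

-- A tag outside the kind, such as Distance Bool, is rejected with a kind error.
newtype Distance (a :: LengthUnit) = Distance Double

-- Hypothetical class mapping each promoted unit to a printable suffix.
class UnitName (a :: LengthUnit) where
  unitName :: Proxy a -> String

instance UnitName 'Kilometer where
  unitName _ = "km"

instance UnitName 'Mile where
  unitName _ = "mi"

-- Hypothetical helper: recovers the unit from the type when printing.
render :: forall a. UnitName a => Distance a -> String
render (Distance d) = show d ++ " " ++ unitName (Proxy :: Proxy a)
```

    Because LengthUnit has exactly two constructors, two instances cover every possible Distance.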


    We could also carry the unit at the value level instead:

    data Distance = Distance { distanceUnit :: LengthUnit, distanceValue :: Double }
       deriving (Show)
    

    In the distance case we can work out the addition, for example by converting to kilometers when different units are involved. But this doesn't work well for currencies, whose exchange rates aren't constant over time.
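
    A minimal sketch of that value-level addition might look as follows. The names kmPerMile, toKilometers and addDistance are hypothetical, the Eq instance on LengthUnit is added for the comparison, and the conversion factor is taken as the inverse of the 0.621371 used earlier:

```haskell
-- Eq is derived here (an addition) so that units can be compared at runtime.
data LengthUnit = Kilometer | Mile
  deriving (Show, Eq)

data Distance = Distance { distanceUnit :: LengthUnit, distanceValue :: Double }
  deriving (Show)

-- Assumed factor: inverse of the kilometers-to-miles factor used above.
kmPerMile :: Double
kmPerMile = 1 / 0.621371

-- Hypothetical helper: the numeric value expressed in kilometers.
toKilometers :: Distance -> Double
toKilometers (Distance Kilometer v) = v
toKilometers (Distance Mile v)      = v * kmPerMile

-- Same unit: add directly and keep the unit.
-- Mixed units: normalise both sides to kilometers first.
addDistance :: Distance -> Distance -> Distance
addDistance a b
  | distanceUnit a == distanceUnit b =
      Distance (distanceUnit a) (distanceValue a + distanceValue b)
  | otherwise =
      Distance Kilometer (toKilometers a + toKilometers b)
```

    The unit check happens at runtime here, so a mismatch is resolved by conversion rather than rejected by the compiler.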


    It's also possible to use GADTs instead, which may be a simpler approach in some situations:

    {-# LANGUAGE GADTs #-}
    {-# LANGUAGE StandaloneDeriving #-}
    
    data Kilometer
    data Mile
    
    data Distance a where
      KilometerDistance :: Double -> Distance Kilometer
      MileDistance :: Double -> Distance Mile
    
    deriving instance Show (Distance a)
    
    marathonDistance :: Distance Kilometer
    marathonDistance = KilometerDistance 42.195
    
    distanceKmToMiles :: Distance Kilometer -> Distance Mile
    distanceKmToMiles (KilometerDistance km) = MileDistance (0.621371 * km)
    
    marathonDistanceInMiles :: Distance Mile
    marathonDistanceInMiles = distanceKmToMiles marathonDistance
    

    Now we also know the unit at the value level:

    >>> marathonDistanceInMiles 
    MileDistance 26.218749345
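
    Since the constructor reveals the unit, a single function can accept any unit and branch on it. The following is a sketch under the definitions above; toKm and kmValue are hypothetical helpers, and the factor is again the inverse of 0.621371:

```haskell
{-# LANGUAGE GADTs #-}

data Kilometer
data Mile

data Distance a where
  KilometerDistance :: Double -> Distance Kilometer
  MileDistance      :: Double -> Distance Mile

-- Hypothetical helper: accepts a distance in any unit and inspects the
-- constructor to decide whether a conversion is needed.
toKm :: Distance a -> Distance Kilometer
toKm (KilometerDistance km) = KilometerDistance km
toKm (MileDistance mi)      = KilometerDistance (mi / 0.621371)

-- Hypothetical helper: a single clause is exhaustive, because only
-- KilometerDistance can build a value of type Distance Kilometer.
kmValue :: Distance Kilometer -> Double
kmValue (KilometerDistance km) = km
```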
    

    This approach greatly simplifies the Expr a example from Aadit's answer:

    {-# LANGUAGE GADTs #-}
    
    data Expr a where
      Number     :: Int -> Expr Int
      Boolean    :: Bool -> Expr Bool
      Increment  :: Expr Int -> Expr Int
      Not        :: Expr Bool -> Expr Bool
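
    To illustrate what the GADT buys, here is a sketch of an evaluator over this type. The eval function is a hypothetical addition, not something from the presentation or from Aadit's answer:

```haskell
{-# LANGUAGE GADTs #-}

data Expr a where
  Number     :: Int -> Expr Int
  Boolean    :: Bool -> Expr Bool
  Increment  :: Expr Int -> Expr Int
  Not        :: Expr Bool -> Expr Bool

-- Hypothetical evaluator: matching on a constructor tells the type checker
-- what a is, so each branch can return an Int or a Bool without any tagging.
eval :: Expr a -> a
eval (Number n)    = n
eval (Boolean b)   = b
eval (Increment e) = eval e + 1
eval (Not e)       = not (eval e)
```

    An ill-typed term such as Increment (Boolean True) is rejected at compile time, so eval needs no error cases.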
    

    It's worth pointing out that the latter variations require non-trivial language extensions (GADTs, DataKinds, KindSignatures), which might not be supported by your compiler. That might be the case with the Mu compiler Don mentions.