haskell optimization coerce

Should I always use coerce?


From what I understand,

  1. It is always possible to replace wrapping and unwrapping newtypes with the coerce function from Data.Coerce.
  2. It is always faster (or perhaps sometimes just as fast?)

However, in many examples it feels very artificial and less readable. Here is an example:

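{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}

import Data.Coerce (coerce)
import Data.Semigroup

class Semigroup s => LAction s x where
  (<>$) :: s -> x -> x

instance Num x => LAction (Sum x) x where
  -- the inner coerce unwraps s :: Sum x to a plain x
  s <>$ x = coerce (coerce s + x)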

In my opinion it's pretty hard to read. And it can get way worse, as I often need to add some type annotations.

So my questions are:

  1. Are there cases where I can avoid using coerce and be sure that the code will not be slower?
  2. Does GHC have optimizations that perform the coercion automatically?

Solution

  • There are various conveniences I have seen library authors use to make coerce look nicer.

    First, as Daniel Wagner's answer notes, you never have to replace newtype constructors/record fields with coerce. In Core (the first intermediate language GHC translates Haskell into, where most high-level optimizations operate), newtype wrappers and unwrappers don't actually exist. As the surface language is desugared, they are immediately replaced with coerce.
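
    As a rough illustration of this first point, here is the instance from the question written with the record field instead of coerce. This is only a sketch: actCoerce and example are names invented here for comparison, and the class is copied from the question.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Data.Coerce (coerce)
import Data.Semigroup (Sum (..))

class Semigroup s => LAction s x where
  (<>$) :: s -> x -> x

-- The body from the question, kept as a standalone function
-- (hypothetical name). The inner coerce unwraps Sum x to x; the
-- outer coerce has type x -> x here, so it does nothing.
actCoerce :: Num x => Sum x -> x -> x
actCoerce s x = coerce (coerce s + x)

-- The same action written with the newtype's record field.
instance Num x => LAction (Sum x) x where
  s <>$ x = getSum s + x

-- The annotation on 3 is needed because the class has no
-- functional dependency tying s to x.
example :: Int
example = Sum (3 :: Int) <>$ 4
```

    Both versions compute the same result; the second simply names the unwrapping step.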

    Second, libraries may provide RULES that translate certain expressions to coerce. For example, map :: (a -> b) -> [a] -> [b] enjoys the following rule:

    {-# RULES "map/coerce" [1] map coerce = coerce #-}
    

    Combined with the point above, this means you can just write e.g. map Sum instead of coerce. (Sum immediately simplifies to coerce, and then the above rule turns the resulting map coerce into coerce.)
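
    A small sketch of what that looks like in practice. The names wrapAll, wrapAllCoerce and total are made up for this example, and whether the rule actually fires depends on compiling with optimizations; the results are the same either way.

```haskell
import Data.Coerce (coerce)
import Data.Monoid (Sum (..))

-- Written with the constructor. With optimizations on, the
-- "map/coerce" rule quoted above can rewrite this to a plain coerce.
wrapAll :: [Int] -> [Sum Int]
wrapAll = map Sum

-- Written with coerce directly: no list traversal is needed.
wrapAllCoerce :: [Int] -> [Sum Int]
wrapAllCoerce = coerce

-- A typical use: wrap, combine with the Monoid instance, unwrap.
total :: [Int] -> Int
total = getSum . mconcat . map Sum
```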

    There are certain higher-order functions that don't have such rules. For example, at first glance, it seems like coerce . f = coerce f and f . coerce = coerce f should be admissible as RULES. However, such RULES would not quite preserve semantics: we have undefined . coerce /= undefined (the composition is a lambda, so seq can tell it apart from undefined) but coerce undefined = undefined. That is, this optimization would make some programs (albeit weird ones) less defined, which is something at least the GHC library authors like to avoid. Instead of writing things like someFunction . Sum and relying on optimizations, I have seen libraries define functions such as the following:

    (.#) :: Coercible a b => (b -> c) -> (a -> b) -> a -> c
    f .# _ = coerce f
    
    (#.) :: Coercible b c => (b -> c) -> (a -> b) -> a -> c
    _ #. f = coerce f
    

    This way, you can write expressions like someFunction .# Sum (the operator names have been chosen so the # faces the coercion), and this will eventually compile to just coerce someFunction.
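
    To make this concrete, here is a sketch that repeats the two operators and uses them, followed by a check of the strictness difference that rules out the naive RULES. Everything apart from the operator definitions (totalLength, doubleViaSum, composedIsDefined, coercedIsDefined) is a hypothetical name chosen for illustration, and the operators are left at their default fixity.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Coerce (Coercible, coerce)
import Data.Monoid (Sum (..))

(.#) :: Coercible a b => (b -> c) -> (a -> b) -> a -> c
f .# _ = coerce f

(#.) :: Coercible b c => (b -> c) -> (a -> b) -> a -> c
_ #. f = coerce f

-- The # faces the newtype constructor or field in each case.
totalLength :: [String] -> Int
totalLength = getSum #. foldMap (Sum #. length)

-- A function on Sum Int applied to a plain Int, no explicit wrap.
doubleViaSum :: Int -> Sum Int
doubleViaSum = (\s -> s <> s) .# Sum

-- A composition is a lambda, so forcing it succeeds even though
-- the left-hand function is undefined.
composedIsDefined :: ()
composedIsDefined =
  (undefined . (coerce :: Int -> Sum Int) :: Int -> Bool) `seq` ()

-- coerce f is just f at another type, so forcing it hits undefined.
-- Returns False when forcing throws.
coercedIsDefined :: IO Bool
coercedIsDefined = do
  r <- try (evaluate (coerced `seq` ())) :: IO (Either SomeException ())
  pure (either (const False) (const True) r)
  where
    coerced :: Int -> Bool
    coerced = coerce (undefined :: Sum Int -> Bool)
```

    Because the second argument of each operator is ignored at run time, it serves only to pin down the types involved.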

    The overall recipe for using coerce nicely is thus to avoid using coerce directly whenever possible. In a monomorphic context, instead of writing coerce, write the newtype constructor/field you want directly. Write/rely on RULES to make higher order functions aware of coerce where possible, e.g. write a mapMyStructure coerce = coerce rule when you define a new Functor (where fmap = mapMyStructure). Define helper combinators like (#.) and (.#) that let you hide uses of coerce while allowing you to specify the types involved.
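
    As a sketch of the RULES part of this recipe, assume a hypothetical container Tree with a mapping function mapTree (neither comes from a real library). The rule below mirrors the map/coerce rule quoted earlier; it can only fire when compiling with optimizations, and the result is the same whether or not it does.

```haskell
import Data.Coerce (coerce)
import Data.Monoid (Sum (..))

-- Hypothetical container type, used only for illustration.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Eq, Show)

-- The mapping function that fmap delegates to.
mapTree :: (a -> b) -> Tree a -> Tree b
mapTree _ Leaf = Leaf
mapTree f (Node l x r) = Node (mapTree f l) (f x) (mapTree f r)
-- Keep mapTree from inlining before phase 1 so the rule can match.
{-# NOINLINE [1] mapTree #-}

-- Mapping a coercion over the tree is the same as coercing the tree.
{-# RULES "mapTree/coerce" mapTree coerce = coerce #-}

instance Functor Tree where
  fmap = mapTree

-- Written with the constructor; the rule can turn this into coerce.
wrapTree :: Tree Int -> Tree (Sum Int)
wrapTree = mapTree Sum
```

    The rule type-checks because the parameter of Tree has a representational role, so Coercible a b implies Coercible (Tree a) (Tree b).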

    If all else fails, you can always put a type signature on a coerce to clarify what it is doing, or write a local/private definition:

    (coerce :: [Int] -> [Sum Int]) [1, 2, 3]

    let { con :: [Int] -> [Sum Int]; con = coerce } in con [1, 2, 3]