Search code examples
haskelltypestype-signaturescrap-your-boilerplate

Understanding the type signature of gfoldl from Data.Data.Data


Data defines as one of its core functions gfoldl:

gfoldl
  :: (Data a)
  => (forall d b. Data d => c (d -> b) -> d -> c b) 
  -> (forall g. g -> c g)   
  -> a  
  -> c a

What's the purpose of c and c (d -> b) in it? Why isn't it just a regular fold, something like

gfoldl'
  :: (Data a)
  => (forall d. Data d => r -> d -> r)
  -> r
  -> a  
  -> r

Solution

  • The idea is that a value of an algebraic datatype in Haskell has the form

    C x_1 x_2 ... x_n
    

    where C is a constructor and the x_i are the arguments. What

    gfoldl app con
    

    does is to turn such a value into

    con C `app` x_1 `app` x_2 ... `app` x_n
    

    thereby turning an a into a c a. Let's assume the type of C is

    C :: T_1 -> T_2 -> ... -> T_n -> D
    

    then let's look at the types of the intermediate expressions:

    con C                                   :: c (T_1 -> T_2 -> ... -> T_n -> D)
    con C `app` x_1                         :: c (T_2 -> ... -> T_n -> D)
    con C `app` x_1 `app` x_2               :: c (... -> T_n -> D)
    con C `app` x_1 `app` x_2 ... `app` x_n :: c D
    

    The parameterization over c allows all these intermediate types to be different. If we'd use a simple fold such as gfoldl' instead, then all these intermediate types would have to be the same.

    The motivation for gfoldl is to be a single generalization that lets you express the SYB functions gmapQ and gmapT (and a few others). The types of gmapQ and gmapT are:

    gmapQ :: Data a => (forall d. Data d => d -> u) -> a -> [u]
    gmapT :: Data a => (forall b. Data b => b -> b) -> a -> a
    

    While gmapQ collapses an a into a uniform list of us and could be expressed using gfoldl', this wouldn't be possible for gmapT.

    However, with gfoldl, we can use c = Identity to enable us to get something like gmapT, and c = Const to get something like gmapQ.

    For more details, you may also want to look at the paper Scrap your boilerplate Reloaded which shows that gfoldl is an ordinary (yet higher-order) fold of a datatype that is called Spine in that paper.

    The use of the identity and constant functors to obtain both transforming and updating behaviour from a single underlying representation has some similarity to how you obtain the lens operations from "van Laarhoven" lenses.