Search code examples
haskelltemplate-haskell

Template Haskell do variable assignment clarification


I am looking at the following TH code, and have several questions marked by --:

newtype C a = C ExpQ

unC (C x) = x   --what is the significance of this

clift1 :: ExpQ -> C t -> C a  --why are the C's here followed by t and a
clift1 g (C x) = C $ do f <- g  --is the g an ExpQ? What is the f here, what does it signify?
                        tx <- x --what is the tx here signify?
                        return $ AppE f tx  

As the title suggests, I am especially unsure where the f and tx come from and what they represent here. I think they are just variables being assigned values so that they can be used as arguments to AppE, is this correct? But in this case, why? Why not just simply call a function with these arguments instead of this whole clift1 do function?


Solution

  • An ExpQ is a monadic value (in the Q monad) that gives a Template Haskell representation of some Haskell expression. The underlying expression could have any type, so you could have an ExpQ that represents a boolean expression, which you could create using TH quotation:

    {-# LANGUAGE TemplateHaskell #-}
    
    import Language.Haskell.TH
    
    boolExpr :: ExpQ
    boolExpr = [| True && False |]
    

    or by directly building it up from TH primitives:

    -- almost the same as "boolExpr" above
    boolExpr' :: ExpQ
    boolExpr' = infixE (Just (conE (mkName "True"))) 
                       (varE (mkName "&&")) 
                       (Just (conE (mkName "False")))
    

    You could also have an ExpQ that represents an integer expression:

    intExpr :: ExpQ
    intExpr = [| 2 + length "abc" |]
    

    Note that boolExpr and intExpr have the same type, ExpQ, even though they represent underlying Haskell expressions with different types (namely, Bool and Int). This is because Template Haskell provides an untyped representation of typed Haskell expressions.

    It can be helpful to work with a typed representation of Haskell expressions, where the type of the represented expression is "carried around" in the type of the representation. An easy way to do this is to define a newtype synonym for the ExpQ type that takes a so-called "phantom" type parameter:

    newtype C a = C ExpQ
    

    This defines a new type (or actually a whole collection of types, C Int, C Bool, etc.) whose values are just ExpQ values wrapped in a C constructor. This means that we can define:

    typedBoolExpr :: C Bool
    typedBoolExpr = C [| True && False |]
    
    typedIntExpr :: C Int
    typedIntExpr = C [| 2 + length "abc" |]
    

    where the ExpQ representations of expressions are annotated with the types of the underlying expression. Of course, nothing actually stops us from writing:

    badlyTypedBoolExpr :: C Bool
    badlyTypedboolExpr = C [| 2 + length "abc" |]
    

    where the actual expression type doesn't match the type annotation, so the compiler isn't enforcing the correctness of our type annotation, but if we use a little self-discipline to enforce the correctness ourselves at a few key points, the type system can help us keep the rest of our code base correct.

    With that background, we can tackle your questions:

    1. What is the significance of unC (C x) = x ?

      Nothing, really. It's just a field accessor that let's you pull the C constructor off a type-annotated C SomeType value to get the underlying untyped ExpQ back. It could have been more concisely written:

      newtype C a = C { unC :: ExpQ }
      

      It looks like unC isn't even used in the code snippet you posted.

    2. Why are the C's followed by t and a in:

      clift1 :: ExpQ -> C t -> C a
      

      The purpose of clift1 is to take an untyped representation of a function expression and a typed representation of an argument expression and return a typed representation of the expression representing the application of the first argument to the second argument. For example:

      foo = clift1 (varE (mkName "not")) typedBoolExpr
      

      Note that, in this example, the desired type annotation is Bool, because that's the type of the underlying expression not (True && False). However, in general, the result of applying a function to an argument will be some other type. So, clift1 is prepared to take an untyped representation of a function expression (suppose it actually has type t -> a), apply it to a typed representation of an argument expression of any caller-specified type t, and generate a resulting typed representation of the application expression of any caller-specified type a.

      Since the function expression's type isn't specified anywhere, the types t and a are arbitrary as far as clift1 is concerned. We need to hope that where clift1 is used, it's used in a type-sensible way, for example by attaching a type-correct type signature to the functions that use clift1:

      myNot :: C Bool -> C Bool
      myNot = clift1 (varE (mkName "not"))
      
    3. Is g an ExpQ? What do f and tx signify?

      Most TH processing takes place in the Q monad. It allows you to do things like report errors during compilation, generate unique names, and a bunch of other things. The purpose of the do syntax and all these extra variables in clift1 is to construct the application expression within the Q monad. Clearly, clift1 was written by someone who not only had a terrible sense of what constitutes an informative function name, but also by someone unskilled in TH. It can more succinctly be written:

      clift1' :: ExpQ -> C t -> C a
      clift1' f (C x) = C (appE f x)
      

      which makes it clear that it's just constructing a function application expression in the Q monad with appropriate wrapping and unwrapping with the C type-annotation constructor.

    To sum up, clift1 operates in the Q monad to take an untyped representation (i.e., a value of type ExpQ) of a function of actual type t -> a, applies it to a typed representation (i.e., a value of type C t) of an argument of actual type t, and constructs the typed representation (i.e., a value of type C a) of the application of the function to the argument which will have actual type a.