Search code examples
haskellghcaesonexistential-type

Does GHC allow newtypes to have existential type variables?


In the GHC manual, section 6.4.6 Existentially quantified data constructors, it says (second to last bullet point on the page), “You can’t use existential quantification for newtype declarations.”

In Data.Aeson.Types (from aeson-2.2.3.0) the following declaration is made:

newtype Parser a = Parser {
      runParser :: forall f r.
                   JSONPath
                -> Failure f r
                -> Success a f r
                -> f r
    }

When I create the same definition locally, stack v3.1.1 with GHC v9.6.6 compiles it without complaint, but flycheck using haskell-stack-ghc reports that “A newtype constructor must not have existential type variables.…[GHC-07525]”

Is this supposed to be allowed or not?


Solution

  • The section of the manual you've linked to is for the ExistentialQuantification extension, but the code you're talking about doesn't actually use that syntax. Rather it's using RankNTypes (technically Rank2Types is the extension that Data.Aeson.Types.Internal actually enables, but these days that's a deprecated synonym).

    Note that the manual's examples of existential types look like this:

    data Foo = forall a. MkFoo a (a -> Bool)
             | Nil
    

    The forall introducing the existential type variable occurs on the right-hand-side, but outside the data constructor MkFoo.

    Whereas the Parser example from Aeson looks like this:

    newtype Parser a = Parser {
          runParser :: forall f r.
                       JSONPath
                    -> Failure f r
                    -> Success a f r
                    -> f r
        }
    

    The forall is not in the same position, outside the constructor. It occurs inside the constructor, within the type of one of the fields (of which there only is one, because it's a newtype).

    Why the difference? Well (again, on the manual page you site) with the existential syntax the type of the MkFoo constructor ends up being:

    MkFoo :: forall a. a -> (a -> Bool) -> Foo
    

    While the type of the Parser constructor is:

    Parser
      :: forall a.
         (forall (f :: * -> *) r. JSONPath -> Failure f r -> Success a f r -> f r)
         -> Parser a
    

    Here the type variables f and r are quantified inside the left side of the outermost arrow (what higher-rank polymorphism means is having type variables quantified within the left side of an arrow). So the thing that's stored in the Parser constructor is a function that is polymorphic over f and r; whoever extracts and calls that function is the one who gets to choose what type to instantiate for f and r. Whoever applied the Parser constructor did not get to choose f and r, they specifically have to supply a function that can work for any choice of f and r.

    Whereas with the existential MkFoo constructor, to apply it you need to choose a specific type for a, and then that choice is hidden away (because a does not appear in the type of the resulting Foo value). Whoever extracts the contents of a Foo does not get to choose a, they specifically have to work for any choice of a.

    (The a type variable in the Parser example is quantified over the whole constructor, but also appears in the final resulting type, so it's neither existential nor higher-rank, just a "normal" type variable)

    So you see that where you choose to place the forall has quite different effects! It's a matter of the scope of the introduced type variables. Existential type variables don't work with newtype. Higher-rank types OTOH pose no problem for newtype, so there's no restriction there.


    Since you asked in a comment, the reason newtype doesn't support existential type variables is that it can't work when the existential variables have typeclass constraints (and existential variables extremely frequently need type class constraints to do anything useful, especially when they can only be used in a single field). Storing a value of an unknown type but knowing that it supports a type class interface requires storing the instance dictionary. Pattern matching on is supposed to provide the dictionary in order to allow calling the type class methods, but the pattern matching code has no knowledge of what the type is in order to go and find an instance dictionary externally, so it must be stored in data type using the existential type variable.

    When you've got a regular data constructor that supports an arbitrary number of fields, the compiler can just add an extra "hidden" field to store the dictionary, no problem. But a newtype is supposed to be represented at runtime identically to its single field; we have no wrapper constructor in which we could add a hidden field (the newtype constructor appears only in the source code, not at runtime; that's basically the whole point). One could theoretically support this by making a newtype with existential type variables actually represented with a runtime constructor after all, but that basically turns it into a regular data type. The compiler authors decided it was better to error out in that case, rather than silently give the user data when they asked for newtype. One could also theoretically impose the restriction on the combination of newtype and existential type variables only when they have constraints, but so far the implementation has been to simply disallow the use of existential type variables with newtype entirely.


    So why is flycheck showing you an error about existential type variables when this code is only using higher-rank polymorphism? I wasn't sure, so I tried a minimal file that would let me compile the definition of Parser:

    data JSONPath
    
    data Failure f r
    
    data Success a f r
    
    newtype Parser a = Parser {
          runParser :: forall f r.
                       JSONPath
                    -> Failure f r
                    -> Success a f r
                    -> f r
        }
    

    And sure enough, I got the newtype constructor must not have existential type variables error!! But look at the full error message more closely:

    GHCi, version 9.6.6: https://www.haskell.org/ghc/  :? for help
    [1 of 2] Compiling Main             ( fails.hs, interpreted )
    
    fails.hs:7:20: error: [GHC-07525]
        • A newtype constructor must not have existential type variables
          Parser :: forall {k1} (a :: k1) {k2}.
                    (forall (f :: k2 -> *) (r :: k2).
                     JSONPath -> Failure f r -> Success a f r -> f r)
                    -> Parser a
        • In the definition of data constructor ‘Parser’
          In the newtype declaration for ‘Parser’
      |
    7 | newtype Parser a = Parser {
      |                    ^^^^^^^^...
    

    Because I never actually use the type parameters of my dummy definitions of Failure and Succcess, GHC 9.6.6 has inferred (with default settings) that a, f, and r are poly-kinded. Which means there are two extra type variables involved in their kinds: k1 and k2. Those type variables have to be implicitly quantified somewhere for that to make sense, and they've ended up in the outer forall, scoped over the constructor's entire type, not just the first field. k1 isn't really a problem as it's the kind of a, so it's implicitly present in the result type Parser a, but k2 is existential, so we get the error.

    It might seem surprising and confusing that GHC has introduced an invalid existential type variable for us, but it's unfortunately to be expected. The compiler cannot in general infer higher rank types for us (it can check higher-rank types that we provide, but not infer them from our code). The code I wrote has an implicit type variable k2, and is ambiguous about where it should be quantified. When GHC infers types the rule is that it always infers non-higher-rank types; the rule is that higher-rank types must always be explicitly provided by the programmer.

    If I provide some kind signatures so that all the type variables have the kinds they would in the full Aeson code, then it compiles, because nothing is poly-kinded so no extra type variables need to be inferred and quantified:

    import Data.Kind (Type)
    
    type JSONPath :: Type
    data JSONPath
    
    type Failure :: (Type -> Type) -> Type -> Type
    data Failure f r
    
    type Success :: Type -> (Type -> Type) -> Type -> Type
    data Success a f r
    
    newtype Parser a = Parser {
          runParser :: forall f r.
                       JSONPath
                    -> Failure f r
                    -> Success a f r
                    -> f r
        }
    

    So I can't say exactly why you're getting errors in flycheck but not when you compile yourself, but I can see how it's possible to get the type error you report from this code.

    It might depend on what exactly you're compiling (are you literally setting up a project environment for the full aeason package, or have you extracted a small example like mine? If the latter how did you define the dummy types; it will affect the inferred kinds in Parser). This behaviour is sensative to what extensions you have on (e.g. if the compiler doesn't decide to infer poly kinds for f and r then you won't get the error, but if it does then you will), and therefore to what version of GHC you're running (e.g. slightly older versions will be using Haskell2010 as the default language rather than GHC2021, and so have a different set of extensions enabled). Is haskell-stack-ghc that you say flycheck is running definitely the same version of GHC that you're using when you manually test? Does flycheck pick up the correct extensions (I haven't used emacs recently, but I remember at times having issues with editor tooling missing extensions enabled in my .cabal file rather than each individual file's header, for example). Those are the kinds of things I'd be checking.