Motivation for limitation on data kind promotion

Could anyone explain or guess the motivation behind the limit on data type promotion discussed in section 7.9.2 of the GHC user guide?

The following restrictions apply to promotion:

We only promote datatypes whose kinds are of the form * -> ... -> * -> *. In particular, we do not promote higher-kinded datatypes such as data Fix f = In (f (Fix f)), or datatypes whose kinds involve promoted types such as Vec :: * -> Nat -> *.

In particular, I am interested in the last bit about promoted types such as Vec :: * -> Nat -> *. Promoting some types like that seems natural. I ran into it pretty quickly while trying to convert one of my libraries to use specific promoted kinds for the various phantom types instead of using kind * for everything, to provide better documentation and such.

Oftentimes the reasons for compiler limitations like these jump out at you with a little thinking, but I'm not seeing this one. So I'm wondering if it comes in the category of "not needed yet, so we didn't build it" or "that's impossible/undecidable/destroys inference."

Solution

An interesting thing happens if you promote types indexed by promoted types. Imagine we build

data Nat = Ze | Su Nat

and then

data Vec :: Nat -> * -> * where
  VNil   :: Vec Ze x
  VCons  :: x -> Vec n x -> Vec (Su n) x

Behind the scenes, the internal types of the constructors represent the instantiated return indices by constraints, as if we had written

data Vec (n :: Nat) (a :: *)
  =            n ~ Ze    => VNil
  | forall k.  n ~ Su k  => VCons a (Vec k a)

Now if we were allowed something like

data Elem :: forall n a. a -> Vec n a -> * where
  Top :: Elem x (VCons x xs)
  Pop :: Elem x xs -> Elem x (VCons y xs)

the translation to internal form would have to be something like

data Elem (x :: a) (zs :: Vec n a)
  = forall (k :: Nat), (xs :: Vec k a).            (n ~ Su k, zs ~ VCons x xs) =>
      Top
  | forall (k :: Nat), (xs :: Vec k s), (y :: a).  (n ~ Su k, zs ~ VCons y xs) =>
      Pop (Elem x xs)

but look at the second constraint in each case! We have

zs :: Vec n a

but

VCons x xs, VCons y xs :: Vec (Su k) a

But in System FC as then defined, equality constraints must have types of the same kind on both sides, so this example is not inconsiderably problematic.

One fix is use the evidence for the first constraint to fix up the second, but then we'd need dependent constraints

(q1 :: n ~ Su k, zs |> q1 ~ VCons x xs)

Another fix is just to allow heterogeneous equations, as I did in dependent type theory fifteen years ago. There will inevitably be equations between things whose kinds are equal in ways which are not syntactically obvious.

It's the latter plan that is currently favoured. As far as I understand, the policy you mention was adopted as a holding position, until the design for a core language with heterogeneous equality (as proposed by Weirich and colleagues) has matured to implementation. We live in interesting times.