Consider a type of terms parameterized over a type of function symbols node
and a type of variables var
:
data Term node var
= VarTerm !var
| FunTerm !node !(Vector (Term node var))
deriving (Eq, Ord, Show)
instance Functor (Term node) where
fmap f (VarTerm var) = VarTerm (f var)
fmap f (FunTerm n cs) = FunTerm n (Vector.map (fmap f) cs)
instance Monad (Term node) where
pure = VarTerm
join (VarTerm term) = term
join (FunTerm n cs) = FunTerm n (Vector.map join cs)
This is a useful type, since we encode open terms with Term node Var
,
closed terms with Term node Void
, and contexts with Term node ()
.
The goal is to define a type of substitutions on Term
s in the most pleasant
possible way. Here's a first stab:
newtype Substitution (node ∷ Type) (var ∷ Type)
= Substitution { fromSubstitution ∷ Map var (Term node var) }
deriving (Eq, Ord, Show)
Now let's define some auxiliary values related to Substitution
:
subst ∷ Substitution node var → Term node var → Term node var
subst s (VarTerm var) = fromMaybe (MkVarTerm var)
(Map.lookup var (fromSubstitution s))
subst s (FunTerm n ts) = FunTerm n (Vector.map (subst s) ts)
identity ∷ Substitution node var
identity = Substitution Map.empty
-- Laws:
--
-- 1. Unitality:
-- ∀ s ∷ Substitution n v → s ≡ (s ∘ identity) ≡ (identity ∘ s)
-- 2. Associativity:
-- ∀ a, b, c ∷ Substitution n v → ((a ∘ b) ∘ c) ≡ (a ∘ (b ∘ c))
-- 3. Monoid action:
-- ∀ x, y ∷ Substitution n v → subst (y ∘ x) ≡ (subst y . subst x)
(∘) ∷ (Ord var)
⇒ Substitution node var
→ Substitution node var
→ Substitution node var
s1 ∘ s2 = Substitution
(Map.unionWith
(λ _ _ → error "this should never happen")
(Map.map (subst s1) (fromSubstitution s2))
((fromSubstitution s1) `Map.difference` (fromSubstitution s2)))
Clearly, (Substitution n v, ∘, identity)
is a monoid (ignoring the Ord
constraint on ∘
) and (Term n v, subst)
is a monoid action of
Substitution n v
.
Now suppose that we want to make this scheme encode substitutions
that change the variable type. This would look like some type SubstCat
that satisfies the module signature below:
data SubstCat (node ∷ Type) (domainVar ∷ Type) (codomainVar ∷ Type)
= … ∷ Type
subst ∷ SubstCat node dom cod → Term node dom → Term node cod
identity ∷ SubstCat node var var
(∘) ∷ (Ord v1, Ord v2, Ord v3)
→ SubstCat node v2 v3
→ SubstCat node v1 v2
→ SubstCat node v1 v3
This is almost a Haskell Category
, but for the Ord
constraints on ∘
.
You might think that if (Substitution n v, ∘, identity)
was a monoid before,
and subst
was a monoid action before, then subst
should now be a category
action, but in point of fact category actions are just functors (in this case,
a functor from a subcategory of Hask to another subcategory of Hask).
Now there are some properties we'd hope would be true about SubstCat
:
SubstCat node var Void
should be the type of ground substitutions.SubstCat Void var var
should be the type of flat substitutions.instance (Eq node, Eq dom, Eq cod) ⇒ Eq (SubstCat node dom cod)
should exist (as well as similar instances for Ord
and Show
).SubstCat node dom cod
.Substitution
implementation above.The simplest possible approach to writing SubstCat
would be to simply
generalize Substitution
:
newtype SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type)
= SubstCat { fromSubstCat ∷ Map dom (Term node cod) }
deriving (Eq, Ord, Show)
Unfortunately, this does not work because when we run subst
it may be the
case that the term we are running substitution on contains variables that are
not in the domain of the Map
. We could get away with this in Substitution
since dom ~ cod
, but in SubstCat
we have no way to deal with these
variables.
My next attempt was to deal with this issue by also including a function of
type dom → cod
:
data SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type)
= SubstCat
!(Map dom (Term node cod))
!(dom → cod)
This causes a few problems, however. Firstly, since SubstCat
now contains a
function, it can no longer have Eq
, Ord
, or Show
instances. We cannot
simply ignore the dom → cod
field when comparing for equality, since the
semantics of substitution change depending on its value. Secondly, it is now
no longer the case that SubstCat node var Void
represents a type of ground
substitutions; in fact, SubstCat node var Void
is uninhabited!
Érdi Gergő suggested on Facebook that I use the following definition:
newtype SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type)
= SubstCat (dom → Term node cod)
This is certainly a fascinating prospect. There is an obvious category for this
type: the Kleisli category given by the Monad
instance on Term node
. I am
not sure if this actually corresponds to the usual notion of substitution
composition, however. Unfortunately, this representation cannot have instances
for Eq
et al. and I suspect it could be very inefficient in practice, since
in the best case it will end up being a tower of closures of height Θ(n)
,
where n
is the number of insertions. It also doesn't allow computation of the
domain variable set.
Is there a sensible type for SubstCat
that fits my requirements? Can you prove
that one does not exist? If I give up having correct instances of Eq
, Ord
,
and Show
, is it possible?
There is an obvious category for this type: the Kleisli category given by the Monad instance on Term node. I am not sure if this actually corresponds to the usual notion of substitution composition, however.
It does correspond to that, and it's an all-around correct definition of parallel substitution for well-scoped terms. You may note that this substitution is total; this is a requirement if you want term substitution to be total, i. e. that it's a functor from the category of substitutions (Subst) to Set. For a simple example for why partial substitutions don't work, if you have an empty SubstCat node () Void
, then you would have to invent arbitrary closed terms when hitting VarTerm ()
in subst
- and closed terms don't even exist if you choose Void
for node
.
Hence, if you have Map dom (Term node cod)
then you have partial term substitution, i. e. a functor from Subst to the Kleisli category for Maybe
(disregarding for now formal complications from the Ord
constraints):
type Subst node i j = Map i (Term node j)
subst :: Ord i => Subst node i j -> Term node i -> Maybe (Term node j)
subst sub (VarTerm x) = Map.lookup x sub
subst sub (FunTerm f ts) = FunTerm f <$> traverse (subst sub) ts
Now, total term substitution along with your 1-5 desiderata for SubstCat
is possible, but not with the current kinds for SubstCat
and Term
. As I mentioned, in this case substitutions must be total, but currently we can just have SubstCat node dom cod
with some infinite dom
, which makes equality of substitutions undecidable.
So let's switch to a more precise formalization, which contains only what we care about here. We have well-scoped untyped terms, and we assume that terms are finite and live in finite variable contexts.
Untyped terms imply that only sizes of variable contexts matter. So, Subst has natural numbers as objects:
data Nat = Z | S Nat deriving (Eq, Show)
Terms are indexed by n :: Nat
, containing n
possible values for variables:
-- {-# language TypeInType, GADTs, TypeFamilies, ScopedTypeVariables,
-- FlexibleInstances, MultiParamTypeClasses, RankNTypes,
-- TypeApplications, StandaloneDeriving #-}
data Fin n where
FZ :: Fin (S n)
FS :: Fin n -> Fin (S n)
deriving instance Eq (Fin n)
deriving instance Ord (Fin n)
deriving instance Show (Fin n)
data Term node (n :: Nat) where
VarTerm :: !(Fin n) -> Term node n
FunTerm :: !node -> !(Vector (Term node n)) -> Term node n
deriving instance Eq node => Eq (Term node n)
deriving instance Ord node => Ord (Term node n)
deriving instance Show node => Show (Term node n)
Substitutions (morphisms) are vectors of terms:
data Vec a n where
Nil :: Vec a Z
(:>) :: a -> Vec a n -> Vec a (S n)
infixr 5 :>
deriving instance Eq a => Eq (Vec a n)
deriving instance Ord a => Ord (Vec a n)
deriving instance Show a => Show (Vec a n)
type Subst node i j = Vec (Term node j) i
Term substitution is as follows:
subst :: Subst node i j -> Term node i -> Term node j
subst (t :> _ ) (VarTerm FZ) = t
subst (_ :> ts) (VarTerm (FS x)) = subst ts (VarTerm x)
subst sub (FunTerm f ts) = FunTerm f (Vector.map (subst sub) ts)
Composition is just pointwise subst
:
comp :: Subst node i j -> Subst node j k -> Subst node i k
comp Nil sub2 = Nil
comp (t :> ts) sub2 = subst sub2 t :> comp ts sub2
Identity substitution is not so easy. We need to dispatch on a type-level Nat
. For this, we use here a type class; singletons
would be a heavier but more principled solution. The implementation itself is also a bit mind-bending. It makes use of the fact that Subst node n m
is isomorphic to (Fin n -> Term node m)
. In fact, we could just use the functional representation all the time, but then the Eq, Ord and Show
instances would still effectively need the backwards conversion, and we would not have the spacetime boundedness guarantees of (strict) vectors.
class Tabulate n where
tabulate :: (Fin n -> Term node m) -> Subst node n m
instance Tabulate Z where
tabulate f = Nil
instance Tabulate n => Tabulate (S n) where
tabulate f = f FZ :> tabulate (f . FS)
idSubst :: Tabulate n => Subst node n n
idSubst = tabulate VarTerm
That's it! Proofs of category laws for Subst
and functor laws for subst
are left as exercise.
PS: I note that in literature and formal Agda/Coq developments, Nat
indices of Subst
are usually in swapped order, and subst
has contravariant action.