# Michael Shulman categories of variables

Categories of variables

# Categories of variables

## Categories of variables

### Definition

A category of variables consists of:

• A full subcategory $\mathbb{F}$ of FinSet,
• $\emptyset\in\mathbb{F}$.
• For each $V\in \mathbb{F}$, an inhabited set of fresh extensions, each of which is an injection $i:V\hookrightarrow W$ in $\mathbb{F}$ whose complement has cardinality $1$ (“an inclusion of cocardinality 1”). Given a fresh extension $i:V\to W$, we will refer to the unique element $x\in W\setminus i(V)$ as the fresh name.
• For each function $\sigma:V\to U$ in $\mathbb{F}$ and fresh extension $i:V\to W$, a specified pushout, called the fresh pushout, consisting of a function $\tau:W\to Z$ and a fresh extension $j:U\to Z$.
• For any injection $\sigma:U\hookrightarrow V$ in $\mathbb{F}$ of cocardinality 1 (not necessarily a fresh extension), and any fresh extension $i:V\to W$, a specified pushout complement, called the fresh pushout complement, consisting of a fresh extension $j:U\to Z$ and a function $\tau:Z\to W$ (necessarily also an injection of cocardinality 1).

### Remarks

From the second and third axioms, it follows by mathematical induction that $\mathbb{F}$ contains at least one set of each finite cardinality; thus the inclusion functor $\mathbb{F}\hookrightarrow FinSet$ is an equivalence of categories.

The intuition is that each object of $\mathbb{F}$ is a possible “set of free variables” that may occur in a term. In a moment we will define the set of “terms with free variables in $V$” for any $V\in \mathbb{F}$. The fresh extensions are the allowable ways to “add a new free variable”, such as when extending a context or passing under a binder. In particular, this means that in practice the only objects of $\mathbb{F}$ that will be relevant are those obtainable from $\emptyset$ by iterated fresh extension.

Fresh pushouts tell us how to transform “new free variables” under structural rules such as weakening, exchange, and contraction applied to the previous free variables. Similarly, fresh pushout complements tell us how to transform “new free variables” under substitution for a previously free variable. This will become more clear later. Note that just as $FinSet$ has all pushouts, it also has all pushout complements of (complemented) injections; thus the content of the fresh pushouts and fresh pushout complements is not that they exist, but that they are specified and that the morphism across from the given fresh extension is also a fresh extension.

### Examples

First we have the tautological example.

###### Example

Let $\mathbb{F}=FinSet$, let every injection with cocardinality 1 be a fresh extension, and choose pushouts and pushout complements arbitrarily.

Now a couple of examples using “named variables”. For both of them, let $Name$ be any infinite set (the “variable names”) equipped with an operation $fr : P_{fin}(Name) \times Name \to Name$ such that $fr(V,x) \neq x$ and $fr(V,x)\notin V$, i.e. a way to “freshen” a given variable name with respect to a given finite set of existing variables.

###### Example

Let $\mathbb{F}$ consist of all finite subsets of $Name$. The fresh extensions are the literal subset inclusions (not mere injections) with cocardinality 1.

This category of variables roughly represents the use of named variables with a “local Barendregt convention” that the name of any bound variable differs from all other free variables in scope at the point of binding (though distinct bound variables with non-overlapping scopes might still have the same name). The issue in this approach is to ensure that the convention is maintained under weakening: adding a new free variable might force us to rename a bound variable to avoid conflict.

In our abstract setting, this renaming is accomplished by the definition of fresh pushout. Given a fresh extension $i:V\to W$ with fresh name $x$ and a function $\sigma:V\to U$, if $x\notin U$ then the fresh pushout has vertex $U\cup \{x\}$, while if $x\in U$ then it has vertex $U \cup \{fr(U,x)\}$. The definition of fresh pushout complement is similar.

###### Example

Let $\mathbb{F}$ consist again of all finite subsets of $Name$, but now let the fresh extensions of $V$ consist of the subset inclusions with cocardinality 1 together with the maps of the form

$V = (V\setminus \{x\}) \cup \{x\} \cong (V\setminus \{x\}) \cup \{fr(V\setminus\{x\},x)\} \hookrightarrow (V\setminus \{x\}) \cup \{fr(V\setminus\{x\},x)\} \cup \{x\}$

for any $x\in V$.

This category of variables roughly represents a different way of using of named variables: a bound variable is allowed to have the same name as a variable already in the context, but instead of literally “shadowing” the latter, the latter is given a modified name (such as $x_1$ or $outer.x$) only in the scope of the new binder. The fresh extensions that are not subset inclusions “carry along the information” about this renaming, so that in particular when substituting inside the scope of that binder any references to the outer free variable can automatically be suitably renamed (to avoid variable capture).

In this example, given a fresh extension $i:V\to W$ with fresh name $x$ and a function $\sigma:V\to U$, if $x\notin U$ then the new fresh extension is the subset inclusion $U\hookrightarrow U\cup \{x\}$, while if $x\in U$ then it is of the other form determined by $U$ and $x$. The definition of fresh pushout complements is similar.

Inspecting these examples and comparing them to the general definition, we can conclude that in general a fresh extension can include information about how both a variable and a given finite set of variables must be renamed in order to make the former fresh for the latter. The same point is made by the de Bruijn examples:

###### Example

Let $\mathbb{F}$ consist of all sets $[n] = \{1,\dots,n\}$, with the unique fresh extension of $[n]$ being $(\lambda m.m+1) : [n] \to [n+1]$ (so the fresh name is always $\{1\}$). Since every object has a unique fresh extension, fresh pushouts and fresh pushout complements are uniquely determined. This corresponds to de Bruijn indices: newly bound variables are denoted $1$, with all existing variable numbers incremented. The fresh extensions $(\lambda m.m+1)$ again carry along the information of how free variables in terms being substituted must be modified when passing under a binder.

###### Example

Let $\mathbb{F}$ consist again of all sets $[n] = \{1,\dots,n\}$, but now the unique fresh extension of $[n]$ is the inclusion $[n] \hookrightarrow [n+1]$ (for which the fresh name is $\{n+1\}$). Again, since every object has a unique fresh extension, fresh pushouts and fresh pushout complements are uniquely determined. This corresponds to “de Bruijn levels”.

### Iterated fresh extensions

We refer to a composable sequence of fresh extensions $V \xrightarrow{i^1} V^1 \to \cdots \xrightarrow{i^n} V^n$ as an $n$-ary fresh extension $i:V\to V^n$. (Note that an $n$-ary fresh extension $i$ is by definition the list $(i^1,\dots,i^n)$, not its composite $(i^n\circ\cdots\circ i^1) :V\to V^n$, although we will sometimes treat $i \mapsto (i^n\circ\cdots\circ i^1)$ as an implicit coercion.) Since pushout squares compose, we can say that every $n$-ary fresh extension has a fresh pushout along any function which is again an $n$-ary fresh extension, and similarly any $n$-ary fresh extension has a fresh pushout complement relative to any injection of cocardinality 1 which is again an $n$-ary fresh extension.

## Basic definitions

### Sorts and operators

We will work at the generality of the abstract binding trees of Harper (Practical foundations of programming languages). Thus, we suppose given a set $Sort$ of sorts, and also a set $Op$ of operators. Moreover, each operator is assigned a signature or generalized arity of the following form:

$((s_{1}^{1},\dots,s_{1}^{n_1}; s_1),\dots, (s_{m}^{1},\dots,s_{m}^{n_m}; s_m); s)$

where each $s,s_k,s_{k}^{l}$ is a sort. The meaning is that such a operator is of sort $s$ and takes $m$ arguments, where the $k^{th}$ argument binds $n_k$ arguments with sorts $s_{k}^{1},\dots, s_{k}^{n_i}$ and has sort $s_k$. Thus we would write an application of it as

$\mathtt{C}(x_{1}^{1}\dots x_{1}^{n_1}. M_1, \dots , x_{m}^{1}\dots x_{m}^{n_m}. M_m)$

Put differently, if we regard the sorts as the base types (zeroth-order types) in a simply typed lambda-calculus, then each signature is a second-order type: an iterated function type whose arguments are themselves iterated function types whose arguments are base types.

For instance, to represent the raw syntax of dependent type theory (with fully annotated terms), there would be two sorts $tm$ and $ty$, with operators such as $\Pi : ((;ty),(tm;ty);ty)$ and $\lambda : ((;ty),(tm;ty);(tm;tm);tm)$.

### Abstract binding trees

A sort context $E$ consists of an object $V\in \mathbb{F}$ and a function $E:V\to Sort$. We define by mutual induction sets $Tm_E(s)$ of terms, for a sort context $E$ and a sort $s$. The clauses are:

• If $x\in V$ and $E(x)=s$, then $x\in Tm_E(s)$.

• Let $\mathtt{C}$ be a operator of signature $((s_{1}^{1},\dots,s_{1}^{n_1}; s_1),\dots, (s_{m}^{1},\dots,s_{m}^{n_m}; s_m); s)$, and let $E:V\to Sort$ be a sort context. For each $1\le k\le m$, choose an $n_k$-ary fresh extension $i_k : V\to V_k^{n_k} = V_k$, and extend $E$ to $E_k : V_{k}\to Sort$ by assigning to the fresh name of each $i^l_k$ the sort $s_{k}^{l}$. Suppose furthermore we have $M_k \in Tm_{E_k}(s_k)$. Then we have a new term $\mathtt{C}(i_1.M_1,\dots,i_m.M_m) \in Tm_E(s)$.

### Structural action

Given any $E:V\to Sort$ and $s\in Sort$, we define for any $\sigma:V\to U$ and $F:U\to Sort$ with $E = F\circ \sigma$ an action

$\array{ Tm_{E}(s) & \to & Tm_{F}(s) \\ M & \mapsto & M[\sigma] }$

by recursion on $M$. (Note that the notation is technically ambiguous as the context $F$ is not specified. If $\sigma$ is surjective, then $F$ is uniquely determined by $E$, but in general this need not be.)

• If $x\in V$ with $E(x)=s$, then also $F(\sigma(x)) = E(x)=s$, so we can set $x[\sigma] = \sigma(x)$.

• Given $\mathtt{C}(i_1.M_1,\dots,i_m.M_m) \in Tm_E(s)$ as above, for each $k$ we choose an $n_k$-ary fresh pushout consisting of an $n_k$-ary fresh extension $j_k : U \to U_k$ and a function $\tau_k:V_k\to U_k$, and define $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)[\sigma] = \mathtt{C}(j_1.M_1[\tau_1],\dots,j_m.M_m[\tau_{m}])$. Here we extend the context $F:U\to Sort$ to $U_{k}$ using the same sorts $s_{k}^{l}$ in the signature of $\mathtt{C}$ that extend $E$ to $V_{k}$.

It is straightforward to prove by induction that $M[id_V] = M$ for any $M$. However, it will only be symmetric and transitive up to $\alpha$-equivalence.

### $\alpha$-equivalence

We usually think of $\alpha$-equivalence as a “homogeneous” binary relation $\sim$ on each set $Tm_E(s)$. However, the inductions are much cleaner if we first define it in “heterogeneous” form, as a family of relations $\sim_\rho$ between $Tm_E(s)$ and $Tm_{E'}(s)$ for any bijection $\rho:V\cong V'$ such that $E' \circ \rho = E$. These relations are defined inductively as follows.

• For any $x\in V$ with $E(x)=s$, we have $x \sim_\rho \rho(x)$. (Note that by assumptions on $\rho$, it is equivalent to say that for any $y\in V'$ with $E'(y)=s$ we have $\rho^{-1}(y)\sim_\rho y$.)

• Suppose given $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)$ in $Tm_E(s)$ and $\mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m)$ in $Tm_{E'}(s)$. Note that if $i:V\to W$ and $i':V'\to W'$ are two fresh extensions and $\rho:V\cong V'$ is a bijection, then there is a unique bijection $\rho':W\to W'$ such that $\rho' \circ i = i'\circ \rho$ and $\rho'$ preserves the fresh names. Applying this repeatedly for each $k$, we obtain a sequence of bijections $\rho^l_k$ starting with $\rho^0_k = \rho : V \cong V'$ and ending with $\rho_k = \rho^{n_k}_k : V_{k} \cong V'_{k}$. If $M_k \sim_{\rho_k} M'_k$, then we assert $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)\sim \mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m)$.

I conjecture that if the inductive family $Tm$ is defined in homotopy type theory where $\mathbb{F}$ is a univalent category, then this definition of $\alpha$-equivalence coincides with their ordinary (dependent) equality types.

It is straightforward to prove that each homogeneous relation $\sim_{id}$ is reflexive, and they are “congruences” for all the operators essentially by definition. Note that in general, the definition of $\sim_{id}$ at operators with binding involves $\sim_\rho$ for non-identity $\rho$. However, this is not the case in the de Bruijn examples where every object has a unique fresh extension, and indeed in this case it is not hard to see that $\sim_{id}$ reduces to simple equality, as expected.

We now prove successively that they are congruences for the structural rules, are functorial, and are equivalence relations.

## Basic properties

### $\alpha$-equivalence is symmetric

We prove by induction on $\sim$ that if $M\sim_\rho M'$ then $M'\sim_{\rho^{-1}} M$.

• In the case $x\sim_\rho \rho(x)$, we have $\rho(x) \sim_{\rho^{-1}} \rho^{-1}(\rho(x)) = x$.

• Suppose $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)\sim_{\rho} \mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m)$ because $M_k \sim_{\rho_{k}} M'_k$ for all $k$, as in the definition. Then the inverses $(\rho^{-1})_{k}^l$ are the comparison maps in the other direction from $i'_{k}$ to $i_{k}$, and by the inductive hypothesis we have $M'_k \sim_{\rho^{-1}_k} M_k$ for all $k$, which is what we need to conclude $\mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m) \sim_{\rho^{-1}} \mathtt{C}(i_1.M_1,\dots,i_m.M_m)$

### $\alpha$-equivalence is transitive

We prove by induction on $\sim$ that if $M\sim_\rho M'$ and $M' \sim_{\rho'} M''$ then $M\sim_{\rho'\circ \rho} M''$.

• In the case $x\sim_\rho \rho(x)$ and $\rho(x) \sim_{\rho'} \rho'(\rho(x))$, we have $x \sim_{\rho'\circ \rho} (\rho'\circ \rho)(x) = \rho'(\rho(x))$.

• Suppose $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)\sim_{\rho} \mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m)$ because $M_k \sim_{\rho_{k}} M'_k$ for all $k$, and similarly $\mathtt{C}(i'_1.M'_1,\dots,i'_m.M'_m)\sim_{\rho'} \mathtt{C}(i''_1,M''_1,\dots,i''_m,M''_m)$ because $M'_k \sim_{\rho'_{k}} M''_k$ for all $k$. Then by the inductive hypothesis, we have $M_k \sim_{\rho'_k \circ \rho_k} M''_k$, and the $\rho'_k^l \circ \rho_k^l$ are the comparison maps for $\rho'\circ\rho$. Thus this is what we need to conclude $\mathtt{C}(i_1,M_1,\dots,i_m,M_m) \sim_{\rho'\circ \rho} \mathtt{C}(i''_1.M''_1,\dots,i''_m.M''_m)$.

### Structural actions respect $\alpha$-equivalence

Suppose given a commutative square $\sigma' \circ \rho = \rho'\circ \sigma$, where $\rho,\rho'$ are bijections. We prove by induction on $\sim$ that if $M\sim_{\rho} M'$ then $M[\sigma]\sim_{\rho'} M'[\sigma]$.

• In the case $x\sim_\rho \rho(x)$, we have $\rho'(\sigma(x)) = \sigma'(\rho(x))$, hence $x[\sigma] = \sigma(x) \sim_{\rho'} \sigma'(\rho(x)) = (\rho(x))[\sigma']$.

• Suppose $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)\sim_{\rho} \mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m)$ because $M_k \sim_{\rho_{k}} M'_k$ for all $k$, as in the definition. The $n_k$-ary fresh pushouts of each $i_k$ along $\sigma$ and each $i'_k$ along $\sigma'$ give us $n_k$-ary fresh extensions $j_k$ and $j'_k$ and functions $\tau_k$ and $\tau'_k$. Now by pasting of pushouts we can uniquely fill in the sequence of bijections $\rho'_k^l$ ending in $\rho'_k = \rho'_k^{n_k}$. By the inductive hypothesis, we have $M_k[\tau_k] \sim_{\rho'_k} M'_k[\tau'_k]$ for each $k$, which by definition is what we need to ensure $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)[\sigma] \sim_{\rho} \mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m)[\sigma']$.

### Structural actions are functorial

Finally, we prove by induction on $M$ that whenever $\sigma' \circ \sigma = \rho \circ \sigma''$, with $\rho$ a bijection, we have $M[\sigma''] \sim_\rho M[\sigma][\sigma']$.

• Given $x$, we have $x[\sigma''] = \sigma''(x) \sim_\rho \rho(\sigma''(x)) = \sigma'(\sigma(x)) = x[\sigma][\sigma']$.

• Given $\mathtt{C}(i_1.M_1,\dots,i_m.M_m) \in Tm_E(s)$ for $E:V\to Sort$, for any $k$ we have the $n_k$-ary fresh pushout $(j_k,\tau_k)$ of $(i_k,\sigma)$, the $n_k$-ary fresh pushout $(j'_k,\tau'_k)$ of $(j_k,\sigma')$, and also the $n_k$-ary fresh pushout $(j''_k,\tau''_k)$ of $(i_k,\sigma'')$. By the universal properties of pushouts, the bijection $\rho$ extends to a unique sequence of bijections $\rho_k$ between the sequences $j'_k$ and $j''_k$ that commute with everything, which are also the unique comparison map between the $n_k$-ary fresh extensions $j'_k$ and $j''_k$ induced by $\rho$ as in the definition of $\sim$. Thus, the inductive hypothesis ensures $M_k[\tau''_k] \sim_{\rho_k} M_k'[\tau_k][\tau'_k]$ for all $k$, which is what we need to conclude $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)[\sigma''] \sim_\rho \mathtt{C}(i_1.M_1,\dots,i_m.M_m)[\sigma][\sigma']$.

### Conclusion: functoriality and congruence

It now follows that the relations $\sim_{id}$ are equivalence relations and congruences, and if we quotient the sets $Tm_E(s)$ by these equivalence relations, the actions $[\sigma]$ become functorial. However, we will not actually perform this quotienting at this stage; it seems to cause more trouble (see “general judgments” below) to do so prematurely here.

It does also follow that the heterogeneous $\alpha$-equivalence is reducible to the homogeneous one: we have $M \sim_\rho M'$ if and only if $M[\rho] \sim_{id} M'$.

## Substitution

Let $\sigma:U\to V$ be an injection of cocardinality 1 with $V\setminus \sigma(U) = \{x\}$, let $E:V\to Sort$ and $M\in Tm_E(s)$, and let $N\in Tm_{E\circ\sigma}(E(x))$. We define the substitution $M[N/\sigma] \in Tm_{E\circ \sigma}(s)$ by induction on $M$.

• If $M = x$, then $M[N/\sigma] = N$, which is well-typed since then $s = E(x)$.

• If $M = y \in V\setminus \{x\} = \sigma(U)$, then $y = \sigma(z)$ for a unique $z$, and we define $M[N/\sigma] = z$.

• If $M = \mathtt{C}(i_1.M_1,\dots,i_m.M_m)$, for each $k$ we have the $n_k$-ary fresh pushout complement of $\sigma$ and $i_k$ consisting of an $n_k$-ary fresh extension $j_k: U\to U_k$ and a function $\tau_k : U_k \to V_k$ of cocardinality 1. The signature of $\mathtt{C}$ extends $E$ to $E_k : V_k \to Sort$, which thereby restricts to $E_k \circ \tau_k : U_k \to Sort$ which extends $E\circ \sigma$ along $j_k$. Thus, we can apply the functorial structural rules to get $N[j_k] \in Tm_{E_k\circ \tau_k}(E_k(i_k(x)))$, while by definition we have $M_k \in Tm_{E_k}(s_k)$. So the inductive hypothesis gives us $M_k[N[j_k]/\tau_k] \in Tm_{E_k\circ \tau_k}(s_k)$, which we can then use to define $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)[N/\sigma]$ to be $\mathtt{C}(j_1.M_1[N[j_1]/\tau_1],\dots,j_m.M_m[N[j_m]/\tau_m])$.

Note how it is impossible for any variables to be “captured”, because the variables in $N$ get “renamed” (if necessary) by the weakenings $[j_k]$. Recall that in Examples and the fresh extensions carry the information of how the variables in the context have to appear with different names inside a binder; while in Examples and where the fresh extensions are subset inclusions, the bound variables are always distinct from those in the context. The categorical and strongly-typed framework frees us from having to think about which of these situations we are in and prevents us from accidentally making a wrong definition of substitution that would capture variables: $N$ itself does not have the correct variable-set and sort-context to be substituted into $M_k$; we must weaken it by $j_k$ first.

### Substitution preserves $\alpha$-equivalence

Suppose given a commutative square $\sigma'\circ \rho' = \rho\circ \sigma$, where $\sigma,\sigma'$ are injections of cocardinality 1 and $\rho,\rho'$ are bijections, with sort-contexts $E,E'$ with $E'\circ \rho = E$, terms $M \sim_\rho M'$, and $N\sim_{\rho'} N'$. We prove by induction (on $M \sim_\rho M'$) that $M[N/\sigma] \sim_{\rho'} M'[N'/\sigma']$.

• Note that if $V\setminus \sigma(U) = \{x\}$ and $V'\setminus \sigma'(U') = \{x'\}$, we must have $\rho(x) = x'$. Thus, in the case $M=x$ we must have $M'=x'$, hence $M[N/\sigma] = N$ and $M'[N'/\sigma] = N'$, so we are done by the assumption $N\sim_{\rho'} N'$.

• In the case of other variables $y\sim_\rho \rho(y)$, we have $y = \sigma(z)$ and $\rho(y) = \sigma'(\rho'(z))$, so we conclude by $z\sim_{\rho'} \rho'(z)$.

• Suppose $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)\sim_{\rho} \mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m)$ because $M_k \sim_{\rho_{k}} M'_k$ for all $k$. Because pushout complements (of monomorphisms in an adhesive category) are unique up to unique isomorphism, the isomorphisms $\rho,\rho',\rho_k$ induce isomorphisms $\rho'_k$. The compatibility of structural actions with $\alpha$-equivalence implies that $N[j_k] \sim_{\rho'_k} N'[j'_k]$, so the inductive hypothesis yields $M_k[N[j_k]/\tau_k] \sim_{\rho'_k} M'_k[N'[j'_k]/\tau'_k]$, whence $\mathtt{C}(i_1.M_1,\dots,i_m.M_m)[N/\sigma] \sim_{\rho'} \mathtt{C}(i'_1,M'_1,\dots,i'_m,M'_m)[N'/\sigma']$ by definition.

## Dependent type theory

Let us specialize to the case of dependent type theory, with two sorts $tm$ and $ty$. Note that $tm$ is a sort, while $Tm_E(s)$ is the set of abstract binding trees of sort $s$; thus the “raw terms” of DTT are $Tm_E(tm)$ and the “raw types” are $Tm_E(ty)$. Since we do not generally hypothesize types, we consider only sort-contexts that assign the sort $tm$ to every variable; we henceforth omit to mention these sort-contexts. Thus we speak of raw terms with variables $V$ and raw types with variables $V$.

### Contexts

We define the raw contexts with variables $V$, for any $V\in \mathbb{F}$, inductively as follows.

• There is a unique raw context with variables $\emptyset$, the empty context $\lozenge$.

• If $\Gamma$ is a raw context with variables $V$ and $A$ is a raw type with variables $V$, and $i:V\to W$ is a fresh extension, then $\Gamma, i:A$ is a raw context with variables $W$.

Similarly, if $\rho : V\cong V'$ is a bijection and $\Gamma,\Gamma'$ are raw contexts with variables $V,V'$ respectively, we define $\Gamma \sim_\rho \Gamma'$ inductively:

• $\lozenge \sim_{id_\emptyset} \lozenge$

• If $\Gamma \sim_\rho \Gamma$ and $A\sim_\rho A'$, and $\rho':W\cong W'$ is the unique bijection induced by $\rho:V\cong V'$ and the fresh extensions $i:V\to W$ and $i':V'\to W'$, then $(\Gamma,i:A) \sim_{\rho'} (\Gamma',i':A')$.

Note that this “variable renaming” of contexts includes heterogeneous $\alpha$-equivalence (renaming bound variables) of the types occurring in the context. Thus, its homogeneous version $\Gamma \sim_{id} \Gamma'$ is exactly $\alpha$-equivalence of all the types.

### Judgments

The basic judgments of DTT are:

• For any raw context $\Gamma$ and raw type $A$, both with variables $V$, we have a judgment $\Gamma \vdash A\, type$.
• For any raw context $\Gamma$ and raw types $A$ and $B$, all with variables $V$, we have a judgment $\Gamma \vdash A \equiv B \, type$.
• For any raw context $\Gamma$, raw type $A$, and raw term $M$, all with variables $V$, we have a judgment $\Gamma \vdash M:A$.
• For any raw context $\Gamma$, raw type $A$, and raw terms $M$ and $N$, all with variables $V$, we have a judgment $\Gamma \vdash M \equiv N:A$.

Recall that we did not actually quotient our raw syntax by $\alpha$-equivalence. Thus there is no problem in stating a general judgment rule such as

$\frac{\Gamma, i:A \vdash M:B[i]}{\Gamma \vdash (\lambda i.M):A\to B.}$

Here $i$ denotes the same thing in both premise and conclusion: a fresh extension of the set of variables of $\Gamma$. We write $B[i]$ in the premise meaning the explicit weakening (structural action) of $B$ along $i$.

We deal with $\alpha$-equivalence by proving afterwards by induction that all our valid judgments are invariant under context renaming / heterogeneous $\alpha$-equivalence. For instance, the part of this result referring to the term rule is that whenever $\Gamma\sim_\rho \Gamma'$, $A\sim_\rho A'$ and $M\sim_\rho M'$, if $\Gamma \vdash M:A$ then also $\Gamma' \vdash M':A'$. And the corresponding inductive clause for the above $\lambda$-abstraction rule is:

• Suppose $\Gamma \vdash (\lambda i.M):A\to B$ is derived from $\Gamma, i:A \vdash M:B[i]$ by the above rule, and that we have $\Gamma\sim_\rho \Gamma'$, $(A\to B) \sim_\rho C$, and $(\lambda i.M)\sim_\rho N$. By inversion, we must have $C = (A'\to B')$ with $A\sim_\rho A'$ and $B\sim_\rho B'$, and similarly $N = (\lambda i'.M')$ with $M \sim_{\rho'} M'$, where $\rho':W\cong W'$ is the bijection induced by pushing $\rho$ forward along $i$ and $i'$. Then by definition we have $(\Gamma,i:A) \sim_{\rho'} (\Gamma',i':A')$, while $B[i] \sim_{\rho'} B'[i']$ by the above lemma that structural actions respect $\alpha$-equivalence. Therefore, the inductive hypothesis applies to show that $\Gamma', i':A' \vdash M':B'[i']$, hence $\Gamma' \vdash (\lambda i'.M'):A'\to B'$ by applying the rule again.

Perhaps one could formulate a general notion of “judgment” and “rule” that would allow proving this sort of lemma once and for all, rather than piece by piece about every rule in a particular theory, analogous to how abstract binding trees can deal simultaneously with all the binding constructions of all theories ($\Pi$, $\lambda$, $case$, $rec$, etc.).

## On uniqueness

I expect it should be possible to prove that the sets $Tm_E(s)$ and all their structure are independent, in a suitable sense, of the choice of $\mathbb{F}$. Probably the easiest way to do this would be to translate from an arbitrary $\mathbb{F}$ to the tautological example $\mathbb{F}=FinSet$ and show that this translation is an isomorphism (up to $\alpha$-equivalence).

## On practical notation

Our term syntax, for a general $\mathbb{F}$, is very similar to the way terms with binding are usually written. Of course there is the usual syntactic desugaring of $\Pi(x:A).B$ to $\Pi(A,x.B)$ and so on. Also, in some cases one traditionally binds a single variable (or, in our case, a fresh extension) in two or more subterms (such as a fully-annotated abstraction $\lambda(x:A.B).M$ where $x$ is bound in both $B$ and $M$), whereas we have to desugar this to copy the variable (e.g. $\lambda(A,x.B,x.M)$). These trivialities are common to other approaches to abstract binding as well.

More significantly, consider the three places that the same notion of “variable name” is used in traditional named syntax:

1. As terms (“uses” of variables).
2. At binding sites (including variable declarations in a context).
3. As things being substituted for.

In de Bruijn syntax, natural numbers are used in cases (1) and (3), while at (2) there is no “variable name” used at all. In our general framework, all three of these situations take different inputs: for (1) we give an element of a set $V$, for (2) we give a fresh extension of the ambient set $V$ of free variables, and for (3) we give an injection of cocardinality 1 into the current set $V$ of free variables.

We could coalesce (3) to (1) by equipping our $\mathbb{F}$s with a further operation: given $V\in \mathbb{F}$ and $x\in V$, a specified injection $\sigma_x:U\to V$ whose complement is $\{x\}$. In the named-variable Examples and $\sigma_x$ would be the subset inclusion of $V\setminus \{x\}$, while in the de Bruijn Examples and it would be the ordered injection $[n] \to [n+1]$ that omits $x$. This would probably work just as well; there seems no reason that substitution into one variable needs to be allowed to rename the other variables.

We could also make (1) and (2) appear to coalesce by declaring “the fresh name of a fresh extension” to be an implicit coercion. Then we could write a term like $\lambda x. x x$ where the first occurrence of $x$ denotes a fresh extension, while the latter two automatically coerce it to its fresh name.

However, there seems no way around the fact that our syntax requires all variables in the context to be explicitly (potentially) renamed every time they pass under a binder. Thus even with this coercion convention we cannot write $\lambda x. \lambda y. x$, since the fresh extension $y:V_2\to V_3$ might rename the fresh name of the fresh extension $x:V_1\to V_2$, and inside the binder $\lambda y$ we can only use elements of $V_3$ as terms. We have to instead write something like $\lambda x. \lambda y. x[y]$. This is still somewhat easier for a human to read than de Bruijn syntax, since the binders are referred to by name rather than by number, and it’s probably not unreasonable when doing formal theory (e.g. the explicit weakening $B[i]$ in the abstraction rule above) — for the Initiality Project I’m tempted to use this formalism explicitly, as the variable sets $V$ correspond fairly closely to the semantic interpretations I have in mind.

However, for actual practice in a type theory this syntax is more verbose than one would want, so one would probably choose a particular “implementation” — e.g. the Barendregt convention (Example ) for human readers, or de Bruijn indices (Example ) for a computer. The expected uniqueness theorem would enable translation between them as needed.

Last revised on November 7, 2018 at 15:03:59. See the history of this page for a list of all contributions to it.