nLab Initiality Project - Overview

Initiality Project - Overview

Initiality Project - Overview

This page is part of the Initiality Project.

Here is a proposed slate of choices to make regarding type theories, presentations, definitions, and proof structure. (“I” refers to Mike Shulman.) This is only a first proposal and may change; discussion is underway at the nForum here.

The semantics

Of the various ways to describe a categorical model of dependent types, I suggest to use categories with families, presented in the style called by Awodey a natural model, but fully algebraically. That is, for us a category with families is a (small) category CC together with two presheaves Tm,Ty[C op,Set]Tm,Ty \in [C^{op},Set] and a morphism TmTyTm \to Ty that is algebraically representable (or, one might say, “represented”): it is equipped with a function assigning to each morphism A:yΓTyA:y \Gamma \to Ty, where ΓC\Gamma\in C and yy denotes the Yoneda embedding, an object Γ.AC\Gamma.A\in C and a pullback square

y(Γ.A) Tm yΓ A Ty.\array{ y(\Gamma.A) & \to & Tm \\ \downarrow &&\downarrow \\ y\Gamma & \xrightarrow{A} & Ty. }

This is equivalent to the usual notion of category with families (and thereby also to the usual notion of “category with attributes”, hence also to discrete or full split comprehension categories). But, as noted by Awodey, focusing attention on TmTm and TyTy as objects of [C op,Set][C^{op},Set] has the advantage that the additional structure corresponding to the rules of a type theory are also naturally formulated in [C op,Set][C^{op},Set]. For instance, if Ty 2Ty^2 denotes the domain of the local exponential (Ty×TyTy) (TmTy)(Ty\times Ty \to Ty)^{(Tm\to Ty)}, then the rules of Σ\Sigma-formation and Π\Pi-formation correspond semantically to giving a morphism Ty 2TyTy^2 \to Ty.

Note also that Ty 2Ty^2 can be described in the internal extensional type theory of [C op,Set][C^{op},Set] by the context (A:Ty,B:Tm(A)Ty)(A:Ty, B:Tm(A) \to Ty); thus [C op,Set][C^{op},Set] is a sort of “semantic logical framework” for structuring CC. My inclination right now is that while we can rely on this for intuition and explanation, we should also formulate all this structure precisely in traditional category-theoretic language, to avoid any appearance of circularity in our construction of categorical models of dependent type theory.

A morphism of categories with families is a functor F:CDF:C\to D together with a commutative square in [C op,Set][C^{op},Set]:

Tm C F *(Tm D) Ty C F *(Ty D)\array{ Tm_C & \to & F^*(Tm_D) \\ \downarrow & & \downarrow \\ Ty_C & \to & F^*(Ty_D) }

which strictly respects the chosen pullback squares in an appropriate way. This defines the category CwFCwF in which we hope to construct an initial object.

The type theory

I suggest to start with a type theory containing only Π\Pi-types. This seems the simplest situation where we will have to deal with all the basic issues. We can then add additional structure as desired.

Axioms

Note that this type theory is completely empty unless we also assert some axioms. Ideally, I would like the “initiality theorem” to really be a “freeness theorem”, saying that the syntactic category of a type theory with axioms is freely generated by those axioms. Of course, particular axioms can just be added as rules to a theory, but such a general freeness theorem should be parametric over all possible axioms.

Since in a dependent type theory, each axiom can involve previous axioms in its type, it seems that the general semantic formulation will have to involve some kind of a “cell complex” in the category CwFCwF. However, we can leave this to the future for the moment and concentrate on the inductive clauses for interpreting the basic rules of Π\Pi-types.

Raw syntax

I propose to use syntax with named variables, quotiented by α\alpha-equivalence. My reason is that our goal is expositional and sociological, and named variables keep the syntax as close as possible to what “users” of type theory are familiar with. Since we are humans writing for humans to read, I don’t want to deal explicitly with de Bruijn indices, and neither do I want to pretend that we are using de Bruijn indices when we aren’t really using them.

If this proposal is adopted, then the raw terms for our simple type theory will be inductively generated by:

  • Every variable xx is a term.
  • If A,BA,B are terms and xx is a variable, then Π(x:A)B\Pi(x:A) B is a term.
  • If A,B,MA,B,M are terms and xx is a variable, then λ(x:A.B)M\lambda(x:A.B)M is a term.
  • If A,B,M,NA,B,M,N are terms and xx is a variable, then App (x:A)B(M,N)App^{(x:A)B}(M,N) is a term.

We define α\alpha-equivalence which renames bound variables as usual. Note that the variable xx in Π(x:A)B\Pi(x:A)B scopes only over BB, in λ(x:A.B)M\lambda(x:A.B)M it scopes only over BB and MM, and in App (x:A)B(M,N)App^{(x:A)B}(M,N) it scopes only over BB. Capture-avoiding substitution is likewise defined as usual. Our partial interpretation will be defined recursively over the above inductive definition of terms, and proven to respect α\alpha-equivalence.

Bidirectionality

Following Streicher, all our terms are fully (or at least “sufficiently”) annotated. For instance, we write App (x:A)B(M,N)App^{(x:A)B}(M,N) rather than just App(M,N)App(M,N). (It is, at best, unclear whether any less than fully annotated syntax can be used in an initiality theorem without preprocessing that is tantamount to annotating it.)

In particular, this means that given Γ\Gamma and tt, if Γt:A\Gamma \vdash t:A can be derived for any value of AA, then there is a canonical such AA that can be deduced syntactically, or synthesized, from Γ\Gamma and tt. For instance, if App (x:A)B(M,N)App^{(x:A)B}(M,N) has any type, then it must have the type B[N/x]B[N/x]. On the other hand, the conversion rule implies that App (x:A)B(M,N)App^{(x:A)B}(M,N) must also have any type that is judgmentally equal to B[N/x]B[N/x].

Type theorists have a standard technique for distinguishing these two situations, known as bidirectional typechecking. Instead of one judgment Γt:A\Gamma\vdash t:A, we have two typing judgments:

  • ΓtA\Gamma \vdash t \Rightarrow A: in context Γ\Gamma the term tt synthesizes the type AA. Here AA is, if it exists, uniquely determined by Γ\Gamma and tt: it is an “output” to their “inputs”.
  • ΓtA\Gamma \vdash t \Leftarrow A: in context Γ\Gamma the term tt checks against the type AA. Here Γ\Gamma, tt, and AA are all “inputs” and the only “output” is the truth value of whether the typecheck is valid.

There are various advantages of bidirectional typechecking, including:

  1. In at least some cases, it permits a formulation of type theory in which only normal forms exist.
  2. In at least some cases, it allows omission of annotations, e.g. an un-annotated abstraction λx.M\lambda x.M cannot synthesize a type, but it can check against a Π\Pi-type Π(x:A)B\Pi(x:A) B.
  3. In at least some cases, it makes it easier to incorporate η-expansion into an equality-checking algorithm.
  4. In at least some cases, it makes derivations of judgments unique, or at least canonical.
  5. In at least some cases, it isolates the use of judgmental equality during typechecking in one particular “mode-switching” rule:
    ΓtAΓABΓtB \frac{\Gamma \vdash t \Rightarrow A \qquad \Gamma \vdash A\equiv B}{\Gamma \vdash t \Leftarrow B}

(Experts, feel free to add more.) It is unclear at present whether any of these advantages are relevant to categorical semantics, but by using a bidirectional framework we are better placed to take advantage of them if they do become relevant. More importantly, it seems to me that the natural structure of the semantic interpretation, as suggested by Peter Lumsdaine (see below), matches the syntactic bidirectional picture quite closely, and it may be clarifying to make that analogy more precise.

Since all our terms are fully annotated, all of our formation, introduction, and elimination rules can synthesize their types. However, their premises usually involve checking their subterms against appropriate types, so the above mode-switching rule gets applied a lot (it is the only way, in our system, to derive a checking judgment). This might be an issue to want to optimize away in an implementation, but for our semantic purposes it seems irrelevant.

For instance, here are the formation, introduction, and elimination rules for Π\Pi-types:

ΓAtypeΓ,x:ABtypeΓΠ(x:A)Btype \frac{\Gamma \vdash A \, type \qquad \Gamma, x:A \vdash B\,type}{\Gamma \vdash \Pi(x:A) B \, type}
ΓAtypeΓ,x:ABtypeΓMΠ(x:A)BΓNAΓApp (x:A)B(M,N)B[N/x] \frac{\Gamma \vdash A \, type \qquad \Gamma, x:A \vdash B\,type \qquad \Gamma \vdash M \Leftarrow \Pi(x:A) B \qquad \Gamma \vdash N \Leftarrow A}{\Gamma \vdash App^{(x:A)B}(M,N) \Rightarrow B[N/x]}
ΓAtypeΓ,x:ABtypeΓ,x:AMBΓλ(x:A.B)MΠ(x:A)B \frac{\Gamma \vdash A \, type \qquad \Gamma,x:A \vdash B \,type \qquad \Gamma,x:A \vdash M \Leftarrow B}{\Gamma \vdash \lambda(x:A.B)M \Rightarrow \Pi(x:A) B}

(Note that the “is a type” judgment ΓAtype\Gamma \vdash A\,type is not bidirectional. With Russell universes we could consider making it so.)

The “hypothesis” rule is that a variable in the context synthesizes its assumed type:

(x:A)ΓΓxA.\frac{(x:A) \in \Gamma}{\Gamma \vdash x \Rightarrow A.}

In implementations of bidirectional typechecking, the equality judgments Γst:A\Gamma \vdash s\equiv t : A and ΓABtype\Gamma \vdash A \equiv B \, type are sometimes made bidirectional as well. But at the moment, I don’t think there will be is any benefit to us in doing likewise. In particular, we would like our setup to be general enough to encompass type theories whose judgmental equality is “poorly behaved”, e.g. undecidable, such as those with an equality reflection rule. Thus the equality judgments should have primitive rules asserting reflexivity, symmetry, transitivity, and congruence properties.

Streicher also includes a basic judgment for equality of contexts. Peter suggests that that isn’t necessary, because

any contexts that would be judgementally equal in the system with that judgement still end up canonically end up judgementally isomorphic, by the substitution that consists of not renaming variables.

I’m inclined to follow Peter’s suggestion here, but am open to counterarguments.

Streicher also includes a number of explicit primitive rules involving things like validity of contexts and type-preservation by substitution. I’m not sure whether these are necessary or not; it seems that maybe by presenting our type theories in “the usual way” we should always be able to ensure that substitution is admissible, even for type theories that are “ill-behaved” in other ways (such as equality reflection). But maybe there are good reasons to include these rules explicitly?

The proof

The following proposal for the inductive structure of the argument is based on Streicher’s original approach, with a modification suggested by Peter Lumsdaine, and further tweaks to incorporate bidirectionality and build in naturality using the natural-models technology.

The partial interpretation

Given a finite set of variables V={x 1,,x n}VarV = \{x_1,\dots,x_n\} \subseteq Var, the presheaf of environments for VV is Tm VTm^V. We will define, by structural induction on any term tt, and any any finite set VV of variables, the following partial morphisms of presheaves.

  1. t type V:Tm VTy\llbracket t \rrbracket ^V_{type} : Tm^V \rightharpoonup Ty.
  2. t V:Tm VTm\llbracket t \rrbracket ^V_{\Rightarrow} : Tm^V \rightharpoonup Tm.
  3. t V:Tm V×TyTm\llbracket t \rrbracket ^V_{\Leftarrow} : Tm^V \times Ty \rightharpoonup Tm such that the composite Tm V×TyTmTyTm^V \times Ty \rightharpoonup Tm \to Ty is the second projection.

In fact, the third of these is not actually inductive at all: it is simply a semantic counterpart of the bidirectional mode-switching judgment. We define t V\llbracket t \rrbracket ^V_{\Leftarrow} to be the domain restriction of

Tm V×Typr 1Tm Vt VTm Tm^V \times Ty \xrightarrow{pr_1} Tm^V \overset{\llbracket t \rrbracket ^V_{\Rightarrow}}{\rightharpoonup} Tm

to the equalizer of

Tm V×Typr 1Tm Vt VTmTy Tm^V \times Ty \xrightarrow{pr_1} Tm^V \overset{\llbracket t \rrbracket ^V_{\Rightarrow}}{\rightharpoonup} Tm \to Ty

and the restriction of the second projection Tm V×TyTm^V \times Ty to its domain.

The other inductive clauses of the partial interpretation should precisely mirror the bidirectional rules of our type theory, using recursive calls to the above three operations as counterparts of the three syntactic judgments ΓAtype\Gamma \vdash A\,type, ΓtA\Gamma \vdash t\Rightarrow A, and ΓtA\Gamma \vdash t\Leftarrow A. Note, though, that at this point this is only an analogy: we are simply inducting over the raw term structure of tt, not making any actual reference to these judgments.

This suggestion incorporates the innovation proposed by Peter over Streicher’s proof of taking the interpretation of the context Γ\Gamma as an input to the partial intepretation functions rather than an output (see in particular his point (7)). The elements of the presheaf Tm VTm^V at some object XCX\in C are what Peter calls “environments” for tt. Note that we do not require that VV contain all free variables occurring in tt, but if it omits some then the partial interpretation morphisms should turn out to have empty domain.

I have also reformulated the interpretation function to “put naturality into the goals of the original induction” as suggested by Peter in his point (8), by directly defining partial morphisms in the “semantic logical framework” [C op,Set][C^{op},Set]. This has the additional advantage that in writing down the inductive clauses we can refer directly to the natural-models formulation of the corresponding structure on CC, as sketched above.

Totality of the interpretation

Although the partial interpretation functions sketched above incorporate a number of “tweaks” to make it more pleasing, in fact it was already the short part: Streicher does it in only 5 pages. (His pages are very short, so this isn’t actually very much, plus his type theory is more complicated.) The long part is proving that these interpretation functions are total on derivable judgments; this takes Streicher 40 pages.

The outline should be the same as in Streicher’s proof and Peter’s proposal:

  1. Prove by induction on raw terms that the partial interpretation functions take syntactic substitution (including weakening) to semantic substitution (restriction in the presheaves TmTm and TyTy). This takes Streicher 17 pages.

  2. Prove by induction on derivations that the partial interpretation functions are total on derivable judgments. This takes Streicher 20 pages.

Loose ends

To complete the proof of initiality after this, it remains to:

  1. Construct the term model. This takes Streicher 12 pages.

  2. Show that the above interpretation assembles into a morphism in CwFCwF from the term model to CC. As far as I can tell, Streicher doesn’t actually do this, he just claims that it can be done.

  3. Show that this morphism is unique. Streicher doesn’t actually do this either.

Last revised on January 7, 2019 at 20:49:45. See the history of this page for a list of all contributions to it.