One of the most important observations of category theory is that large parts of mathematics can be internalized in any category with sufficient structure. The most basic examples of this involve algebraic structures; for instance, a group can be defined in any category with finite products, and an internal category can be defined in any ambient category with pullbacks. For such algebraic (or even essentially algebraic) structures, which are defined by operations with equational axioms imposed, it suffices for the ambient category to have (usually finite) limits.
However, it turns out that if we assume some additional structure on the ambient category, then much more of mathematics can be internalized, potentially including fields, local rings, finite sets, topological spaces, even the field of real numbers. The idea is to exploit the fact that all mathematics can be written in the language of logic, and seek a way to internalize logic in a category with sufficient structure.
The basic idea of the internal logic induced by a given category $C$ is this:
the objects $A$ of $C$ are regarded as collections of things of a given type $A$.
the morphisms $A\to B$ of $C$ are regarded as terms of type $B$ containing a free variable of type $A$ (i.e. in a context $x:A$).
a subobject $\phi \hookrightarrow A$ is regarded as a proposition (predicate): by thinking of it as the sub-collection of all those things of type $A$ for which the statement $\phi$ is true.
the maximal subobject is hence the proposition that is always true; this is the logical object of truth $\top \hookrightarrow A$.
the minimal subobject is hence the proposition that is always false; this is the logical object of falsity $\bot \hookrightarrow A$.
one proposition implies another if as subobjects of $A$ they are connected by a morphism in the poset of subobjects: $\phi \Rightarrow \psi$ means $\phi \hookrightarrow \psi$.
Logical operations are implemented by universal constructions on subobjects.
the conjunction “and” is the product of subobjects (their meet).
the disjunction “or” is the coproduct of subobjects (their join).
and so on.
A dependent type over an object $A$ of $C$ may be interpreted as a morphism $B\to A$ whose “fibers” represent the types $B(x)$ for $x:A$. This morphism might be restricted to be a display map or a fibration.
Once we formalize the notion of “logical theory”, the construction of the internal logic can be interpreted as a functor $Lang$ from suitably structured categories to theories. The morphisms of theories are “interpretations”, and so an internalization of some theory $T$ (such as the “theory of groups”) into a category $C$ is a morphism of theories $T\to Lang(C)$.
Moreover, the functor $Lang$ has a left adjoint functor: the syntactic category $Syn$ of a theory. Thus, a model of $T$ in $C$ is equally well a functor $Syn(T)\to C$. Frequently, this adjunction is even an equivalence of categories; see relation between type theory and category theory.
There are many different kinds of “logical theories”, each of which corresponds to a type of category in which such theories can be internalized (and yields a corresponding adjunction $Syn \dashv Lang$).
Theory | Category
---|---
finite product theory | category with finite products
finite limit (aka “left exact” or “cartesian”) | finitely complete category
regular | regular category
coherent | coherent category
disjunctive | lextensive category (aka finitary disjunctive category)
geometric | infinitary coherent category (aka geometric category)
first-order | Heyting category
dependent types | locally cartesian closed category
higher order | elementary topos
linear logic | symmetric monoidal category
Each type of logic up through “geometric” can also be described in terms of sketches. Not coincidentally, the corresponding types of category up through “geometric” fit into the framework of familial regularity and exactness. Sketches can also describe theories applicable to categories not even having finite products, such as finite sum sketches, but the type-theoretic approach taken on this page requires at least finite products (or else something closely akin, such as a cartesian multicategory).
However, there are other sorts of internalization that do not fit in this framework. For instance, to describe a monoid internal to a monoidal category, one needs an internal linear logic. See internalization for a discussion of the more general notion in the context of doctrines.
We begin with the interpretation of internal first-order logic, which is the most used in toposes and related categories.
In this section, what we mean by a theory is a type theory without dependent types, but with a dependent logic. This entails the following.
The signature of the theory consists of
Various types $A,B,C$. For example, the theory of a group has only one type (group elements), but the theory of a-ring-and-a-module has two types (ring elements and module elements). There are also generally type constructors that build new types from basic ones, such as product types $A\times B$ and the unit type $1$.
The theory will also generally contain function symbols such as $f:A\to B$, each with a source and target that are types. For example, the theory of a monoid has one type $M$, one function symbol $m:M\times M\to M$, and one function symbol $e:1\to M$. Function symbols of source $1$ are also called constants.
The theory may also contain relation symbols $R:A$, each equipped with a type. For example, the theory of a poset has one type $P$ and one relation $\le:P\times P$. The most basic relation symbol, which most theories contain, is equality $=_A:A\times A$ on a type $A$.
Finally the theory may contain logical axioms of the form $\Gamma | \varphi \vdash \psi$. Here $\varphi$ and $\psi$ are first-order formulas built up from terms and relation symbols using logical connectives and quantifiers such as $\top,\bot,\wedge,\vee,\Rightarrow,\neg,\forall,\exists$, and $\Gamma$ is a context which declares the type of every variable occurring in $\varphi$ and $\psi$.
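For example, the theory of a group has one type $G$, three function symbols $m:G\times G\to G$, $i:G\to G$, and $e:1\to G$, and axioms
$$ x:G, y:G, z:G | \top \vdash m(m(x,y),z) = m(x,m(y,z)) $$
$$ x:G | \top \vdash m(x,e) = x \wedge m(e,x) = x $$
$$ x:G | \top \vdash m(x,i(x)) = e \wedge m(i(x),x) = e $$
This is an equational theory, meaning that each axiom is just one or more equations between terms that must hold in a given context. For a different sort of example, the theory of a poset has one type $P$, one relation $\le:P\times P$, and axioms
$$ x:P | \top \vdash x \le x $$
$$ x:P, y:P, z:P | x \le y \wedge y \le z \vdash x \le z $$
$$ x:P, y:P | x \le y \wedge y \le x \vdash x = y $$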
Now suppose that we have a category $C$ with finite limits and we want to interpret such a theory internally in $C$. We identify the aspects of the theory with structures in the category by what is called categorical semantics:
First, for each type in the theory we choose an object of $C$. Then for each function symbol in the theory we choose a morphism in $C$. And finally, for each relation in the theory we choose a subobject in $C$. (We always interpret the relation of equality on a type $A$ by the diagonal $A\hookrightarrow A\times A$ in $C$.) Thus, for example, to interpret the theory of a group in $C$ we must choose an object $G$ and morphisms $m:G\times G\to G$, $i:G\to G$, and $e:1\to G$, while to interpret the theory of a poset, we must choose an object $P$ and a subobject $[\le] \hookrightarrow P\times P$.
Of course, this is not enough; we need to say somehow that the axioms are satisfied. We first define, inductively, an interpretation of every term that can be constructed from the theory by a morphism in $C$. For example, given an object $G$ and a morphism $m:G\times G\to G$, there are two evident morphisms $G\times G\times G \to G$ which are the interpretations of the two terms $m(m(x,y),z)$ and $m(x,m(y,z))$.
We then define, inductively, an interpretation of every logical formula that can be constructed from the theory by a subobject in $C$. The idea is that if $x:A$ is a variable of type $A$ and $\varphi(x)$ is a formula with $x$ as its free variable, then the interpretation of $\varphi(x)$ should be the “subset” $\{x\in A | \varphi(x)\}$ of $A$. The base case of this induction is that if $t$ is a term interpreted by a morphism $A\to B$ and $R:B$ is a relation symbol, then $R(t)$ is interpreted by the pullback of the chosen subobject $R\hookrightarrow B$ representing $R$ along the morphism $t:A\to B$. The building blocks of logical formulas then correspond to operations on the posets $Sub(A)$ of subobjects in $C$, as follows.
Logical operator | Operation on $Sub(A)$
---|---
conjunction: $\wedge$ | intersection (pullback)
truth: $\top$ | top element ($A$ itself)
disjunction: $\vee$ | union
falsity: $\bot$ | bottom element (strict initial object)
implication: $\Rightarrow$ | Heyting implication
existential quantification: $\exists$ | left adjoint to pullback
universal quantification: $\forall$ | right adjoint to pullback
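The base case and the propositional connectives can be made concrete in Set, where subobjects are subsets and pullback along a term is preimage. The following is only a sketch under stated assumptions: the finite sets `A` and `B`, the term `t`, and the subsets `R` and `S` are hypothetical illustrative data, not taken from the text.

```python
# Hypothetical finite data: A = {0,...,5}, B = {0,1,2}, and a term t : A -> B.
A = set(range(6))
B = {0, 1, 2}

def t(x):
    """Interpretation of a term of type B in the context x : A (assumed example)."""
    return x % 3

# Chosen subobjects of B interpreting two relation symbols R and S (assumed).
R = {0, 1}
S = {1, 2}

def pullback(term, sub, domain):
    """Interpret R(t): the preimage of `sub` along `term`, a subset of `domain`."""
    return {x for x in domain if term(x) in sub}

R_t = pullback(t, R, A)   # interpretation of R(t(x))
S_t = pullback(t, S, A)   # interpretation of S(t(x))

conj = R_t & S_t          # conjunction is intersection in Sub(A)
disj = R_t | S_t          # disjunction is union in Sub(A)

# Pullback stability: substituting t commutes with the connectives.
stable = (
    pullback(t, R & S, A) == conj
    and pullback(t, R | S, A) == disj
    and pullback(t, B, A) == A   # truth pulls back to truth
)
print(sorted(R_t), sorted(conj), stable)
```

The final check illustrates, in this one example, the pullback stability that is required of the connectives in general.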
The fact that existential and universal quantifiers can be interpreted as left and right adjoints to pullbacks was first realized by Bill Lawvere. One way to realize that it makes sense is to notice that in Set, the image of a subset $R\subset A$ under a function $f:A\to B$ can be defined as
$$ \exists_f(R) = \{ b\in B | \exists a\in A.\, f(a) = b \wedge a\in R \} $$
while its “dual image” (the right adjoint to pullback) can be defined as
$$ \forall_f(R) = \{ b\in B | \forall a\in A.\, f(a) = b \Rightarrow a\in R \} $$
Of course, not every finitely complete category $C$ admits all of these operations on subobjects. Moreover, in order for the relationship with logic to be well-behaved, any of these operations we make use of must be stable under (preserved by) pullbacks. (Pullbacks of subobjects correspond to “innocuous” logical operations such as adding extra unused variables, duplicating variables, and so on, so they should definitely not affect the meaning of the logical connectives. However, in linear logic such operations become less innocuous.)
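The adjunctions $\exists_f \dashv f^* \dashv \forall_f$ in Set can be checked by brute force on a small example. The sketch below assumes a hypothetical function `f` between two small finite sets; the names `exists_f`, `forall_f`, and `preimage` are illustrative and simply implement the image, the dual image, and the pullback of subsets.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

# Hypothetical example data; note that "c" has an empty fiber.
A = {0, 1, 2, 3}
B = {"a", "b", "c"}
f = {0: "a", 1: "a", 2: "b", 3: "b"}

def preimage(S):
    """Pullback f^* : Sub(B) -> Sub(A)."""
    return frozenset(a for a in A if f[a] in S)

def exists_f(R):
    """Image of R: those b hit by some element of R."""
    return frozenset(b for b in B if any(f[a] == b and a in R for a in A))

def forall_f(R):
    """Dual image of R: those b whose whole fiber lies in R."""
    return frozenset(b for b in B if all(a in R for a in A if f[a] == b))

def is_adjoint_triple():
    """Check exists_f -| f^* -| forall_f on all pairs of subsets."""
    for R in powerset(A):
        for S in powerset(B):
            if (exists_f(R) <= S) != (R <= preimage(S)):
                return False
            if (preimage(S) <= R) != (S <= forall_f(R)):
                return False
    return True

print(is_adjoint_triple())
```

Note that an element with empty fiber, such as `"c"` here, lies in every dual image, since the universally quantified condition holds vacuously.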
In any category with finite limits, the posets $Sub(A)$ always have finite intersections (given by pullback), including a top element (given by $A$ itself). Thus in any such category, we can interpret logical theories that use only the connectives $\wedge$ and $\top$. This includes both the theories of groups and posets considered above.
In a regular category, the existence of pullback-stable images implies that the base change functor $f^*:Sub(B)\to Sub(A)$ along any map $f:A\to B$ has a left adjoint, usually written $\exists_f$, and that these adjoints “commute with pullbacks” in an appropriate sense (given by the Beck-Chevalley condition). Thus, in a regular category we can interpret any theory in so-called regular logic, which uses only $\wedge$, $\top$, and $\exists$.
Actually, some instances of $\exists$ can be interpreted in any category with finite limits: if $f$ is itself a monomorphism, then $f^*$ always has a left adjoint simply given by composition with $f$. On the logical side, this means that we can interpret “provably unique existence” in any category with finite limits. Logic with $\wedge$, $\top$, and “provably unique existence” is called cartesian logic or finite-limit logic.
A coherent category is basically defined to be a regular category in which the subobject posets additionally have pullback-stable finite unions. Thus, in a coherent category we can interpret so-called coherent logic, which adds $\vee$ and $\bot$ to regular logic. Likewise, in an infinitary-coherent (or “geometric”) category we can interpret geometric logic, which adds infinitary disjunctions $\bigvee_i \varphi_i$ to coherent logic. Geometric logic is especially important because it is preserved by the inverse image parts of geometric morphisms, and because any geometric theory has a classifying topos.
On the other hand, in a lextensive category, we do not have images or all unions, but if we have two subobjects of $A$ which are disjoint (their intersection is initial), then their coproduct is also their union in $Sub(A)$. Therefore, in a lextensive category we can interpret disjunctive logic, which is cartesian logic plus $\bot$ and “provably disjoint disjunction.” Likewise, in an infinitary-lextensive category we can interpret “infinitary-disjunctive logic.”
Finally, in a Heyting category the base change functors $f^*:Sub(B)\to Sub(A)$ also have right adjoints, usually written $\forall_f$, and it is easy to see that this implies that each $Sub(A)$ is also a Heyting algebra, hence has an “implication” $\Rightarrow$ as well. (We define “negation” by $\neg \varphi \equiv \varphi \Rightarrow \bot$.) Thus, in a Heyting category we can interpret all of (finitary, first-order) intuitionistic logic.
Now that we know how to interpret logic, we can say that a model of a given theory in $C$ consists of a choice of objects, morphisms, and subobjects for the types, function symbols, and relation symbols as above, such that for each axiom $\Gamma | \varphi \vdash \psi$, we have $[\varphi]\le [\psi]$ in $Sub([\Gamma])$. Here, $[\Gamma]$ is the product of the objects that correspond to the types of the variables in $\Gamma$, $[\varphi]$ and $[\psi]$ are the interpretations of the formulas $\varphi$ and $\psi$ as subobjects of $[\Gamma]$, and $\leq$ is the relation of subobject inclusion.
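In Set this definition of model can be tested directly: a sequent $\Gamma | \varphi \vdash \psi$ holds when the subset $[\varphi]$ of the product $[\Gamma]$ is contained in $[\psi]$. The following sketch checks the usual poset axioms (reflexivity, transitivity, antisymmetry) for an assumed example, divisibility on $\{1,2,3,6\}$; the helper names `interpret` and `holds` are hypothetical.

```python
from itertools import product

# Assumed example: the object P and the subobject [<=] of P x P (divisibility).
P = {1, 2, 3, 6}
LE = {(x, y) for x, y in product(P, repeat=2) if y % x == 0}
EQ = {(x, x) for x in P}   # equality is interpreted by the diagonal

def interpret(n, formula):
    """[phi] as a subset of [Gamma] = P^n, for a context of n variables of type P."""
    return {env for env in product(P, repeat=n) if formula(*env)}

def holds(n, phi, psi):
    """The sequent Gamma | phi |- psi is satisfied iff [phi] <= [psi] in Sub([Gamma])."""
    return interpret(n, phi) <= interpret(n, psi)

reflexive = holds(1, lambda x: True, lambda x: (x, x) in LE)
transitive = holds(
    3,
    lambda x, y, z: (x, y) in LE and (y, z) in LE,
    lambda x, y, z: (x, z) in LE,
)
antisymmetric = holds(
    2,
    lambda x, y: (x, y) in LE and (y, x) in LE,
    lambda x, y: (x, y) in EQ,
)
# A sequent that is NOT satisfied: divisibility is not a total relation.
total = holds(2, lambda x, y: True, lambda x, y: (x, y) in LE)

print(reflexive, transitive, antisymmetric, total)
```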
It is easy to verify that a model of the theory of a group in $C$ is precisely an internal group object in $C$, as usually defined. For instance, the validity of the axiom
$$ x:G, y:G, z:G | \top \vdash m(m(x,y),z) = m(x,m(y,z)) $$
means that the equalizer of the two morphisms $G\times G\times G \to G$ must be all of $G\times G\times G$, or equivalently that those two morphisms must be equal. The same happens in most other cases.
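As a small illustration in Set, one can compute the equalizer of the two morphisms $G\times G\times G\to G$ interpreting $m(m(x,y),z)$ and $m(x,m(y,z))$ and compare it with all of $G\times G\times G$. The sketch assumes $G = \mathbb{Z}/3$ under addition as the example; the non-associative operation `sub` is included only for contrast.

```python
from itertools import product

# Assumed example: G = Z/3 with addition as the multiplication m.
G = {0, 1, 2}

def m(x, y):
    return (x + y) % 3

def left(x, y, z):
    """Interpretation of the term m(m(x,y),z) as a map G^3 -> G."""
    return m(m(x, y), z)

def right(x, y, z):
    """Interpretation of the term m(x,m(y,z)) as a map G^3 -> G."""
    return m(x, m(y, z))

def equalizer(f, g, domain):
    """Equalizer of two parallel maps in Set, as a subset of the domain."""
    return {p for p in domain if f(*p) == g(*p)}

G3 = set(product(G, repeat=3))
assoc = equalizer(left, right, G3)
print(assoc == G3)   # the axiom is valid iff the equalizer is everything

# For contrast: subtraction mod 3 is not associative, so the equalizer is proper.
def sub(x, y):
    return (x - y) % 3

bad = equalizer(
    lambda x, y, z: sub(sub(x, y), z),
    lambda x, y, z: sub(x, sub(y, z)),
    G3,
)
print(len(bad), len(G3))
```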
As described above, a model of a given theory $T$ in a category $C$ consists of an assignment
types of $T$ | $\to$ | objects of $C$ |
function symbols of $T$ | $\to$ | morphisms of $C$ |
relation symbols of $T$ | $\to$ | subobjects in $C$ |
axioms of $T$ | $\to$ | containments in $C$ |
This is a sort of heteromorphism in that it changes the name of things as it operates on them. We can describe it more simply as a “translation of theories” as follows.
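Given a category $C$ (which may be regular, coherent, geometric, Heyting, etc.), we define its internal type theory (with first-order logic) $Lang(C)$ to be the theory whose
types are the objects of $C$,
function symbols are the morphisms of $C$,
relation symbols are the subobjects in $C$, and
axioms are all the sequents that are valid in $C$ under this tautological interpretation.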
Now a model of $T$ in $C$ can be described simply as a morphism of theories (a “translation” or “interpretation”)
$$ T \to Lang(C) $$
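The functor $Lang : Categories \to Theories$ has a left adjoint, the syntactic category of a theory. Thus we have a chain of natural isomorphisms
$$ \{\text{models of } T \text{ in } C\} \cong Hom_{Theories}(T, Lang(C)) \cong Hom_{Categories}(Syn(T), C) $$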
Internal logic is not just a way to concisely describe internal structures in a category, but also gives us a way to prove things about them by “internal reasoning.” We simply need to verify that the “usual” methods of logical reasoning (for example, from $\varphi\vdash \psi$ and $\psi\vdash \chi$ deduce $\varphi\vdash\chi$) are internally valid, in the sense that if the premises are satisfied in some model $C$ (in the example, if $[\varphi]\le [\psi]$ and $[\psi]\le [\chi]$) then so is the conclusion (in the example, $[\varphi]\le [\chi]$). This is called the Soundness Theorem.
It then follows that if we start from the axioms of a theory and “reason normally” within type theory, which in practice amounts to pretending that the types are sets, the function symbols are functions, and the relation symbols are subsets, then anything we prove will still be true when the theory is interpreted in an arbitrary category, not just Set. For example, by easy equational reasoning from the theory of a group, we can prove that inverses are unique, which is expressed by the logical sequent
$$ x:G, y:G, z:G | m(x,y) = e \wedge m(x,z) = e \vdash y = z $$
It follows that this is also true, suitably interpreted, as a statement about internal group objects in any category.
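For a model in Set, soundness can be observed directly by comparing subsets of $G\times G\times G$. The sketch below takes uniqueness of inverses in the form $m(x,y) = e \wedge m(x,z) = e \vdash y = z$ (one possible formulation, assumed here) and checks it in the symmetric group $S_3$, chosen as a hypothetical non-commutative example.

```python
from itertools import permutations, product

# Assumed example: G = S_3, with elements represented as tuples (p[0], p[1], p[2]).
G = list(permutations(range(3)))
e = (0, 1, 2)

def m(p, q):
    """Composition p after q, interpreting the multiplication symbol."""
    return tuple(p[q[k]] for k in range(3))

G3 = list(product(G, repeat=3))

# [phi]: triples (x, y, z) with m(x,y) = e and m(x,z) = e.
premise = {(x, y, z) for x, y, z in G3 if m(x, y) == e and m(x, z) == e}
# [psi]: triples (x, y, z) with y = z.
conclusion = {(x, y, z) for x, y, z in G3 if y == z}

# The sequent is satisfied iff [phi] <= [psi] in Sub(G x G x G).
sequent_valid = premise <= conclusion
noncommutative = any(m(x, y) != m(y, x) for x, y in product(G, repeat=2))
print(sequent_valid, noncommutative)
```

A check of this kind only confirms one instance; the point of the Soundness Theorem is that the syntactic proof guarantees the containment in every model at once.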
There are (at least) three caveats. Firstly, we must take care to use only the rules appropriate to the fragment of logic that is valid in the particular categories we are interested in. For example, if we want our conclusions to be valid in any regular category, we must restrict ourselves to reasoning “within regular logic.” Most mathematicians are not familiar with making such distinctions in their reasoning, but in practice most things one would want to say about a regular theory turn out to be provable in regular logic. (We will not spell out the details of what this means.) And once we are in a Heyting category, and in particular in a topos, this problem goes away and we can use full first-order logic.
The second, more important, caveat is that the internal logic of all these categories is, in general, constructive. This means that, among other things, the interpretation of $\neg\neg\varphi$ is, in general, distinct from that of $\varphi$, and that $\varphi\vee \neg\varphi$ is not always valid. So even if we believe that classical logic (including the principle of excluded middle and even the axiom of choice) is “true,” as many mathematicians do, there is still a reason to look for proofs that are constructively acceptable, since it is only these which are valid in the internal logic of most categories. If the category is Boolean and/or satisfies the internal axiom of choice, however, then this problem goes away, but these fail in many categories in which one wants to internalize (such as many Grothendieck toposes).
The third caveat is that one must take care to distinguish the internal logic of a category from what is externally true about it. In general, internal validity is “local” truth, meaning things which become true “after passing to a cover.” This is particularly important for formulas involving disjunction and existence. For example, an object’s being projective in the category $C$ is a different statement from its being internally projective, meaning that “$X$ is projective” is true in the internal logic. Another good example can be found in the different notions of finite object in a topos. This problem goes away if the ambient category is well-pointed, but well-pointed categories are even rarer than Boolean ones satisfying choice; the only well-pointed Grothendieck topos is Set itself.
The converse of the Soundness Theorem is called the Completeness Theorem, and states that if a sequent $\varphi\vdash\psi$ is valid in every model of a theory, then it is provable from that theory. This is noticeably less trivial. In classical first-order logic, where the only models considered are set-valued ones, the completeness theorem is usually proven using ultraproducts. However, in categorical logic there is a more elegant approach (which additionally no longer depends on any form of the axiom of choice).
The syntactic category $C_T = Syn(T)$ of a theory $T$ was mentioned above, as the left adjoint to the “internal logic functor” $Lang$. By the Yoneda lemma, the syntactic category $C_T$ contains a “generic” model of the theory. Moreover, by the construction of $C_T$ (see syntactic category), the valid sequents in this model are precisely those provable from the theory. Therefore, if a sequent is valid in all models, it is in particular valid in the generic model in $C_T$, and hence provable from $T$.
The universal property of $C_T$ is also sometimes useful for semantic conclusions. For instance, sometimes one can prove something about the generic model and then carry it over to all models.
Furthermore, if $T$ lives in a sub-fragment of geometric logic (such as regular, coherent, lextensive, or geometric logic), then the Grothendieck topos of sheaves on $C_T$ for its appropriate (regular, coherent, extensive, or geometric) coverage contains a $T$-model which is generic for models in Grothendieck toposes: any $T$-model in a Grothendieck topos is its image under the inverse image of a unique geometric morphism. This topos is called the classifying topos of the theory.
The syntactic category of a theory can be considered as the “extensional essence” of that theory, since functors out of $C_T$ completely determine the $T$-models in any category $D$ with suitable structure. It therefore makes sense, in some contexts, to define a morphism of theories to be a functor between their syntactic categories, and an equivalence of theories (sometimes called a Morita equivalence) to be an equivalence between their syntactic categories.
A morphism $T\to T'$ between theories, in this sense, induces a functor from $T'$-models in $D$ to $T$-models in $D$, for any category $D$ with suitable structure, in a way which is natural in $D$. In particular, theories which are “Morita equivalent” in this sense have naturally equivalent categories of models in all categories $D$ with suitable structure; so they have the same “meaning” even though they may be presented quite differently. (Note that this is a much stronger sort of equivalence than merely having equivalent categories of models in some particular category, such as $Set$.) Moreover, the fact that the syntactic category is defined “syntactically” means that a morphism $T\to T'$ actually induces a “translation” of the types, functions, and relations of $T$ into those of $T'$.
By first applying various “completion” processes to syntactic categories before asking about equivalence, we obtain coarser notions of equivalence, which only induce equivalences of models in more restricted sorts of categories. For instance, if we compare the exact completions of syntactic categories of regular theories, we obtain a notion of equivalence that induces equivalences of categories of models in all exact categories (not necessarily all regular ones). Likewise for coherent theories and pretoposes, and for geometric theories and infinitary pretoposes. Note, though, that the infinitary-pretopos completion of a (small) geometric theory is in fact already a (Grothendieck) topos, and coincides with the classifying topos considered above. Thus, passage to classifying toposes is also an instance of this construction, and an equivalence of classifying toposes means that two theories have equivalent categories of models in all toposes. (This is still much stronger than just having equivalent categories of models in $Set$.)
To be written, but see Kripke-Joyal semantics.
We now consider the internal language of a locally cartesian closed category as a dependent type theory.
Material to be moved here from relation between type theory and category theory.
To be written, but see Mitchell–Bénabou language for the version in a topos.
The topos Set in classical mathematics of course has as its internal logic the “ordinary” logic. This is reproduced by following the abstract nonsense as follows:
the terminal object of Set is the one-element set ${*}$, the subobject classifier in Set is the two-element set $\Omega = \{true, false\}$ equipped with the map
$$ {*} \stackrel{true}{\to} \Omega $$
that picks the element $true$ in $\Omega$. The Heyting algebra of subobjects of the terminal object is the poset
$$ L := \{ \emptyset \hookrightarrow {*} \} $$
consisting only of the two trivial subobjects of ${*}$, the empty set and the point itself, and the unique inclusion morphism between them. These are classified, respectively, by the truth values ${*} \stackrel{false}{\to} \Omega$ and ${*} \stackrel{true}{\to} \Omega$, so that we can also write our poset of subobjects of the terminal object as
$$ L = \{ false \to true \} $$
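The logical operation $\wedge = AND$ is the product in the poset $L$. Indeed, computing products (pullbacks over the terminal object $true$) in $L$ gives
$$ true \wedge true = true , \quad true \wedge false = false , \quad false \wedge true = false , \quad false \wedge false = false $$
The logical operation $\vee = OR$ is the coproduct in the poset $L$. Indeed, computing coproducts (pushouts under the initial object $false$) in $L$ gives
$$ true \vee true = true , \quad true \vee false = true , \quad false \vee true = true , \quad false \vee false = false $$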
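The logical operation $\neg = NOT$ is given by the internal hom into the initial object in $L$:
$$ \neg : a \mapsto hom(a, false) $$
We find the value of the internal hom by its defining adjunction. For $hom(true,false)$ we have
$$ Hom_L(true, hom(true,false)) \simeq Hom_L(true \times true, false) = Hom_L(true, false) = \emptyset $$
and
$$ Hom_L(false, hom(true,false)) \simeq Hom_L(false \times true, false) = Hom_L(false, false) = {*} $$
from which we deduce that
$$ hom(true, false) = false $$
Similarly for $hom(false,false)$ we have
$$ Hom_L(true, hom(false,false)) \simeq Hom_L(true \times false, false) = Hom_L(false, false) = {*} $$
and
$$ Hom_L(false, hom(false,false)) \simeq Hom_L(false \times false, false) = Hom_L(false, false) = {*} $$
from which we deduce that
$$ hom(false, false) = true $$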
This way all the familiar logical operations are recovered from the internal logic of the topos Set.
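The same computation can be mimicked mechanically. In the sketch below, the two-element poset $L$ is modelled by Python booleans with `False <= True`; meet, join, and internal hom are computed purely from their universal properties and then compared with the usual truth tables. The function names are hypothetical.

```python
# The two truth values, ordered by false <= true.
L = [False, True]

def leq(a, b):
    """The order relation of L: there is a morphism a -> b iff a implies b."""
    return (not a) or b

def greatest(elems):
    """Greatest element of a non-empty list of truth values."""
    return next(g for g in elems if all(leq(x, g) for x in elems))

def least(elems):
    """Least element of a non-empty list of truth values."""
    return next(g for g in elems if all(leq(g, x) for x in elems))

def meet(a, b):
    """Product in L: the greatest lower bound of a and b."""
    return greatest([c for c in L if leq(c, a) and leq(c, b)])

def join(a, b):
    """Coproduct in L: the least upper bound of a and b."""
    return least([c for c in L if leq(a, c) and leq(b, c)])

def hom(a, b):
    """Internal hom via its adjunction: c <= hom(a, b) iff meet(c, a) <= b."""
    return greatest([c for c in L if leq(meet(c, a), b)])

def neg(a):
    """Negation as internal hom into the initial object false."""
    return hom(a, False)

for a in L:
    print(a, neg(a), [(b, meet(a, b), join(a, b)) for b in L])
```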
Let $X$ be a topological space and $Op(X)$ its category of open subsets and $Sh(X) := Sh(Op(X))$ the Grothendieck topos of sheaves on $X$.
We discuss the internal logic of this sheaf topos (originally Tarski, 1938).
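The terminal object is the sheaf represented by $X$: the one that is constant on the one-element set
$$ X : U \mapsto {*} $$
The subobjects of this object are the representable presheaves
$$ V : U \mapsto \begin{cases} {*} & \text{if } U \subset V \\ \emptyset & \text{otherwise} \end{cases} $$
for $V \in Op(X)$.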
In the presheaf topos $PSh(Op(X))= Func(Op(X)^{op},Set)$, the subobjects of $1$ are arbitrary sieves in $Op(X)$, not just representables. For instance, for any two open sets $U$ and $V$ there is a sieve consisting of all open sets contained in either $U$ or $V$, which doesn’t necessarily contain $U\cup V$. It’s only in the sheaf topos $Sh(X)$ that the representables are precisely the subobjects of $1$.
The poset of subobjects formed by these is just the category of open subsets itself:
$$ Sub(X) \simeq Op(X) $$
The logical operation $AND$ is the product in $Op(X)$: this is the intersection of open subsets.
The logical operation $OR$ is the coproduct in $Op(X)$: this is the union of open subsets.
The internal hom in $Op(X)$ is given by
$$ hom(U,V) = (U^c \cup V)^\circ $$
(the interior of the union of the complement of $U$ with $V$).
So negation is given by sending an open subset to the interior of its complement:
$$ \neg : U \mapsto hom(U, \emptyset) = (U^c)^\circ $$
In particular we find that in the internal logic of $Sh(X)$ the law of the excluded middle fails in general, as in general we do not have that
$$ \neg U \vee U = true $$
because $\neg U \vee U = (U^c)^\circ \cup U = X \backslash \partial U$ is the total space $X$ without the boundary (frontier) of $U$, and not $true = X$, all of the total space.
Thus, the internal logic of this sheaf topos is (in general) intuitionistic logic. As remarked above, this is the case in many toposes.
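A minimal finite example makes the failure explicit. The sketch below assumes the Sierpiński space, with points `0` and `1` and open sets $\emptyset$, $\{1\}$, $X$, as a hypothetical choice of $X$; implication is computed as the interior of the union of the complement of $U$ with $V$, and negation as implication into the empty set.

```python
# Assumed example: the Sierpinski space X = {0, 1} with opens {}, {1}, X.
X = frozenset({0, 1})
opens = [frozenset(), frozenset({1}), X]

def interior(S):
    """Largest open subset contained in S (union of all opens inside S)."""
    result = frozenset()
    for U in opens:
        if U <= S:
            result |= U
    return result

def implies(U, V):
    """Heyting implication in Op(X): interior of (complement of U) union V."""
    return interior((X - U) | V)

def neg(U):
    """Negation: interior of the complement."""
    return implies(U, frozenset())

def heyting_adjunction_holds():
    """Check W & U <= V iff W <= implies(U, V) for all opens."""
    return all(((W & U) <= V) == (W <= implies(U, V))
               for W in opens for U in opens for V in opens)

U = frozenset({1})
excluded_middle = neg(U) | U
double_negation = neg(neg(U))
print(sorted(excluded_middle), sorted(double_negation), heyting_adjunction_holds())
```

Here $\neg U$ is empty, so $\neg U \vee U = U$ is a proper subset of $X$, and $\neg\neg U = X$ differs from $U$.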
Most books on topos theory develop some internal logic, at least in the context of a topos. For example:
Saunders Mac Lane, Ieke Moerdijk, Sheaves in Geometry and Logic
Goldblatt, Topoi: the categorial analysis of logic
Part D of
is comprehensive.
The book
works in the even more general context of fibrations, allowing us to associate to each object $A$ an arbitrary poset instead of $Sub(A)$.
The book
is arguably all about this subject (although you wouldn't know it until about Chapter VIII), but from a different perspective. In particular, Taylor allows us to replace having all pullbacks with pullbacks along a pullback-stable class of display morphisms.
A discussion of dependent type theory as the internal language of locally cartesian closed categories is in
The observation that the poset of open subsets of a topological space serves as a model for intuitionistic logic is apparently originally due to Tarski (1938).