There are many students and researchers outside of category theory who have heard about categorical approaches to topics in mathematical foundations, and are eager to learn more. They may have heard for instance about “categorical set theory” such as Lawvere’s ETCS (the Elementary Theory of the Category of Sets), a theory about certain types of toposes. And they may have heard that topos theory itself can be viewed as a “generalized set theory”, different in kind from ordinary set theory in that (1) it is “structural”: its basic concepts are invariant with respect to isomorphism (so that what sets are “made of” exactly is not of basic importance; what is important is how sets are related to each other through functions and relations), and (2) topos theory is more flexible and general than ordinary set theory; indeed there is a multiverse of toposes, set-like universes which provide native environments for expressing and concentrating and simplifying particular forms of mathematical practice, including forms familiar from set theory like ultrapowers, forcing extensions, Boolean-valued models, etc., but also “petit” toposes arising from spaces, or classifying toposes as arising in algebraic geometry, toposes for synthetic differential geometry, realizability toposes for recursive function theory, … This is exciting, and consequently there is a demand and market for introductory texts on topos theory.
However, such readers may be dismayed by how technically demanding the standard introductory accounts of topos theory turn out to be. A student who turns for example to the (deservedly) well regarded graduate text Sheaves in Geometry and Logic by Mac Lane and Moerdijk, to learn about “First properties of elementary topoi” (Chapter IV), may be surprised by how much categorical background is actually required. For example, even to construct very basic things like finite coproducts and quotients, the reader is expected to become familiar with properties of monadic functors, the crude monadicity theorem, and Beck-Chevalley conditions, none of which is likely to be familiar to those outside category theory, and certainly not the type of thing one usually associates with “set theory”. One could understandably walk away believing that topos theory is something only for specialists, nothing at all like set theory but some other abstruse categorical development. And to set theorists who were told that topos theory and various offshoots offer an alternative and viable categorical approach to set-theoretic foundations, this situation may seem totally out of hand, even ridiculous.^{1}
Perhaps even for some category theorists, the connection between such standard techniques and the naive set-theoretic principles we grew up with is less than straightforward. Coproducts are basically just disjoint unions, right? Do we need recondite and indirect arguments to prove their existence in an elementary topos? The standard approach (based on elegant but slightly tricky arguments first published by Paré;) is to show that the contravariant power (i.e., power set, power object) functor on a topos is monadic. The monad itself is the double power functor, and the monadicity theorem requires consideration of two or three iterations of the monad, so prima facie up to six applications of the power functor are involved in the construction of finite colimits. (In ordinary mathematics it’s somewhat rare to go beyond three.) Even if one is convinced that the construction works, it still doesn’t give a very “common sense” way to view coproducts or coequalizers. A result is that one might wind up treating colimits in an elementary topos as sitting somewhere in a black box: undeniably there, but at a remove from clear intuitive view.^{2} In any case, one has to work to make this approach seem natural or “obvious”: an uncomfortable situation, particularly in a classroom setting.
My PhD thesis adviser was Myles Tierney; he and Bill Lawvere were the founders of elementary topos theory. Myles and I would sometimes talk about these things, where he would express longing for a “naive topos theory” just as viable and available for use by working mathematicians as ordinary naive set theory. On one occasion – either in his office or during a short elevator ride to his office in Hill Center – he outlined for me a program for obtaining all the exactness properties of a topos (i.e., the finitary hypotheses in Giraud’s theorem characterizing Grothendieck toposes), in a manner more elementary and natively set-theoretic than the usual accounts I knew from graduate textbooks. This conversation, which occurred decades ago, has been at the back of my mind ever since.
The goal of these notes is to try to reproduce and fill in details of Myles’s brief outline to me. (It was, in my possibly faulty memory, very brief indeed – it took him perhaps as few as 90 seconds to get it across.) I don’t know if the approach I will take here is exactly what he had in mind, or whether it will have more appeal for many who have struggled with the standard treatments. But at the very least, I hope to prove existence of colimits in a topos (as well as all the other basic properties) in a way closer to common sense reasoning, eschewing technical monadicity theorems and such, and using formulas and constructions that appeal directly to naive set theory intuitions.
These notes will nevertheless give a self-contained account of all of the standard categorical properties of an elementary topos, summarized by saying that a topos is a locally cartesian closed Heyting pretopos (in particular an effective regular extensive category with finite colimits, covering all those conditions of Giraud’s theorem that are “elementary”). Naturally, while my goal has been to keep this within a manageable number of pages and stick to common sense intuitions, I must make robust use of the language of categorical algebra to carry out this program.
A central result of elementary topos theory, and certainly of this account, is the slice theorem: if $\mathbf{E}$ is a topos, then so is the slice category $\mathbf{E}/X$ over any object $X$. (Freyd and Scedrov dub this “the fundamental theorem of topos theory”.) These notes have been arranged so that, save a few preliminary observations, almost everything flows from this one result. However, the proof might look a bit hard-core, because I couldn’t resist the urge to prove a more powerful but unquestionably even more technical-looking result: that the category of coalgebras of a pullback-preserving comonad on a topos is again a topos. I justify this decision on the grounds that it covers in one fell swoop both the slice theorem and the theorem about left exact comonads, and is probably not as well-known as it might be (it’s not in Mac Lane and Moerdijk for instance; the only textbook source I know of is Johnstone’s “Elephant”). But I also give this result for personal reasons: it was something else Myles and I had fun working out (again during my grad school days, sometime around 1987 perhaps), before we learned it was already known at least to a few people.
The reader can skip over the proof if she likes, or refer back to it as desired. The proof itself is in my opinion not so important as the implications. The main mathematical consequence is the local cartesian closure of toposes and more particularly the stability of colimits under pullback, which is enormously important. More philosophically, the moral of the slice theorem is it provides a theoretical basis, a methodological bridge, for crossing from naive set theory over to topos theory, and particularly for developing the internal logic of a topos.
Generally, the method starts with writing down set theory propositions in terms of ordinary elements (formulated categorically, as points $x: 1 \to X$). Now, the slice theorem guarantees that whatever is provable for a general topos $\mathbf{E} \simeq \mathbf{E}/1$ is also provable for a slice $\mathbf{E}/U$, and this allows us to relativize statements and constructions: statements which were formulated in terms of ordinary points $1 \to X$, may now be restated and proven for more general “points” $U \to X$ over arbitrary domains $U$. Finally, by instantiating at the “generic point” $1_X: X \to X$, a maneuver reminiscent of the Yoneda lemma, one arrives at the “right” corresponding construction or proposition for topos theory, one that is stable and internally correct.
With these points in mind, we may begin our account proper, starting with the systematic construction of the internal logic.^{3} At first, the “logic” is quite parsimonious: it’s just equational logic, or logic expressible in terms of products and equalizers. Bit by bit one constructs internal logical operators in a topos, starting with the “negative” operators $\wedge$, $\Rightarrow$, and $\forall$. This gives us enough to write down a formula for internal intersection $\bigcap$, a crucial step. With this step, we see that lattices of subobjects are internally complete (closed under arbitrary internal meets), and therefore also internally cocomplete (closed under internal joins), and this puts us on the path to constructing logical operators on the “positive formula” side: $\vee$, $\exists$, $\bigcup$.
Finally we put the internal logic to use to construct finite colimits. The problem is essentially figuring out how to exploit “local colimits” (joins in subobject lattices) to create “global colimits”. For example, coproducts are obtained by figuring out how to create disjoint unions. (This approach, together with stability, makes automatic the fact that toposes are extensive categories.) For coequalizers, the key turns out to be constructing first the image factorization, whence we proceed essentially by a series of familiar set-theoretic maneuvers that guide us towards constructing coequalizers of equivalence relations. We’ll save the punch lines for later, but the fact that colimits are stable leads quickly to the conclusion that toposes are effective regular categories, as a natural by-product, and with this we may conclude our account.
In theory this account could be read by someone with only slight prior exposure to topos theory, or even just a decent course in basic category theory, but the readers I am actually most interested in reaching are those who already have some experience with topos theory as student and perhaps also as lecturer^{4}, and who share my feeling that there is indeed something formidable about the standard developments, and that a different approach might be worth a shot. I’m hoping this article could even serve as a detailed source of working notes for someone who is considering teaching elementary topos theory.
As intimated above, the development I have undertaken here owes a great deal to my PhD adviser Myles Tierney, to whom I dedicate this work. I also thank Tom Leinster and David Roberts for expressing their interest in this approach, which in the end encouraged me to finish writing this up.
We adopt the power-set or power-object definition of elementary topos:
A topos is a finitely complete category $\mathbf{E}$ such that for every object $A$ there is another object $P A$ and a “relation” $\ni_A \hookrightarrow P A \times A$ that is universal: every relation $j: R \hookrightarrow B \times A$ is a pullback
for some unique map $\chi_j: B \to P A$, called the characteristic map of the relation $j$.
Intuitively, the power-set definition of topos concerns the interplay between categories of sets and relations and the category of sets and functions, which maps faithfully into sets and relations: it says that this faithful “inclusion” functor
has a right adjoint $P$. Here “relations” from $B$ to $A$, denoted $R: B \nrightarrow A$, are by definition subobjects of $B \times A$, i.e., isomorphism classes of monomorphisms into $B \times A$. The inclusion functor $i$ takes a function $f: X \to Y$ to its corresponding functional relation $i(f) = (X \stackrel{\langle 1_X, f \rangle}{\hookrightarrow} X \times Y$).
With the hindsight (foresight?) that toposes are indeed regular categories, with the consequence that relations may be decently composed and $Rel(\mathbf{E})$ forms a category, this observation becomes a precise definition: a topos is a regular category such that $i: \mathbf{E} \to Rel(\mathbf{E})$ has a right adjoint. The components of the counit of $i \dashv P$ are relations of the form
with the intent that any relation $R: i(B) \nrightarrow A$ factors uniquely through $\ni_A$ via the underlying relation of a function $\chi_R: B \to P A$:
Exploring this further, the unit of the adjunction $A \to P i A$, which is the classifying map of the identity relation $1_A: A \nrightarrow A = \delta_A: A \to A \times A$, is the singleton function $\sigma_A: A \to P A$. Looking further ahead, the multiplication of the monad $P$ (now suppressing $i$‘s) is the classifying map $P P A \to P A$ of the composite relation:
Set-theoretically, the composite relation is $\{(\mathcal{F}, a): \exists_{F: P A} \mathcal{F} \ni F \wedge F \ni a\}$; its classifying map is the union operation $\bigcup_A: P P A \to P A$.
Even without the regularity assumption, we can interpret the power object axiom as an elementary (i.e., $Set$-free) reformulation of a statement saying that functors of type $Rel(i-, A): \mathbf{E}^{op} \to Set$ are representable.
All the basic operations pertaining to Algebraic Set Theory – power set, membership, union, singleton – thus find their places as structural data of the adjunction $i \dashv P$; thus our definition of topos can be seen as providing an explanatory framework for their introduction. It’s a slightly different set-up in AST since there a ZF universe $V$ is a large poset in which the singleton operation is of type $s: V \to V$, and the membership relation $P_{small} V \nrightarrow V$ is derived from $s$ and the partial ordering of $V$. The comparison and difference is probably worth exploring further, but we won’t carry this out here.
The power $P 1$ of the terminal object $1$ plays a special role; it is familiarly known as the subobject classifier $\Omega$.
The domain $\ni_1$ of the universal relation $\ni_1 \hookrightarrow P 1 \times 1$ is itself terminal.
A map $X \to \ni_1$, that is, a map $\chi : X \to P(1)$ which factors through $\ni_1 \hookrightarrow P(1)$, corresponds exactly to a pulled-back subobject $\chi^\ast(\ni_1) \subseteq X$ that contains all of $X$ (i.e., the subobject $1_X : X \subseteq X$). Since the only such subobject is $1_X$, there is exactly one map $X \to \ni_1$.
Thus the universal subobject is of the form $t: 1 \to \Omega$. An alternative way of stating the universality is that every monic $i: A \hookrightarrow X$ is the equalizer of a diagram
for some unique choice of map $\chi_i: X \to \Omega$. This has several immediate consequences:
Every monomorphism in a topos is regular (an equalizer of some parallel pair).
A topos is a balanced category.
Every epic mono is an epic equalizer, and such must be an isomorphism.
Any two epi-mono factorizations of a map (if such exist) are canonically isomorphic.
Suppose $i p = j q$ where $p, q$ are epic and $i, j$ are monic. Since $j$ is regular, it is the equalizer of some parallel pair $f, g$ as in the diagram
so that $f i p = f j q = g j q = g i p$, whence $f i = g i$ since $p$ is epic, whence $i$ factors through $j$ as $j$ is the equalizer: $i = j k$ for some $k: B \to C$. Then also $k p = q$ since $j k p = i p = j q$ and $j$ is monic. We have that $k$ is monic since $i$ is, and $k$ is epic since $q$ is. Thus $k$ is an isomorphism.
There is of course another familiar definition of elementary topos, in which cartesian closure and the subobject classifier $\Omega$ are basic. To get from Definition 1 to the other definition involves some slightly fiddly details (the other direction is easier), but basically the cartesian closure comes about in two steps:
so that $P(A \times B)$ satisfies the universal property expected of $(P A)^B$. In other words, cartesian closure in $\mathbf{E}$ is closely related to the compact closure of $Rel(\mathbf{E})$. The detailed proof of this step is easily extracted from this short computation.
In particular, we have a natural isomorphism $P A \cong (P 1)^A = \Omega^A$. Read this way, we have contravariance in the argument $A$, i.e., $A \mapsto P A$ is the object part of the contravariant power object functor.
where the subdiagram consisting of the three objects $P A$, $P 1$, $1$ and the two morphisms between them admit exponentials with exponent $B$. It is easy to verify that the pullback of the exponentiated subdiagram gives the exponential $A^B$.
For this we need to show that the singleton map $\sigma A$ is monic.
The map $\sigma A = \chi_{\delta A}: A \to P A$ is monic.
This is obvious from the point of view that the unit $\sigma: 1 \to P i$ of an adjunction $i \dashv P$ is monic whenever the left adjoint $i$ is faithful. The proof below is a more formal working out.
Given $f \in \hom(X, A)$, the map $\chi_{\delta A} \circ f: X \to P A$ classifies the subobject of $X \times A$ obtained by pulling back $\delta: A \to A \times A$ along $f \times 1_A$. This is
Thus, if for $f, g \in \hom(X, A)$ we have $\chi_{\delta A} \circ f = \chi_{\delta A} \circ g$, then the subobjects coincide: $\langle 1_X, f \rangle = \langle 1_X, g \rangle: X \to X \times A$. This forces $f = g$.
A major qualitative distinction between the vanilla category of sets $Set$ and a general topos $\mathbf{E}$ is that sets $X$ are determined by their points $1 \to A$ (and functions are determined by what they do to points). For a general topos $\mathbf{E}$, this is much too restrictive; for example, sheaves of sets generally form a topos, but a general sheaf is hardly determined by its points $1 \to A$ (which are global sections of $A$). In fact, $A$ may be a highly nontrivial sheaf and yet not have any such points at all: consider for example the principal $\mathbb{Z}/(2)$-bundle over $S^1$ associated with the Möbius strip, living in $Sheaves(S^1)$, which has no global sections.
Instead, in accordance with the Yoneda lemma, objects $A$ are determined up to isomorphism by maps $X \to A$ into $A$, which one sometimes calls “generalized points”. Thus, instead of constant points $1 \to A$, we should consider variable points $X \to A$ with variation over other domains $X$. But in fact these can be reconstituted again as ordinary points $1 \to A$ if we switch our point of view to a universe of “variable objects” $\mathbf{E}/X$. That is to say: an object $A$ of $\mathbf{E}$ can be reinterpreted along a map $X^\ast: \mathbf{E} \to \mathbf{E}/X$ which sends $A$ to $\pi: X \times A \to X$. The generic point $1$ is thereby interpreted in $\mathbf{E}/X$ as $X^\ast 1 = 1_X: X \to X$, and then a point or global section $X^\ast 1 \to X^\ast A$ in $\mathbf{E}/X$ is exactly tantamount to a generalized point $X \to A$ back in $\mathbf{E}$.
So we can, in topos theory, perform arguments and constructions naively as if with “points” $1 \to A$, provided that we consider them uniformly across all slices $\mathbf{E}/X$.^{5}
So our first major task is to prove the slice theorem, which says that if $\mathbf{E}$ is a topos, then also the slice $\mathbf{E}/X$ of objects “varying over $X$” is a topos, and moreover that the interpretation functor $X^\ast: \mathbf{E} \to \mathbf{E}/X$ preserves all conceivable topos structure, i.e., is a “logical functor” and has adjoints on both sides (so that any limits or colimits that $\mathbf{E}$ has are preserved under reinterpretation).
Notice that this point of view signals a major departure from ZFC since, for example, a slice of a model of ZFC (more precisely, a category $Set$ formed from a model of ZFC) is not itself a model of ZFC; rather it is some Boolean-valued model. Of course, toposes are far more general still than mere slices of ZFC models, but the takeaway here is that the notion of topos is closed with respect to moving to universes of greater variability, while still retaining fundamental set-like features.
It is not hard to guess what the topos structure of $\mathbf{E}/X$ looks like, based on the special case $\mathbf{E} = Set$. In this special case, an object $f: Y \to X$ may be regarded as equivalent to a function $X \to Set$ which sends $x \in X$ to the fiber $Y_x = f^\ast(\{x\})$, and indeed $Set/X$ is equivalent to the category $Set^X$ of functors $X \to Set$ (from $X$ as discrete category). Here $Set^X \simeq Set/X$ carries obvious pointwise-defined topos structure, so that the power object of $f: Y \to X$ is the fiberwise power set $x \mapsto P(Y_x)$. Thus $P(Y_x)$ is the fiber over $x$ of the power object.
Assembling these fibers into a total set $P_X(Y, f) \to X$ over $X$, we may describe elements of $P_X(Y, f)$ as pairs $(x \in X, S \in P(Y))$ such that $S \subseteq Y_x = f^\ast(x)$. This inclusion may be expressed as an equation: consider the function $\langle f, 1 \rangle: Y \to X \times Y$; then the equation is
and so if we like, we can describe the elements of the power object $P_X(Y, f)$ as fixed points $(x, S) \in X \times P(Y)$ of the (idempotent) operator $X \times P Y \to X \times P Y$ defined by the composite
Considered as a set $P_X(Y, f) \to X$ over $X$, such a fixed point $(x, S)$ of course maps to $x$.
This in fact is how we proceed below (except we generalize even further).
As mentioned in the introduction, the slice theorem and the theorem on lex (= left exact) comonads on a topos can be proved together in one fell swoop. We observe that the functor $X \times -: \mathbf{E} \to \mathbf{E}$ carries a comonad structure with comultiplication induced from $\delta: X \to X \times X$ and counit induced from $!: X \to 1$. A coalgebra of this comonad is a map $\langle f, 1_A \rangle: A \to X \times A$, tantamount to a single datum $f: A \to X$, and coalgebra maps $(A, f) \to (B, g)$ boil down to maps $h: A \to B$ making the obvious triangle commute. In other words, the category of coalgebras of $X \times -: \mathbf{E} \to \mathbf{E}$ is equivalent to the slice $\mathbf{E}/X$.
Or to say it another way, the forgetful functor $\Sigma_X: \mathbf{E}/X \to \mathbf{E}$ is comonadic, with right adjoint the functor $X^\ast: \mathbf{E} \to \mathbf{E}/X$ that takes an object $Y$ to $\pi_1: X \times Y \to X$. One consequence is that $\Sigma_X: \mathbf{E}/X \to \mathbf{E}$ preserves and reflects colimits.
This comonad $X \times -$ is not left exact, but it does preserve pullbacks. The theorem we prove is this:
If $G: \mathbf{E} \to \mathbf{E}$ is a pullback-preserving comonad acting on a topos $\mathbf{E}$, then the category of coalgebras $\mathbf{E}_G$ is also a topos.
It suffices to produce finite limits and power objects in $\mathbf{E}_G$. Pullbacks in $\mathbf{E}_G$ are created by the forgetful functor $\mathbf{E}_G \to \mathbf{E}$ because $G$ preserves pullbacks, and $G 1$ is terminal in $\mathbf{E}_G$ if $1$ is terminal in $\mathbf{E}$, since the right adjoint $G-: \mathbf{E} \to \mathbf{E}_G$ preserves limits. So $\mathbf{E}_G$ is finitely complete.
Since $G: \mathbf{E} \to \mathbf{E}$ preserves pullbacks, it also preserves monos. In particular, if $\ni_Y \hookrightarrow P Y \times Y$ denotes the universal relation, we have a chain of monos
where the middle two objects are pullbacks of the diagram
in $\mathbf{E}$. The composite of the monos in (1) names a subobject of $G P Y \times G Y$, which is classified by a map
in $\mathbf{E}$.
Let $\delta: G \to G G$ denote the comultiplication and $\varepsilon: G \to 1_\mathbf{E}$ the counit of $G$, and suppose $Y$ carries a $G$-coalgebra structure $\eta: Y \to G Y$. Define the object $P_G Y$ in $\mathbf{E}_G$ to be the equalizer in $\mathbf{E}_G$ of the following pair of coalgebra maps from the cofree coalgebra $G P Y$ to itself:
where $P$ is the contravariant power object functor (whence the direction $P \eta: P G Y \to P Y$). We will show that $P_G Y$ is the power object of $Y$ in $\mathbf{E}_G$.
So, let $(X, \theta: X \to G X)$ be a coalgebra. We must show that coalgebra maps $f: X \to P_G Y$ correspond precisely to subcoalgebras of the product $X \times Y$ in $\mathbf{E}_G$ (whose underlying object in $\mathbf{E}$ is the pullback $X \times_{G 1} Y$). The remainder of the proof will be proven with the help of two elementary lemmas.
Suppose $G: \mathbf{E} \to \mathbf{E}$ is a comonad which preserves pullbacks, and let $(S, \gamma: S \to G S)$ be a $G$-coalgebra. Then a pair $(k: R \to S, \beta: R \to G R)$ of arrows of $\mathbf{E}$ defines a subcoalgebra of $S$ precisely when $k$ is monic and the compatibility square
commutes.
First, $G k$ is monic assuming $k$ is monic, since $G$ preserves pullbacks. Therefore $\beta$ is uniquely determined. We must show that $\beta: R \to G R$ satisfies the axioms for a coalgebra structure, i.e., that
Since $k$ is monic, the second equation follows from the equation
but this equation holds since
using in turn naturality of $\varepsilon$, the square (3), and the fact that $S$ is a coalgebra.
Again, $G$ preserves pullbacks, so $G G k$ is monic assuming $k$ is. So the first equation of (4) follows from
and this equation holds from a series of equations
which completes the proof.
In the notation of Lemma 2, if $k: R \to S$ is a monic coalgebra map, then the square
is a pullback.
Suppose we have a commutative square
First, we prove that $f$ equalizes the pair $\delta R, G\beta: G R \to G G R$. We have a series of equations
using, in turn, (5) (and functoriality of $G$), (6), the coassociativity on the coalgebra $S$, (6) again, and naturality of $\delta$. Comparing the first and last terms of this series, and using the fact that $G G k$ is monic (because $k$ is monic and $G$ preserves pullbacks), we obtain $\delta R \circ f = G\beta \circ f$.
Thus $f$ factors through the equalizer $\beta: R \to G R$ of $\delta R, G \beta: G R \stackrel{\to}{\to} G G R$. Write $f = \beta \circ p$. We then have
and since $\gamma$ is monic (because, e.g., $\varepsilon S$ retracts it), we infer that $k \circ p = g$. Moreover, this $p$ is unique since $k$ is monic. Thus, we have shown that any square (6) factors uniquely, via the map $p: X \to R$, through the square (5); this completes the proof.
We return to the proof of Theorem 1; it remains to show that the equalizer $j: P_G(Y) \to G P Y$ of the pair of maps in (2) is the power object of the coalgebra $(Y, \eta: Y \to G Y)$. Let $(X, \theta: X \to G X)$ be a coalgebra, and $f: X \to P_G(Y)$ a coalgebra map. Then $g = j f: X \to G P Y$ equalizes the maps in (2). Moreover, since $G P Y$ is cofree, we have that to each $G$-coalgebra map $g: X \to G P Y$, there exists a unique $h: X \to P Y$ in the topos $\mathbf{E}$ such that
Thus $g: X \to G P Y$ in $\mathbf{E}_G$ or $h: X \to P Y$ in $\mathbf{E}$ corresponds to a subobject $i: R \hookrightarrow X \times Y$ in $\mathbf{E}$. We need to check that the condition that $g$ equalizes the pair of maps in (2) corresponds precisely to the condition that $R$ is a subcoalgebra of $X \times_{G 1} Y$ (the product of $X$ and $Y$ in $\mathbf{E}_G$). Or, taking into account Lemmas 2 and 3: precisely to the condition that there is a pullback diagram in $\mathbf{E}$ of the form
where $\gamma$ is the coalgebra structure on $X \times_{G 1} Y$, and $k$ is monic.
That $g: X \to G P Y$ equalizes the pair of maps in (2) translates to a condition on $h: X \to P Y$, that the following diagram commutes:
Or, that the subobject $i: R \hookrightarrow X \times Y$ equals the subobject classified by
First, remembering how $\phi$ was defined, the subobject classified by $\phi \circ G h: G X \to P G Y$ appears as the left-hand column in a composite pullback
Then (postcomposing $\phi \circ G h$ by $P\eta$ and precomposing with $\theta$), the condition is that the subobject of $X \times Y$ classified by $P\eta \circ \phi \circ G h \circ \theta$, given by the left-hand column in the large pullback diagram below, matches the subobject $i: R \hookrightarrow X \times Y$:
This condition amounts to saying that the top left square is a pullback (and that $i: R \to X \times Y$ factors through the inclusion $X \times_{G 1} Y \hookrightarrow X \times Y$), since the other three squares are obviously pullbacks already if $G$ preserves pullbacks. That the top left square is a pullback is tantamount to having a pullback (through which the top left square factors)
precisely as in (7). Thus the proof of Theorem 1 is complete.
This theorem guarantees that slices of toposes are toposes. It also guarantees that categories of internal presheaves of a topos are toposes, which is another level of variability. To outline the way this works: an internal category is identified with a monad in a bicategory of internal spans, and internal presheaves over that category with algebras of that monad. This monad has a right adjoint, and such carries a structure of left exact comonad, whose coalgebras are naturally identified with algebras of the monad. Thus in summary, the category of presheaves $\mathbf{E}^{C^{op}}$ of an internal category $C$ is comonadic over the slice $\mathbf{E}/C_0 \times C_0$, and the theorem above applies. Of course all of this is well covered in the standard texts of topos theory (or at least in the good ones!), so we won’t go into it further here.
This theorem can be generalized still further, in such way to specialize also to the result that the category of algebras of a lex idempotent monad on a topos is again a topos. This covers a further level of generality, by allowing passage from toposes of presheaves to categories of sheaves, which form such a category of algebras. Thus this omnibus theorem covers in one fell swoop three major theorems of topos theory, covering the passage from toposes to slices, from slices to presheaves, and from presheaves to sheaves. Such an omnibus theorem may be developed either by using Toby Kenney’s theory of diads on a topos, or the present author’s “modal operators” on a topos.
The hard part is over. Much simpler is the observation that in the slice topos case, the right adjoint $X^\ast: \mathbf{E} \to \mathbf{E}/X$ to the forgetful functor $\Sigma_X: \mathbf{E}/X \to \mathbf{E}$ is a logical functor, i.e., preserves finite limits and power objects.
Of course it preserves finite limits, being a right adjoint. The other claim is that for any object $Y$ of $\mathbf{E}$, the power object of $X^\ast Y$, i.e. of the object $\pi_1: X \times Y \to X$ in $\mathbf{E}/X$, is $\pi_1: X \times P Y \to X$. This can be checked directly: if $f: A \to X$ is any object in $\mathbf{E}/X$, then the product of $f: A \to X$ and $\pi_1: X \times Y \to X$ is the composite $A \times Y \stackrel{\pi_1}{\to} A \stackrel{f}{\to} X$, and any map into the claimed power object,
has as its essential datum a map $\chi: A \to P Y$ in $\mathbf{E}$, corresponding to subobjects of $A \times Y$ (in $\mathbf{E}$, but which we may view as lying over $X$ as above). Thus such maps are in natural bijection with subobjects of $A \times Y$ as an object in $\mathbf{E}/X$, as was to be shown.
We see now that for a topos $\mathbf{E}$, each slice $\mathbf{E}/X$ is cartesian closed (and of course finitely complete). We say that $\mathbf{E}$ is locally cartesian closed.
If $\mathbf{E}$ is finitely complete and cartesian closed, then the functor $X^\ast: \mathbf{E} \to \mathbf{E}/X$ has a right adjoint $\Pi_X: \mathbf{E}/X \to \mathbf{E}$.
For an object $f: Y \to X$ of $\mathbf{E}/X$, let $\Pi_X (f)$ be the object of sections of $f$, i.e., the equalizer of the diagram
Maps $A \to \Pi_X(f)$ in $\mathbf{E}$ are in natural bijection with maps $h: A \to Y^X$ such that $f^X \circ h = [1_X] \circ ! \circ h$. Using the adjunction $X \times - \dashv (-)^X$, such maps are in natural bijection with maps $\hat{h}: X \times A \to Y$ such that
or in other words, maps $X^\ast A \to f$ in $\mathbf{E}/X$.
Therefore $X^\ast: \mathbf{E} \to \mathbf{E}/X$, in addition to being a logical functor between toposes, is also a left adjoint and a right adjoint.
This may be relativized further, by observing that if $f: Y \to X$ is an object of the topos $\mathbf{E}/X$, then
and the functor $f^\ast: \mathbf{E}/X \to \mathbf{E}/Y$, which may be described as pulling back along $f$, is also a logical functor and has adjoints on each side, $\Sigma_f \dashv f^\ast \dashv \Pi_f$. A powerful consequence is that any colimits in $\mathbf{E}/X$, which are computed as in $\mathbf{E}$ by Remark 3, are preserved by pulling back along morphisms $f: Y \to X$. We say that colimits are stable under pullback.
For instance, once we prove that coproducts $X + Y$ exist, with monic and disjoint coproduct inclusions $X \to X + Y$ and $Y \to X + Y$, then armed with the fact that coproducts are stable under pullback, we obtain the result that toposes $\mathbf{E}$ are extensive categories.
Similarly, once we show that coequalizers exist, we infer (from their stability under pullback) that toposes are regular categories. As part of this, we will exhibit the (regular epi)-mono factorization of a map $f: X \to Y$.
One respect which sets topos theory apart from ordinary set theory is that a topos has its own form of “internal logic”. In the most familiar topos $Set$, this logic will be the one we are all familiar with: good old-fashioned logic with two truth values $true$ and $false$, formulas which are manipulated according to the laws of Boolean algebra, and quantification with the familiar meanings of “all” and “some”.
However, in working with a general topos, some of these expectations must be adjusted. In general there may be many truth values (which are points $1 \to \Omega$), and some of the cozy comforts of classical logic (the law of excluded middle for example) must be abandoned in general – not out of some whim or philosophical attraction to intuitionism, but because that’s simply the way things turn out to be in the inner world of a general topos.
For people whose experience with formal logic has been in terms of truth tables and the laws of Boolean algebra, and various reductions which say that the operations $\neg, \wedge, \forall$ are enough, there is inevitably a learning curve here. Readers of standard texts may be assisted by various formal devices which give rules for reasoning in a general topos, such as the Mitchell-Bénabou language and Kripke-Joyal semantics. All that is fine, but what I want to emphasize in these notes is what logic (broadly conceived) is really, to use the words of Lawvere: an interlocking system of certain types of adjoint functors. The sooner one thinks of the logical operations as no longer described by truth table assignments or Venn diagrams and the like but directly in terms of universal properties, the quicker the process will be.
Thus, internal logic is simply a reflection of the categorical properties one can infer about posets of subobjects ordered by inclusion, and functors (such as pullback functors) between them. For example, logical conjunction reflects meets which are cartesian products in $Sub(X)$; logical disjunction reflects joins which are coproducts in $Sub(X)$. Logical implication reflects how $Sub(X)$ is (as we shall see) cartesian closed. And quantification, according to one of the great lessons taught by Lawvere, is a reflection of there being an adjoint (on one side or the other) to pullback functors of the form $f^\ast: Sub(Y) \to Sub(X)$. The rules of inference for logic in a general topos are just those we can derive from such adjoint relationships and general results such as the slice theorem, which guarantees that pullback functors $f^\ast: \mathbf{E}/Y \to \mathbf{E}/X$ are logical functors with adjoints on both sides. Such rules will be developed as we go along.
In developing of the internal logic of a topos $\mathbf{E}$, a useful heuristic^{6} is that the subobject classifier $\Omega$ represents, more or less by definition, the contravariant functor $Sub: \mathbf{E}^{op} \to Set$ mapping an object $X$ to its set of subobjects $Sub(X)$, and mapping a morphism $f: X \to Y$ in $\mathbf{E}$ to the function $Sub(f) = f^\ast(-): Sub(Y) \to Sub(X)$ that sends a subobject $i$ of $Y$ to its pullback (aka inverse image) along $f$, as a subobject of $X$:
In more detail, the heuristic is first to interpret familiar logical operations, for example conjunction, as natural operations on subobjects, for example the intersection operator
which takes a pair of subobjects to their pullback. This is the external view.
Second, by the representability, this corresponds to a natural transformation
or to a natural transformation $\hom_\mathbf{E}(-, \Omega \times \Omega) \to \hom_\mathbf{E}(-, \Omega)$, which by the Yoneda lemma corresponds to a morphism in $\mathbf{E}$,
called “internal conjunction” or “internal meet”. This is the internal view. Basically one builds up the internal logic of a topos by playing the external and internal views off of each other, building up a repertoire of internal logical operations.
Of course there is nothing that formally implies a topos is locally small or well-powered or anything else that justifies considering functors to $Set$ and their representability; indeed, there should be no dependence on $Set$ if topos theory and its offshoots can serve as an independent formal foundations of mathematics. However, the heuristic is useful as a guide toward writing down formal proofs, and as a pedagogical device.
Building up the logical operations proceeds in a certain order: in some sense the definition of topos is naturally “biased” towards negative types (finite limits, exponentials) and so it is natural to begin with the correspondingly negative logical operations: $t = true$, $\wedge$, $\Rightarrow$, $\forall$, before tackling the positive operations: $\bot = false$, $\vee$, $\exists$.
Let us now work through this procedure to construct the internal meet in more detail. In the first place, the isomorphism $\hom_\mathbf{E}(-, \Omega) \stackrel{\sim}{\to} Sub(-)$ is given, à la Yoneda, by a representing or universal element in $Sub(\Omega)$. This of course is the universal subobject $t: 1 \hookrightarrow \Omega$.
Now, the external operation $\cap: Sub(X) \times Sub(X) \to Sub(X)$ is natural: given $f: X \to Y$, the diagram
commutes because $f^\ast$ preserves pullbacks. By representability and Yoneda, the natural transformation $\cap$ is classified by a map $\Omega \times \Omega \to \Omega$ produced by chasing an identity element as follows (now using $\mathbf{E}(a, b)$ as abbreviation of $\hom_\mathbf{E}(a, b)$):
Here $\pi_1^{-1}(t) = (1 \times \Omega \stackrel{t \times 1_{\Omega}}{\to} \Omega \times \Omega)$ and $\pi_2^{-1}(t) = (\Omega \times 1 \stackrel{1_{\Omega} \times t}{\to} \Omega \times \Omega)$; their intersection is $1 \times 1 \stackrel{t \times t}{\to} \Omega \times \Omega$. Thus the internal meet is $\wedge = \chi_{t \times t}: \Omega \times \Omega \to \Omega$. The basic properties of $\wedge$ (associativity, commutativity, idempotence) follow immediately from the corresponding properties of the external operation $\cap$ (which in turn are argued by applying universal properties of pullback). The unit of $\wedge$ is of course $t: 1 \to \Omega$, which is the maximal element (i.e., classifies the maximal subobject $1_\Omega$).
Now we construct an internal implication operator $\Rightarrow: \Omega \times \Omega \to \Omega$, corresponding to a natural external operation $\Rightarrow_X: Sub(X) \times Sub(X) \to Sub(X)$ on subobjects that is appropriately adjoint to $\wedge$, i.e., satisfies
for subobjects $A, B, C$ of $X$.
In this case it is not immediately obvious that there is such an external operation, but obviously it is unique if it exists, and one can try to work out what the corresponding internal operation would look like. We start by working our way to the external view: the internal operation $\Rightarrow$ would be the classifying map of the subobject $\Rightarrow^\ast(1 \stackrel{t}{\to} \Omega)$, and so we proceed to analyze this subobject.
A point $1 \stackrel{\langle u, v \rangle}{\to} \Omega \times \Omega$ belongs to (i.e., factors through) the subobject $\Rightarrow^{-1}(t) \hookrightarrow \Omega \times \Omega$ if and only if $t = u \Rightarrow v$. The same holds true for generalized points $X \stackrel{\langle u, v \rangle}{\to} \Omega \times \Omega$, where now “$t$” is interpreted by pulling back along $X \to 1$, i.e., we define $t_X = X \stackrel{!}{\to} 1 \stackrel{t}{\to} \Omega$: this is the universal subobject $t: 1 \to \Omega$ when interpreted in the topos $\mathbf{E}/X$. (Remember that the pulling-back functor $X^\ast: \mathbf{E} \to \mathbf{E}/X$ is logical.) In general, points $u, v: 1 \to \Omega$ in $\mathbf{E}/X$ correspond to maps $u, v: X \to \Omega$ in $\mathbf{E}$.
Here might be a good place to lay down a precept on interpreting set-theoretic language in topos theory. In set theory, we describe a set $A$ as consisting of points $a \in A$ or, leaning a little closer to categorical notation, as points $a: 1 \to A$. Basically we can say the same thing in topos theory, provided that we are willing to interpret points $a: 1 \to A$ consistently across all slices $\mathbf{E}/X$. Such a point $X^\ast 1 \to X^\ast A$ in the slice $\mathbf{E}/X$ is tantamount to a morphism $X \to A$ in $\mathbf{E}$, and it is reasonable to say that an object $A$ “consists of” (is determined by) such generalized points $X \to A$, in accordance with the Yoneda lemma. So, when we describe an subobject $A'$ of $A$ as consisting of points $a: 1 \to A$ such that (…), that’s what we mean.
Thus, working in a general slice $\mathbf{E}/X$, let us interpret the condition $t = u \Rightarrow v$, passing now to the external picture. If $[u]$ denotes the subobject of the terminal classified by the point $u$, then the external condition $[t] = [u] \Rightarrow [v]$ is equivalent to the condition $[t] \hookrightarrow [u] \Rightarrow [v]$ (since $[t]$ is maximal), which by the adjunction would be equivalent to $[t] \cap [u] \hookrightarrow [v]$, or just $[u] \hookrightarrow [v]$. Let us write this as $[u] \subseteq [v]$, in keeping with set-theoretic notation.
Thus we define $\Rightarrow: \Omega \times \Omega \to \Omega$ to be the classifying map of the subobject of $\Omega \times \Omega$ consisting of points $(u, v)$ such that $[u] \subseteq [v]$ (uniformly over all slices). This last condition coincides with $[u] = [u] \cap [v]$, and so to complete the construction we define the subobject $(\subseteq) \hookrightarrow \Omega \times \Omega$ to be the equalizer of the two maps
It remains to check that this works as advertised: that for points $u, v, w: 1 \to \Omega$ we have (externally)
(again, as always now, this is meant to be interpreted over all slice toposes). The condition on the right means that any (generalized) element $e$ of $[u]$ satisfies $[v(i e)] \subseteq [w(i e)]$, where $i$ is the inclusion $[u] \to 1$. It is sufficient to consider the generic element where $e = 1_{[u]}$, where we have a pullback
from which we see $[v(i e)] = [u] \cap [v]$. Thus the condition on the right is $[u] \cap [v] \subseteq [u] \cap [w]$; as this is equivalent to $[u] \cap [v] \subseteq [w]$, we are done.
So far we have dealt just with propositional logic, internalized in terms of operations $\Omega^n \to \Omega$ within the topos $\mathbf{E}$. These operations pull back to any slice $\mathbf{E}/X$, and externalize to operations on $Sub(X)$ over each individual object (or “type”) $X$.
In predicate logic, quantification involves operations that change the type: if $\phi: X \times A \to \Omega$ is a predicate of type $X \times A$, then quantification over $A$ produces a predicate of type $X$, e.g., $\forall_A \phi: X \to \Omega$. Externally, universal quantification is a natural operation $Sub(X \times A) \to Sub(X)$. Internally, universal quantification is given by a map
which classifies a subobject $\forall_A^\ast(t) \hookrightarrow P A$. An element $\phi: 1 \to P A$ belongs to (factors through) $\forall_A^\ast(t)$ if $t = \forall_A \phi$. In ordinary logic terms, this should mean $\phi(a) = t$ for all elements $a$ of $A$, i.e., that $\phi$ is constantly true over all of $A$: $\phi = t_A = (A \to 1 \stackrel{t}{\to} A)$. This leads us to conclude that $\forall_A^\ast(t) = t_A: 1 \hookrightarrow P A$, and therefore that
The fundamental property of universal quantification, originally observed by Lawvere, is as follows.
Universal quantification is right adjoint to pulling back. At the external level: for $f: X \to P Y$, let $[f]$ denote the subobject of $X \times Y$ classified by $f$. Then for predicates $\phi: X \to P A$ and $\psi: X \to P 1$, we have
At this point we probably have a variety of possible proofs, but in elementary category terms we have by definition a pullback
and so the condition $[\psi] \hookrightarrow [\forall_A \phi]$ is equivalent to commutativity of the diagram
By the adjunction $- \times A \dashv (-)^A$, such a diagram transforms to a commutative diagram
and this is clearly equivalent to the condition $A^\ast [\psi] \hookrightarrow [\phi]$.
The other half of Lawvere’s insight is that existential quantification is left adjoint to pulling back. We will come to this later.
One import of this insight is that quantification has a wider meaning than is found in traditional logic: whereas the traditional quantification $\forall_{a: A} \phi(x, a)$ has to do with a right adjoint to pulling back along a projection map $\pi_1: X \times A \to X$ (in $Set$), in principle we can consider a right adjoint to pulling back along any map $f: Y \to X$. One way to consider this is topos theory is that, working in a slice $\mathbf{E}/X$ and considering an object $f: Y \to X$ ….
Now we put some of the logic we have developed to use, defining an internal intersection operator
which externally takes a family $F \hookrightarrow P A$ to its intersection $\bigcap F \hookrightarrow A$.
To do this, we simply write down the appropriate logical formula for the relation $R \hookrightarrow P P A \times A$ to be classified by $\bigcap$: it consists of points $(F, a): \forall_{S: P A} (F \ni S) \Rightarrow (S \ni a)$. We remember here the precept that this formula is to be interpreted uniformly over all slices $\mathbf{E}/X$.
We construct coproducts in an elementary topos one does in naive set theory: as disjoint unions. Letting $X, Y$ be sets (objects in an ETCS category), a disjoint union of $X$ and $Y$ is a pair of monos
whose intersection in $Sub(Z)$ is empty (i.e., initial), and whose union or join in $Sub(Z)$ is all of $Z$. We will show that disjoint unions exist and are essentially unique, and that they satisfy the universal property for coproducts. We use the notation $X + Y$ for a disjoint union.
We first construct an initial object $0$ by taking the intersection of all subobjects of $1$, i.e., as the subobject $0 \hookrightarrow 1$ classified by the composite
$0$ is the minimal subobject of $1$.
Let $U \subseteq 1$ be any subobject, classified by $u = \chi_U: 1 \to P 1$. The singleton $\sigma u: 1 \to P P 1$ satisfies $\sigma u \leq t_{P 1}$ since $t_{P 1}$ is the internal maximum, and so $\chi_0 = \bigcap t_{P 1} \leq \bigcap \sigma u = u$.
$0$ is initial.
If $X$ is any object, then for arrows $f, g: 0 \rightrightarrows X$ the equalizer $E \to 0$ is a subobject of $0$ and therefore equals $0$ as a minimal subobject of $1$, so $f = g$. This proves uniqueness of arrows $0 \to X$. For existence, the pullback of the mono $\sigma_X: X \to P X$ in the diagram
gives a subobject of $0$ which must be $0$ again, so there is an arrow $0 \to X$ from the pullback, and we are done.
$0 \times X \cong 0$.
The functor $- \times X$ is a left adjoint by cartesian closure, and hence preserves initial objects.
The initial object $0$ is strict, i.e., any arrow $f: X \to 0$ must be an isomorphism. (Therefore, for any object $Y$, the arrow $0 \to Y$ is vacuously monic, and gives the minimum element of $Sub(Y)$.)
Given $f: X \to 0$, the arrow $\langle f, 1_X \rangle: X \to 0 \times X$ has a retraction given by projection $\pi: 0 \times X \to X$. So $X$ is a retract of $0 \times X$ which is initial, making $X$ initial itself.
Incidentally, we may now define the negation operator:
Let $\bot \coloneqq \chi_0: 1 \to \Omega$ be the classifying map of $0 \hookrightarrow 1$. Then negation is defined to be $\neg \coloneqq \chi_\bot: \Omega \to \Omega$. Alternatively, $\neg$ may be defined to be $(-) \Rightarrow \bot$.
For any two objects $X, Y$, a disjoint union of $X$ and $Y$ exists.
It’s enough to embed $X, Y$ disjointly into some $C$, since the union of the two monos in $Sub(C)$ would then be the requisite $Z$. The idea now is that if a disjoint union or coproduct $X+Y$ exists, then there’s a canonical isomorphism $P(X + Y) \cong P X \times P Y$. Since the singleton map
is monic, one thus expects to be able to embed $X$ and $Y$ disjointly into $P X \times P Y$. Since we can easily work out how all this goes in naive structural set theory, we just write out the formulas and check it works.
In detail, define $i_X : X \to P X \times P Y$ to be
where $\sigma_X$ is the singleton mapping and $\chi_0$ classifies $0 \hookrightarrow Y$; similarly, define $i_Y :Y \to P X \times P Y$ to be
Clearly $i_X$ and $i_Y$ are monic, so to show disjointness we just have to show that their pullback is initial. But their pullback is isomorphic to the cartesian product of the pullbacks of the diagrams
so it would be enough to show that each (or just one) of these pullbacks is initial, say the first. Suppose given a map $h: A \to X$ which makes the square
commute. The composite $A \to 1 \stackrel{\chi_0}{\to} P X$ classifies
which is initial. This must be the same subobject as classified by $\sigma_X h = \chi_\delta h$ (where $\delta: X \to X \times X$ is the diagonal), which is
An easy calculation shows this subobject to be the equalizer of the pair of maps
So this equalizer $E$ is initial. But notice $\langle 1, h \rangle: A \to A \times X$ equalizes this pair of maps. Therefore we have a map $A \to E \cong 0$; by Corollary 4, we infer $A \cong 0$. This applies to the case where $A$ is the pullback, so the pullback is initial, as was to be shown.
Any two disjoint unions of $X$ and $Y$ are canonically isomorphic.
Let us consider the disjoint union of $X$ and $Y$ constructed in Theorem 2 as the “standard” disjoint union, which we will denote as $X + Y$. Suppose $Z$ is another disjoint union of $X$ and $Y$, i.e., suppose $i: X \to Z$ and $j: Y \to Z$ are two disjoint embeddings such that their union in $Sub(Z)$ is all of $Z$. We construct a comparison isomorphism $X + Y \cong Z$; we begin by defining a map $\phi = \langle \phi_1, \phi_2 \rangle: Z \to P X \times P Y$, where $\phi_1: Z \to P X$ classifies $\langle i, 1_X \rangle: X \hookrightarrow Z \times X$, and $\phi_2: Z \to P Y$ classifies $\langle j, 1_Y \rangle: Y \hookrightarrow Z \times Y$. It is not hard to see that $\phi i: X \to P X \times P Y$ and $\phi j: Y \to P Y$ coincide with the embeddings $i_X: X \to P X$ and $i_Y: Y \to P Y$ constructed in Theorem 2, using the fact that $i, j$ are disjoint embeddings.
Our first task is to show $\phi$ is monic. If we introduce a pullback
then it suffices to show the pullback projections are equal: $p_1 = p_2$. Or equivalently, that if $k: E \hookrightarrow U$ is the equalizer of $p_1, p_2: U \rightrightarrows Z$, then the subobject $E$ of $U$ is all of $U$.
From $\phi p_1 = \phi p_2$ we have $\phi_1 p_1 = \phi_1 p_2$ and $\phi_2 p_1 = \phi_2 p_2$. So for example, from the first equation we have $(p_1 \times 1_X)^\ast \langle i, 1_X \rangle = (p_2 \times 1_X)^\ast \langle i, 1_X \rangle$ as subobjects of $U \times X$, say $\langle f, g \rangle: W \to U \times X$ as in a pullback
Because those subobjects are the same $\langle f, g \rangle: W \hookrightarrow U \times X$, we deduce that $p_1 f = i g = p_2 f$. Composing the square marked ‘pb’ with the pullback
we have a pullback
and from $p_1 j = p_2 j$ we deduce that $p_1^\ast(X) \hookrightarrow U$ factors through the equalizer $k: E \hookrightarrow U$. Similarly, arguing as above but replacing $\phi_1$ by $\phi_2$, we see $p_1^\ast(Y) \hookrightarrow U$ factors through $E \hookrightarrow U$. So the union $p_1^\ast(X) \cup p_1^\ast(Y)$ is contained in the subobject $E$ of $U$. However, the pullback function $p_1^\ast: Sub(Z) \to Sub(U)$ is a lattice map (it preserves the bottom element and finite unions because it has a right adjoint induced by $\Pi_{p_1}$; cf. local cartesian closure), and so we have
and hence $E = U$ as subobjects of $U$, forcing $p_1 = p_2$ as claimed, and so $\phi$ defines a subobject $Z \hookrightarrow P X \times P Y$.
The argument above shows that $\phi : Z \hookrightarrow P X \times P Y$ is an upper bound of $i_X : X \to P X \times P Y$ and $i_Y : Y \to P X \times P Y$ in $Sub(P X \times P Y)$. It follows that the join $X + Y$ constructed in Theorem 2 is contained in $\phi : Z \hookrightarrow P X \times P Y$, and hence can be regarded as the join of $X$ and $Y$ in $Sub(Z)$. But $Z$ is their join in $Sub(Z)$ by assumption of being a disjoint union, so the inclusion $X + Y \leq Z$ must be an equality. The proof is now complete.
The inclusions $i_X : X \to X + Y$, $i_Y : Y \to X + Y$ exhibit $X + Y$ as the coproduct of $X$ and $Y$.
Let $f : X \to B$, $g : Y \to B$ be given functions. Then we have monos
Now the operation $- \times B : Sub(C) \to Sub(C \times B)$ certainly preserves finite meets, and also preserves finite joins because it is left adjoint to $\forall_B : Sub(C \times B) \to Sub(C)$. Therefore this operation preserves disjoint unions; we infer that the monos
exhibit $(X + Y) \times B$ as a disjoint union of $X \times B, Y \times B$. Composing the monos of (1) and (2), we have disjoint embeddings of $X$ and $Y$ in $(X + Y) \times B$. Using Theorem 3, $X + Y$ is isomorphic to the join of these embeddings; this means we have an inclusion
whose restriction to $X$ yields $\langle i_X, f \rangle$ and whose restriction to $Y$ yields $\langle i_Y, g \rangle$. Hence $h : X + Y \to B$ extends $f$ and $g$. It is the unique extension, for if there were two extensions $h, h'$, then their equalizer would be an upper bound of $X, Y$ in $Sub((X + Y) \times B)$, contradicting the fact that $X + Y$ is the least upper bound. This completes the proof.
A topos is an extensive category.
Coproducts are disjoint by construction and are stable under pullback since pullback functors $f^\ast: \mathbf{E}/B \to \mathbf{E}/A$ preserve colimits by the remarks after Proposition 3.
In this section we prove that a topos admits coequalizers; in conjunction with Theorem 4, this shows that toposes are finitely cocomplete.
But the real motif of this section is the Barr exactness of a topos, which is effectively summarized by a Galois correspondence between equivalence relations on an object $Y$ and regular epimorphisms out of $Y$, stable under pullback. We begin with a few preliminary observations, valid for any category $\mathbf{E}$.
For an object $Y$, let $Rel_Y$ be the class of subobjects of $\langle p_1, p_2\rangle: A \hookrightarrow Y \times Y$. As usual, this class is the partial order reflection of the preorder (thin category) given by the full subcategory of $\mathbf{E}/Y \times Y$ whose objects consist of monomorphisms into $Y \times Y$.
Let $Quot_Y$ be the class of quotients = co-subobjects of $Y$, that is to say equivalence classes of epis $h: Y \twoheadrightarrow Z$. This class is the partial order reflection of the preorder given by the full subcategory of the co-slice $Y/\mathbf{E}$ whose objects consist of epimorphisms out of $Y$ (so $(p: Y \twoheadrightarrow P) \leq (q: Y \twoheadrightarrow Q)$ if there is a map $f: P \to Q$ such that $f \circ p = q$; such an $f$ is unique and necessarily an epi).
There is a fundamental relation between $Rel_Y$ and $Quot_Y$, which we denote as $\perp$, where by definition $\langle p_1, p_2\rangle \perp h$ iff $h p_1 = h p_2$. For $h \in Quot_Y$, let $h^\wedge$ denote the class of $R \in Rel_Y$ such that $R \perp h$. A kernel pair of $h$, if it exists (as it certainly will in any category with pullbacks, for example a topos), is the maximal element of $h^\wedge$. It is denoted $\ker(h)$.
Similarly, for $R = \langle p_1, p_2 \rangle \in Rel_Y$, let $R^\vee$ denote the class of $h \in Quot_Y$ such that $R \perp h$. A coequalizer of the pair $(p_1, p_2)$, if it exists, is the minimal element of $R^\vee$. It is denoted $coeq(p_1, p_2)$ or $quot(R)$.
If kernel pairs and coequalizers exist, then we get a monotone Galois connection between $Rel_Y$ and $Quot_Y$, where $quot: Rel_Y \to Quot_Y$ is left adjoint to $\ker: Quot_Y \to Rel_Y$. That is to say, $quot \langle p_1, p_2 \rangle \leq h$ iff $\langle p_1, p_2 \rangle \leq \ker(h)$, since both sides are equivalent to $h p_1 = h p_2$.
The goal of this section is to show that in order to prove that a topos has coequalizers, it suffices to construct quotients of equivalence relations.
Let $f: X \to Y$ be a map with image $im(f) \hookrightarrow Y$. Then the map $q: X \to im(f)$ through which $f$ factors is epic.
Suppose given $g, h: im(f) \to Z$ such that $g q = h q$. Then $q$ (and thus also $f$) factors through their equalizer, a subobject of $im(f)$. Since $im(f)$ is the smallest subobject through which $f$ factors, the equalizer must be all of $im(f)$, and therefore $g = h$.
In combination with Corollary 2, we derive the following consequence.
Every morphism $f: X \to Y$ has a unique epi-(regular) mono factorization, given by the image factorization $X \to im(f) \hookrightarrow Y$.
For any parallel pair $f, g: X \rightrightarrows Y$, let $\langle p_1, p_2 \rangle: R \hookrightarrow Y \times Y$ be the image of $\langle f, g \rangle: X \to Y \times Y$. Then for any $h: Y \to Z$, we have $h f = h g$ iff $h p_1 = h p_2$.
The map $\langle f, g \rangle$ factors as an epi $\pi: X \to R$ followed by the mono $\langle p_1, p_2 \rangle$. Thus $f = p_1 \pi$ and $g = p_2 \pi$, so obviously $h p_1 = h p_2$ implies, hitting both sides with $\pi$, that $h f = h g$. Conversely, $h f = h g$ implies $h p_1 \pi = h p_2 \pi$ which implies $h p_1 = h p_2$ since $\pi$ is epic.
The coequalizer of the pair $(f, g)$ is the coequalizer of its image pair $(p_1, p_2)$, provided that either coequalizer exists.
The coequalizer of either would be the minimal element in the class of those $h \in Quot_Y$ such that $h f = h g$, equivalently $h p_1 = h p_2$.
The next proposition provides a kind of dual form of the preceding proposition and its corollary.
If $f: Y \to W$ has epi-mono factorization $Y \stackrel{q}{\twoheadrightarrow} Q \stackrel{i}{\rightarrowtail} W$, then $\ker(f) = \ker(q)$.
For $R = \langle p_1, p_2 \rangle \in Rel_Y$, we have $q p_1 = q p_2$ iff $i q p_1 = i q p_2$ since $i$ is monic, so $q p_1 = q p_2$ iff $f p_1 = f p_2$. The kernel pair of either $q$ or $f$ would be the maximal such relation $R$.
For a relation $R = \langle p_1, p_2 \rangle \in Rel_Y$, let $Eq(R) = \langle \pi_1, \pi_2 \rangle$ be the minimal equivalence relation containing $R$, i.e., the intersection of all equivalence relations containing $R$. Then the coequalizer of $(p_1, p_2)$ equals the coequalizer of $(\pi_1, \pi_2)$, assuming either exists.
Clearly, from $R \leq Eq(R)$ we have that $h\pi_1 = h\pi_2$ implies $h p_1 = h p_2$. In the other direction, from $h p_1 = h p_2$ we have that $ker(h)$ is an equivalence relation containing $R$; thus $E \leq ker(h)$ since $E$ is the smallest equivalence relation containing $R$. This immediately implies that $h \pi_1 = h \pi_2$.
Therefore $\langle p_1, p_2 \rangle^\vee = \langle \pi_1, \pi_2 \rangle^\vee$. A coequalizer would be the minimal element of either class.
This proposition shows that to construct coequalizers in a topos, it suffices to construct quotients of equivalence relations.
If $\langle p_1, p_2 \rangle: E \to Y \times Y$ is an equivalence relation, with classifying map $\chi_E: Y \to P Y$, then $\chi_E \circ p_1 = \chi_E \circ p_2$.
This is equivalent to the statement that $(p_1 \times 1)^\ast E = (p_2 \times 1)^\ast E$ as subobjects of $E \times Y$, as in the pullback
Since this is a statement in finite limit logic, it can be proven conceptually by a Yoneda lemma argument where it suffices to check it in the case $\mathbf{E} = Set$, where $(p_1 \times 1)^\ast E = (p_2 \times 1)^\ast E$ boils down to checking
which of course follows easily from the axioms for equivalence relations.
If this isn’t already convincing, then the formal proof may be extracted just by chasing through the Yoneda lemma. In slightly greater detail: $P_1 \coloneqq (p_1 \times1)^\ast E$ as a subobject of $Y \times Y \times Y$ is the limit of the diagram
where we may use $v, w, y$ as generic symbols to label the limit cone component $\langle v, w, y \rangle: P_1 \to Y \times Y \times Y$ (the other components are then uniquely determined, by monicity of $i$). The limit condition simply means $\langle v, w \rangle: P_1 \to Y \times Y$ and $\langle v, y \rangle: P_1 \to Y \times Y$ both factor through $i: E \hookrightarrow Y \times Y$; we may abbreviate these conditions as “$(v, w) \in E$ and $(v, y) \in E$”. Similarly, $P_2 \coloneqq (p_2 \times 1)^\ast E$ as a subobject of $Y \times Y \times Y$ is defined similarly as a limit just by replacing $\pi_{13}$ in the diagram above by $\pi_{23}$; the corresponding limit condition abbreviates to “$(v, w) \in E$ and $(w, y) \in E$”, and it becomes clear how the desired statement follows from the definition of internal equivalence relation (partial equivalence relations, dropping the reflexivity axiom, would be enough).
By this proposition, we have that $E \leq ker(\chi_E)$ for equivalence relations $E$.
Equivalence relations in a topos are effective.
The claim is that an equivalence relation $i: E \hookrightarrow Y \times Y$ coincides with the kernel pair of its classifying map. We have already seen from Proposition 11 that $E \leq \ker(\chi_E)$ in $Sub(Y \times Y)$.
To get $\ker(\chi_E) \leq E$, we may again appeal to the heuristic of arguing in finite limit logic where it suffices to check the claim for $\mathbf{E} = Set$. The claim is that if $\chi_E(u) = \chi_E(v)$ for $u, v$ in $Y$, then $(u, v) \in E$. The hypothesis $\chi_E(u) = \chi(v)$ says
where by reflexivity, $u$ is a member of the right side, hence of the left side: $(v, u) \in E$. But then by symmetry, $(u, v) \in E$.
Again we may chase through the Yoneda lemma to extract a formal elementary proof. Let $u, v$ be the generic such pair, i.e., let $\langle u, v \rangle: P \hookrightarrow Y \times Y$ denote the kernel pair of $\chi_E$ as in the pullback
so that by commutativity of the diagram we have $(u \times 1_Y)^\ast E = (v \times 1)^\ast E$ as in the pullback
Then $\langle u, u \rangle = (u \times 1) \circ \langle 1, u \rangle: P \to Y \times Y$ factors through $i$ by reflexivity. Thus $\langle 1, u \rangle: P \to P \times Y$ induces a map $P \to (u \times 1)^\ast E = (v \times 1)^\ast E$, meaning $(v \times 1)\circ \langle 1, u \rangle = \langle v, u \rangle$ factors through $i: E \to Y \times Y$. By symmetry, $\langle u, v \rangle: P \to Y \times Y$ factors through $i: E \to Y \times Y$, i.e., $\ker(\chi_E) \leq E$.
In this subsection we construct quotients of equivalence relations.
Let $f: Y \to W$ be a map, and let $f^\ast$ denote the opposite relation $\langle f, 1_Y \rangle: Y \hookrightarrow W \times Y$, with characteristic map $\chi_{f^\ast}: W \to P Y$. Then the composite
is the characteristic map of $ker(f) \hookrightarrow Y \times Y$.
This follows immediately on examination of the composite pullback diagram
If $q: Y \to Q$ is epic, then $\chi_{q^\ast}: Q \to P Y$ is monic.
Given $f, g \in \hom(A, Q)$, if we suppose that $\chi_{q^\ast} \circ f = \chi_{q^\ast} \circ g$, then pulling back the subobject $\langle q, 1_Y \rangle: Y \to Q \times Y$ along either $f$ or $q$ results in the same subobject of $A \times Y$, say
where the map $r: P \to A$ is the pullback of $q: Y \to Q$ along $f$. Since $q$ is epic, so is $r$ (this uses the fact that pulling back is a left adjoint; cf. local cartesian closure). But now we see $f \circ r = q \circ p = g \circ r$, which implies $f = g$ since $r$ is epic.
If $\langle p_1, p_2 \rangle: E \to Y \times Y$ is an equivalence relation, then the epimorphism $q: Y \to im(\chi_E)$ in the image factorization of $\chi_E: Y \to P Y$ is the coequalizer of the pair $(p_1, p_2)$.
Put $Q = im(\chi_E)$, i.e., let $Y \stackrel{q}{\to} Q \stackrel{i}{\rightarrowtail} P(Y)$ be the unique epi-mono factorization of $\chi_E$. Suppose $h: Y \to Z$ satisfies $h p_1 = h p_2$. We have $E = \ker(\chi_E)$ by Theorem 6 (effectiveness of equivalence relations), and $\ker(\chi_E) = \ker(q)$ by Proposition 9. Therefore, for generalized elements $y, y': U \to Y$ we have $q y = q y'$ iff $\langle y, y' \rangle$ factors through $i: E \hookrightarrow Y \times Y$, which implies $\langle q, h \rangle y = \langle q, h \rangle y'$. On the other hand, $\langle q, h \rangle y = \langle q, h \rangle y'$ trivially implies $q y = q y'$. Thus $E = \ker(\chi_E) = \ker(q) = \ker(\langle q, h\rangle)$.
So, if $f = \langle q, h \rangle: Y \to Q \times Z$ has epi-mono factorization $Y \stackrel{q'}{\twoheadrightarrow} Q' \stackrel{i'}{\rightarrowtail} Q \times Z$, then $ker(f) = \ker(q')$ (Proposition 9 again). Now $\chi_{(q')^\ast}: Q' \to P(Y)$ is monic by Lemma 5, and $\chi_E = \chi_{\ker(f)} = \chi_{\ker(q')}$ has epi-mono factorization $Y \stackrel{q'}{\twoheadrightarrow} Q' \stackrel{\chi_{(q')^\ast}}{\rightarrowtail} P Y$ by Lemma 4. By uniqueness of epi-mono factorizations, we see now that $Q$ equals $Q'$ as quotients of $Y$, and the map $i': Q \to Q \times Z$ must be of the form $\langle 1_Q, r \rangle: Q \to Q \times Z$. The map $r: Q \to Z$ is then the required mediator through which $h$ factors, i.e., $h = r \circ q$. There is of course just one such map because $q$ is epic.
A topos is a regular category. (It is therefore also Barr-exact, since by Theorem 6 all equivalence relations are effective.)
It is more than enough to observe that a topos is finitely complete and has coequalizers, and pullback functors preserve coequalizers (by local cartesian closure). A coequalizer of a pair of maps $f, g: X \rightrightarrows Y$ is constructed as $quot(Eq(im \langle f, g \rangle))$, using Corollary 6 and Proposition 10 and Theorem 7.
A topos is a Heyting pretopos.
As we have seen, a topos is an extensive Barr-exact category whose subobject lattices carry Heyting algebra structures that are preserved by pullback functors.
It follows from the general theory of pretoposes that all epimorphisms in a topos are regular. However, we can also prove this easily from the theory developed above.
Every epimorphism $q: A \to B$ in a topos is a regular epimorphism, indeed an effective quotient.
Theorem 7 together with Lemma 4 and Lemma 5 shows that $q$ is the coequalizer of its kernel pair $\ker(q): E \rightrightarrows A$.
Cf. the ridicule of so-called ‘categorical dys-foundations’ by various logicians at FOM (Foundations of Mathematics list, esp. circa 1997-1998; see e.g. here). This is a pity, since the developments surrounding topos theory have contributed genuine insights into the algebra of logic and set theory. ↩
Colimits in the basic examples of topos theory, such as Grothendieck toposes, are of course usually much easier to grasp. ↩
Developing the internal logic of a topos – which is not at all trivial! – provides an excellent way to appreciate what first-order logic is about from the standpoint of categorical algebra, being “characterized axiomatically by specific interlocking systems of adjoint functors” (Lawvere). ↩
None of this may be new to the real experts; in fact I think there’s a good chance that the development pursued here has occurred to them. Still, a published account of it is nonexistent, as far as I know. ↩
This is reminiscent of the “stack semantics” of a topos, pioneered by Mike Shulman. ↩
‘Heuristic’ in the original sense of “discovery method” (faithful to its Greek etymology), i.e., a general method which guides the user to discover solutions. ↩