Todd Trimble An elementary approach to elementary topos theory




There are many students and researchers outside of category theory who have heard about categorical approaches to “mathematical foundations”, and eager to learn more. Perhaps they heard for instance about “categorical set theory” such as Lawvere’s ETCS: the Elementary Theory of the Category of Sets. Or, they may have heard that topos theory can be seen as “generalized set theory”, and heard about some of its selling points. For example, how categorical set theory is structural, i.e., is set up so that the chief concepts are invariant with respect to isomorphism. From this point of view, the “stuff” that sets are made of (elements, elements of elements, …) is not important, but how sets are interconnected through functions and relations is important, as is the way sets are used by mathematicians which is through universal properties. Another selling point is that topos theory is said to be more flexible and general than ordinary set theory: all the familiar forms of set theory such as ultrapowers, forcing extensions, etc. form toposes, but there are many other toposes of quite different character, worlds of set-like objects tailored more directly to the intuitions and language native to particular problem environments: petit toposes like sheaves over a space, or classifying toposes in algebraic geometry for expressing things like etale cohomology, synthetic differential geometry for nilpotent infinitesimals and function spaces, realizability toposes for recursive function theory, linguistically constructed logical toposes, … This is exciting, and consequently there is a demand and market for introductory accounts.

However, such readers may be dismayed by the technical demands the standard introductory accounts make. A student who turns for example to the (deservedly) well regarded graduate text Sheaves in Geometry and Logic by Mac Lane and Moerdijk, to learn about “First properties of elementary topoi” (Chapter IV), may be surprised by how much categorical background is actually needed. Even to construct very basic things like finite coproducts and quotients, the reader is expected to become familiar with properties of monadic functors, the crude monadicity theorem, Beck-Chevalley conditions, and other technical methods and results – not the type of thing one usually associates with “set theory”. One could understandably walk away believing that topos theory is something only for specialists, nothing at all like set theory but some abstruse and intimidating categorical development. And to set theorists who were told that topos theory and various offshoots offer an alternative and viable categorical approach to set-theoretic foundations, this situation may seem totally out of hand, even ridiculous.1

Even for professional category theorists, the connection between (to them) standard techniques and the naive set-theoretic principles we grew up may seem pretty remote. Coproducts are basically just disjoint unions, right? Do we really need recondite and non-direct arguments to prove their existence in an elementary topos? The standard approach (based on elegant but slightly tricky arguments first published by Paré) is to show that the contravariant power (i.e., power set, power object) functor on a topos is monadic. The monad itself is the double power functor, and the monadicity theorem requires consideration of two or three iterations of the monad, so prima facie up to five or six applications of the power functor are involved in the construction of finite colimits. Meanwhile, in ordinary mathematics, it’s somewhat rare to go beyond three iterations of power set: our brains are not that well equipped to visualize such higher-order constructions, and manipulations become more or less exercises in formal reasoning.

In other words, even after one is convinced that the constructions work, it doesn’t give a very “common sense” way to view coproducts or coequalizers. A likely result is that one treats colimits in elementary toposes as sitting in a black box: they’re there, but almost impossible to see directly.2 To get beyond that point, one has still more work to do (and frequently the textbooks don’t go into this). Surely an uncomfortable situation, particularly in a classroom setting.

My PhD thesis adviser was Myles Tierney; he and Bill Lawvere founded elementary topos theory. Myles and I would sometimes talk about such things: he longed for a “naive topos theory” just as viable and available for use by working mathematicians as ordinary naive set theory. On one occasion – either in his office or during a short elevator ride to his office in Hill Center – he outlined for me a program for obtaining all the exactness properties of a topos (i.e., the finitary hypotheses in Giraud’s theorem characterizing Grothendieck toposes), in a manner more elementary and natively set-theoretic than the accounts I knew from graduate textbooks. This conversation, which occurred decades ago, has been at the back of my mind ever since.

In memory of this fine gentleman and scholar, my goal is to try to reproduce and fill in details of Myles’s brief outline to me. (It was, in my possibly faulty memory, very brief indeed – it took him perhaps as few as 90 seconds to get it across.) I don’t know if the approach I will take here is exactly what he had in mind, or whether it will have more appeal for many who have struggled with the standard treatments. But if nothing else, I hope to prove existence of colimits in a topos (as well as all the other basic properties) in a way closer to common sense reasoning, eschewing technical monadicity theorems and such, and using formulas and constructions that appeal directly to naive set theory intuitions.

These notes will nevertheless give a self-contained account of all of the standard categorical properties of an elementary topos, summarized by saying that a topos is a locally cartesian closed Heyting pretopos (in particular an effective regular extensive category with finite colimits, covering all those conditions of Giraud’s theorem that are “elementary”). To keep this account within a manageable number of pages, I simply must make robust use of the language of categorical algebra. Thus, I make no apology for bandying about properties of adjoint functors or cartesian closed categories or other core concepts of category theory.

A central result of elementary topos theory, and certainly of this account, is the slice theorem: if E\mathbf{E} is a topos, then so is the slice category E/X\mathbf{E}/X over any object XX. (Freyd and Scedrov dub this “the fundamental theorem of topos theory”.) These notes have been arranged so that, save a few preliminary observations, almost everything flows out from this one result. The main constructions for the slice, particularly of power objects, can be justified intuitively without much pain, and for a teacher of an introductory course I think that would be reasonable to go into. The proof of the slice theorem, while not nightmarish, is somewhat technical and so I have relegated it to an appendix. It is not necessary to know the proof to understand and appreciate the consequences.

(Actually what I present is a more powerful but more technical-looking result: that the category of coalgebras of a pullback-preserving comonad on a topos is again a topos, with a weaker hypothesis than “lex” or finite-limit-preserving comonad. This neatly combines the slice theorem with the lex comonad theorem, with hardly more effort than either, and I am convinced this is a good way to go for future topos theory textbooks. However, for some reason a published account of the pullback-preserving comonad result is virtually nonexistent; the result is briefly mentioned in Johnstone’s “Elephant”) but not proven there. On a personal note: this result was something Myles and I worked out for fun when I was a grad student, around 1987 or so, before finding out it was already known to some people at least.)

The main mathematical consequence of the slice theorem is the local cartesian closure of toposes, and more particularly the stability of colimits under pullback, an enormously important fact. More philosophically, the moral of the slice theorem is it provides a theoretical basis and methodological bridge for crossing from naive set theory over to topos theory, and particularly for developing the internal logic of a topos.

Generally, the method starts with writing down set theory propositions in terms of ordinary elements (formulated categorically, as points x:1Xx: 1 \to X). Now, the slice theorem guarantees that whatever is provable for a general topos EE/1\mathbf{E} \simeq \mathbf{E}/1 is also provable for a slice E/U\mathbf{E}/U, and this allows us to relativize statements and constructions: statements formulated in terms of ordinary points 1X1 \to X may be restated and proven for more general “points” UXU \to X over arbitrary domains UU. Finally, by instantiating at the “generic point” 1 X:XX1_X: X \to X, a maneuver reminiscent of the Yoneda lemma, one arrives at the “right” corresponding construction or proposition for topos theory, one that is stable and internally correct.

With these points in mind, we may begin our account proper, starting with the systematic construction of the internal logic.3 At first, the “logic” is quite parsimonious: it’s just equational logic, or logic expressible in terms of products and equalizers. Bit by bit one constructs internal logical operators in a topos, starting with the “negative” operators \wedge, \Rightarrow, and \forall. This gives us enough to write down a formula for internal intersection \bigcap, a crucial step. With this step, we see that lattices of subobjects are internally complete (closed under arbitrary internal meets), and therefore also internally cocomplete (closed under internal joins), and this puts us on the path to constructing logical operators on the “positive formula” side: \vee, \exists, \bigcup.

Finally we put the internal logic to use to construct finite colimits. The main problem is figuring out how to exploit “local colimits” (joins in subobject lattices) to create “global colimits”. For example, coproducts are obtained by figuring out how to create disjoint unions. (This approach, together with stability, makes automatic the fact that toposes are extensive categories.) For coequalizers, the key turns out to be constructing first the image factorization, whence we proceed essentially by a series of familiar set-theoretic maneuvers that guide us towards constructing coequalizers of equivalence relations. We’ll save the punch lines for later, but the fact that colimits are stable leads quickly to the conclusion that toposes are effective regular categories, as a natural by-product, and with this we may conclude our account.

In theory this account could be read by someone with only slight prior exposure to topos theory, or even just a decent course in basic category theory, but the readers I am most interested in reaching are those who already have some experience with topos theory as student and perhaps also as lecturer4, and who share my feeling that there is indeed something formidable about the standard developments, and that a different approach is worth a shot. I’m hoping this article could even serve as a detailed source of working notes for someone who is considering teaching elementary topos theory.

As intimated above, the development I have undertaken here owes a great deal to my PhD adviser Myles Tierney, to whom I dedicate this work. I also thank André Joyal, Tom Leinster, and David Roberts for expressing their interest in this approach, which in the end encouraged me to finish writing this up.


We adopt the power-set or power-object definition of elementary topos:


A topos is a finitely complete category E\mathbf{E} such that for every object AA there is another object PAP A and a “relation” APA×A\ni_A \hookrightarrow P A \times A that is universal: every relation j:RB×Aj: R \hookrightarrow B \times A is a pullback

R A j B×A χ j×1 A PA×A\array{ R & \to & \ni_A \\ _\mathllap{j} \downarrow & & \downarrow \\ B \times A & \underset{\chi_j \times 1_A}{\to} & P A \times A }

for some unique map χ j:BPA\chi_j: B \to P A, called the characteristic map of the relation jj.

Intuitively, the power-set definition of topos concerns the interplay between categories of sets and relations and the category of sets and functions, which maps faithfully into sets and relations: it says that this faithful “inclusion” functor

i:ERel(E)i: \mathbf{E} \hookrightarrow Rel(\mathbf{E})

has a right adjoint PP. Here “relations” from BB to AA, denoted R:BAR: B \nrightarrow A, are by definition subobjects of B×AB \times A, i.e., isomorphism classes of monomorphisms into B×AB \times A. The inclusion functor ii takes a function f:XYf: X \to Y to its corresponding functional relation i(f)=(X1 X,fX×Yi(f) = (X \stackrel{\langle 1_X, f \rangle}{\hookrightarrow} X \times Y).

With the hindsight (foresight?) that toposes are indeed regular categories, with the consequence that relations may be decently composed and Rel(E)Rel(\mathbf{E}) forms a category, this observation becomes a precise definition: a topos is a regular category such that i:ERel(E)i: \mathbf{E} \to Rel(\mathbf{E}) has a right adjoint. The components of the counit of iPi \dashv P are relations of the form

A:iPAA\ni_A: i P A \nrightarrow A

with the intent that any relation R:i(B)AR: i(B) \nrightarrow A factors uniquely through A\ni_A via the underlying relation of a function χ R:BPA\chi_R: B \to P A:

(i(B)RA)=(i(B)i(χ R)i(PA) AA).(i(B) \stackrel{R}{\nrightarrow} A) = (i(B) \stackrel{i(\chi_R)}{\to} i(P A) \; \stackrel{\ni_A}{\nrightarrow} A).

Exploring this further, the unit of the adjunction APiAA \to P i A, which is the classifying map of the identity relation 1 A:AA=δ A:AA×A1_A: A \nrightarrow A = \delta_A: A \to A \times A, is the singleton function σ A:APA\sigma_A: A \to P A. Looking further ahead, the multiplication of the monad PP (now suppressing ii‘s) is the classifying map PPAPAP P A \to P A of the composite relation:

PPA PaPA AAP P A \stackrel{\ni_{P a}}{\nrightarrow} P A \stackrel{\ni_A}{\nrightarrow} A

Set-theoretically, the composite relation is {(,a): F:PAFFa}\{(\mathcal{F}, a): \exists_{F: P A} \mathcal{F} \ni F \wedge F \ni a\}; its classifying map is the union operation A:PPAPA\bigcup_A: P P A \to P A.


Even without the regularity assumption, we can interpret the power object axiom as an elementary (i.e., SetSet-free) reformulation of a statement saying that functors of type Rel(i,A):E opSetRel(i-, A): \mathbf{E}^{op} \to Set are representable.


All the basic operations pertaining to Algebraic Set Theory – power set, membership, union, singleton – thus find their places as structural data of the adjunction iPi \dashv P; thus our definition of topos can be seen as providing an explanatory framework for their introduction. It’s a slightly different set-up in AST since there a ZF universe VV is a large poset in which the singleton operation is of type s:VVs: V \to V, and the membership relation P smallVVP_{small} V \nrightarrow V is derived from ss and the partial ordering of VV. The comparison and difference is probably worth exploring further, but we won’t carry this out here.

The power P1P 1 of the terminal object 11 plays a special role; it is familiarly known as the subobject classifier Ω\Omega.


The domain 1\ni_1 of the universal relation 1P1×1\ni_1 \hookrightarrow P 1 \times 1 is itself terminal.


A map X 1X \to \ni_1, that is, a map χ:XP(1)\chi : X \to P(1) which factors through 1P(1)\ni_1 \hookrightarrow P(1), corresponds exactly to a pulled-back subobject χ *( 1)X\chi^\ast(\ni_1) \subseteq X that contains all of XX (i.e., the subobject 1 X:XX1_X : X \subseteq X). Since the only such subobject is 1 X1_X, there is exactly one map X 1X \to \ni_1.

Thus the universal subobject is of the form t:1Ωt: 1 \to \Omega. An alternative way of stating the universality is that every monic i:AXi: A \hookrightarrow X is the equalizer of a diagram

1 ! t X χ i Ω\array{ & & 1 \\ & \mathllap{!} \nearrow & \downarrow \mathrlap{t} \\ X & \underset{\chi_i}{\to} & \Omega }

for some unique choice of map χ i:XΩ\chi_i: X \to \Omega. This has several immediate consequences:


Every monomorphism in a topos is regular (an equalizer of some parallel pair).


A topos is a balanced category.


Every epic mono is an epic equalizer, and such must be an isomorphism.


Any two epi-mono factorizations of a map (if such exist) are canonically isomorphic.


Suppose ip=jqi p = j q where p,qp, q are epic and i,ji, j are monic. Since jj is regular, it is the equalizer of some parallel pair f,gf, g as in the diagram

A p B q i C j D gf E,\array{ A & \stackrel{p}{\to} & B & & \\ \mathllap{q} \downarrow & & \downarrow \mathrlap{i} & & \\ C & \underset{j}{\to} & D & \stackrel{\overset{f}{\to}}{\underset{g}{\to}} & E, }

so that fip=fjq=gjq=gipf i p = f j q = g j q = g i p, whence fi=gif i = g i since pp is epic, whence ii factors through jj as jj is the equalizer: i=jki = j k for some k:BCk: B \to C. Then also kp=qk p = q since jkp=ip=jqj k p = i p = j q and jj is monic. We have that kk is monic since ii is, and kk is epic since qq is. Thus kk is an isomorphism.

Cartesian closure

There is of course another familiar definition of elementary topos, in which cartesian closure and the subobject classifier Ω\Omega are basic. To get from Definition to the other definition involves some slightly fiddly details (the other direction is easier), but basically the cartesian closure comes about in two steps:

In particular, we have a natural isomorphism PA(P1) A=Ω AP A \cong (P 1)^A = \Omega^A. Read this way, we have contravariance in the argument AA, i.e., APAA \mapsto P A is the object part of the contravariant power object functor.

For this we need to show that the singleton map σA\sigma A is monic.


The map σA=χ δA:APA\sigma A = \chi_{\delta A}: A \to P A is monic.

This is obvious from the point of view that the unit σ:1Pi\sigma: 1 \to P i of an adjunction iPi \dashv P is monic whenever the left adjoint ii is faithful. The proof below is a more formal working out.


Given fhom(X,A)f \in \hom(X, A), the map χ δAf:XPA\chi_{\delta A} \circ f: X \to P A classifies the subobject of X×AX \times A obtained by pulling back δ:AA×A\delta: A \to A \times A along f×1 Af \times 1_A. This is

X f A 1 X,f δ X×A f×1 A A×A.\array{ X & \stackrel{f}{\to} & A \\ _\mathllap{\langle 1_X, f \rangle} \downarrow & & \downarrow_\mathrlap{\delta} \\ X \times A & \underset{f \times 1_A}{\to} & A \times A. }

Thus, if for f,ghom(X,A)f, g \in \hom(X, A) we have χ δAf=χ δAg\chi_{\delta A} \circ f = \chi_{\delta A} \circ g, then the subobjects coincide: 1 X,f=1 X,g:XX×A\langle 1_X, f \rangle = \langle 1_X, g \rangle: X \to X \times A. This forces f=gf = g.

Slice theorem

A major qualitative distinction between the vanilla category of sets SetSet and a general topos E\mathbf{E} is that sets XX are determined by their points 1A1 \to A (and functions are determined by what they do to points). For a general topos E\mathbf{E}, this is much too restrictive; for example, sheaves of sets generally form a topos, but a general sheaf is hardly determined by its points 1A1 \to A (which are global sections of AA). In fact, AA may be a highly nontrivial sheaf and yet not have any such points at all: consider for example the principal /(2)\mathbb{Z}/(2)-bundle over S 1S^1 associated with the Möbius strip, living in Sheaves(S 1)Sheaves(S^1), which has no global sections.

Instead, in accordance with the Yoneda lemma, objects AA are determined up to isomorphism by maps XAX \to A into AA, which one sometimes calls “generalized points”. Thus, instead of constant points 1A1 \to A, we should consider variable points XAX \to A with variation over other domains XX. But in fact these can be reconstituted again as ordinary points 1A1 \to A if we switch our point of view to a universe of “variable objects” E/X\mathbf{E}/X. That is to say: an object AA of E\mathbf{E} can be reinterpreted along a map X *:EE/XX^\ast: \mathbf{E} \to \mathbf{E}/X which sends AA to π:X×AX\pi: X \times A \to X. The generic point 11 is thereby interpreted in E/X\mathbf{E}/X as X *1=1 X:XXX^\ast 1 = 1_X: X \to X, and then a point or global section X *1X *AX^\ast 1 \to X^\ast A in E/X\mathbf{E}/X is exactly tantamount to a generalized point XAX \to A back in E\mathbf{E}.

So we can, in topos theory, perform arguments and constructions naively as if with “points” 1A1 \to A, provided that we consider them uniformly across all slices E/X\mathbf{E}/X.5

So our first major task is to prove the slice theorem, which says that if E\mathbf{E} is a topos, then also the slice E/X\mathbf{E}/X of objects “varying over XX” is a topos, and moreover that the interpretation functor X *:EE/XX^\ast: \mathbf{E} \to \mathbf{E}/X preserves all conceivable topos structure, i.e., is a “logical functor” and has adjoints on both sides (so that any limits or colimits that E\mathbf{E} has are preserved under reinterpretation).

Notice that this point of view signals a major departure from ZFC since, for example, a slice of a model of ZFC (more precisely, a category SetSet formed from a model of ZFC) is not itself a model of ZFC; rather it is some Boolean-valued model. Of course, toposes are far more general still than mere slices of ZFC models, but the takeaway here is that the notion of topos is closed with respect to moving to universes of greater variability, while still retaining fundamental set-like features.

Intuitive view

It is not hard to guess what the topos structure of E/X\mathbf{E}/X looks like, based on the special case E=Set\mathbf{E} = Set. In this special case, an object f:YXf: Y \to X may be regarded as equivalent to a function XSetX \to Set which sends xXx \in X to the fiber Y x=f *({x})Y_x = f^\ast(\{x\}), and indeed Set/XSet/X is equivalent to the category Set XSet^X of functors XSetX \to Set (from XX as discrete category). Here Set XSet/XSet^X \simeq Set/X carries obvious pointwise-defined topos structure, so that the power object of f:YXf: Y \to X is the fiberwise power set xP(Y x)x \mapsto P(Y_x). Thus P(Y x)P(Y_x) is the fiber over xx of the power object.

Assembling these fibers into a total set P X(Y,f)XP_X(Y, f) \to X over XX, we may describe elements of P X(Y,f)P_X(Y, f) as pairs (xX,SP(Y))(x \in X, S \in P(Y)) such that SY x=f *(x)S \subseteq Y_x = f^\ast(x). This inclusion may be expressed as an equation: consider the function f,1:YX×Y\langle f, 1 \rangle: Y \to X \times Y; then the equation is

S=Y xS=f,1 *({x}×S)S = Y_x \cap S = \langle f, 1 \rangle^\ast (\{x\} \times S)

and so if we like, we can describe the elements of the power object P X(Y,f)P_X(Y, f) as fixed points (x,S)X×P(Y)(x, S) \in X \times P(Y) of the (idempotent) operator X×PYX×PYX \times P Y \to X \times P Y defined by the composite

X×PY δ×1 X×X×PY 1×ϕ X×P(X×Y) 1×P(f,1) X×PY (x,S) (x,x,S) (x,{x}×S) (x,f,1 *({x}×S)).\array{ X \times P Y & \stackrel{\delta \times 1}{\to} & X \times X \times P Y & \stackrel{1 \times \phi}{\to} & X \times P(X \times Y) & \stackrel{1 \times P(\langle f, 1 \rangle)}{\to} & X \times P Y \\ (x, S) & \mapsto & (x, x, S) & \mapsto & (x, \{x\} \times S) & \mapsto & (x, \langle f, 1 \rangle^\ast (\{x\} \times S)). }

Considered as a set P X(Y,f)XP_X(Y, f) \to X over XX, such a fixed point (x,S)(x, S) of course maps to xx.

This in fact is how we proceed below (except we generalize even further).

Coalgebras of pullback-preserving comonads

As mentioned in the introduction, the slice theorem and the theorem on lex (= left exact) comonads on a topos can be proved together in one fell swoop. We observe that the functor X×:EEX \times -: \mathbf{E} \to \mathbf{E} carries a comonad structure with comultiplication induced from δ:XX×X\delta: X \to X \times X and counit induced from !:X1!: X \to 1. A coalgebra of this comonad is a map f,1 A:AX×A\langle f, 1_A \rangle: A \to X \times A, tantamount to a single datum f:AXf: A \to X, and coalgebra maps (A,f)(B,g)(A, f) \to (B, g) boil down to maps h:ABh: A \to B making the obvious triangle commute. In other words, the category of coalgebras of X×:EEX \times -: \mathbf{E} \to \mathbf{E} is equivalent to the slice E/X\mathbf{E}/X.


Or to say it another way, the forgetful functor Σ X:E/XE\Sigma_X: \mathbf{E}/X \to \mathbf{E} is comonadic, with right adjoint the functor X *:EE/XX^\ast: \mathbf{E} \to \mathbf{E}/X that takes an object YY to π 1:X×YX\pi_1: X \times Y \to X. One consequence is that Σ X:E/XE\Sigma_X: \mathbf{E}/X \to \mathbf{E} preserves and reflects colimits.

This comonad X×X \times - is not left exact, but it does preserve pullbacks. The theorem we prove is this:


If G:EEG: \mathbf{E} \to \mathbf{E} is a pullback-preserving comonad acting on a topos E\mathbf{E}, then the category of coalgebras E G\mathbf{E}_G is also a topos.


It suffices to produce finite limits and power objects in E G\mathbf{E}_G. Pullbacks in E G\mathbf{E}_G are created by the forgetful functor E GE\mathbf{E}_G \to \mathbf{E} because GG preserves pullbacks, and G1G 1 is terminal in E G\mathbf{E}_G if 11 is terminal in E\mathbf{E}, since the right adjoint G:EE GG-: \mathbf{E} \to \mathbf{E}_G preserves limits. So E G\mathbf{E}_G is finitely complete.

Since G:EEG: \mathbf{E} \to \mathbf{E} preserves pullbacks, it also preserves monos. In particular, if YPY×Y\ni_Y \hookrightarrow P Y \times Y denotes the universal relation, we have a chain of monos

(1)G( Y)G(PY×Y)GPY× G1GYGPY×GY G(\ni_Y) \hookrightarrow G(P Y \times Y) \cong G P Y \times_{G 1} G Y \hookrightarrow G P Y \times G Y

where the middle two objects are pullbacks of the diagram

GY G! GPY G! G1\array{ & & G Y \\ & & \downarrow \mathrlap{G !} \\ G P Y & \stackrel{G !}{\to} & G 1 }

in E\mathbf{E}. The composite of the monos in (1) names a subobject of GPY×GYG P Y \times G Y, which is classified by a map

ϕ:GPYPGY\phi: G P Y \to P G Y

in E\mathbf{E}.

Let δ:GGG\delta: G \to G G denote the comultiplication and ε:G1 E\varepsilon: G \to 1_\mathbf{E} the counit of GG, and suppose YY carries a GG-coalgebra structure η:YGY\eta: Y \to G Y. Define the object P GYP_G Y in E G\mathbf{E}_G to be the equalizer in E G\mathbf{E}_G of the following pair of coalgebra maps from the cofree coalgebra GPYG P Y to itself:

(2)GPY id GPY δPY GPη GGPY Gϕ GPGY\array{ G P Y & \stackrel{id}{\to} & G P Y \\ \mathllap{\delta P Y} \downarrow & & \uparrow \mathrlap{G P\eta} \\ G G P Y & \stackrel{G\phi}{\to} & G P G Y }

where PP is the contravariant power object functor (whence the direction Pη:PGYPYP \eta: P G Y \to P Y). We will show that P GYP_G Y is the power object of YY in E G\mathbf{E}_G.

So, let (X,θ:XGX)(X, \theta: X \to G X) be a coalgebra. We must show that coalgebra maps f:XP GYf: X \to P_G Y correspond precisely to subcoalgebras of the product X×YX \times Y in E G\mathbf{E}_G (whose underlying object in E\mathbf{E} is the pullback X× G1YX \times_{G 1} Y). The remainder of the proof will be proven with the help of two elementary lemmas.


Suppose G:EEG: \mathbf{E} \to \mathbf{E} is a comonad which preserves pullbacks, and let (S,γ:SGS)(S, \gamma: S \to G S) be a GG-coalgebra. Then a pair (k:RS,β:RGR)(k: R \to S, \beta: R \to G R) of arrows of E\mathbf{E} defines a subcoalgebra of SS precisely when kk is monic and the compatibility square

(3)R β GR k Gk S γ GS\array{ R & \stackrel{\beta}{\to} & G R \\ \mathllap{k} \downarrow & & \downarrow \mathrlap{G k} \\ S & \stackrel{\gamma}{\to} & G S }



First, GkG k is monic assuming kk is monic, since GG preserves pullbacks. Therefore β\beta is uniquely determined. We must show that β:RGR\beta: R \to G R satisfies the axioms for a coalgebra structure, i.e., that

(4)δRβ=GββεRβ=1 R. \delta R \circ \beta = G\beta \circ \beta \qquad \varepsilon R \circ \beta = 1_R.

Since kk is monic, the second equation follows from the equation

kεRβ=kk \circ \varepsilon R \circ \beta = k

but this equation holds since

kεRβ=εSGkβ=εSγk=1 Skk \circ \varepsilon R \circ \beta = \varepsilon S \circ G k \circ \beta = \varepsilon S \circ \gamma \circ k = 1_S \circ k

using in turn naturality of ε\varepsilon, the square (3), and the fact that SS is a coalgebra.

Again, GG preserves pullbacks, so GGkG G k is monic assuming kk is. So the first equation of (4) follows from

GGkδRβ=GGkGββG G k \circ \delta R \circ \beta = G G k \circ G\beta \circ \beta

and this equation holds from a series of equations

GGkδRβ = δSGkβ naturalityofδ = δSγk hypothesis = Gγγk Sisacoalgebra = GγGkβ hypothesis = GGkGββ hypothesis;functorialityofG\array{ G G k \circ \delta R \circ \beta & = & \delta S \circ G k \circ \beta & naturality\; of\; \delta \\ & = & \delta S \circ \gamma \circ k & hypothesis \\ & = & G \gamma \circ \gamma \circ k & S\; is\; \text{a}\; coalgebra \\ & = & G \gamma \circ G k \circ \beta & hypothesis \\ & = & G G k \circ G \beta \circ \beta & hypothesis;\; functoriality\; of\; G}

which completes the proof.


In the notation of Lemma , if k:RSk: R \to S is a monic coalgebra map, then the square

(5)R β GR k Gk S γ GS\array{ R & \stackrel{\beta}{\to} & G R \\ \mathllap{k} \downarrow & & \downarrow \mathrlap{G k} \\ S & \stackrel{\gamma}{\to} & G S }

is a pullback.


Suppose we have a commutative square

(6)X f GR g Gk S γ GS\array{ X & \stackrel{f}{\to} & G R \\ \mathllap{g} \downarrow & & \downarrow \mathrlap{G k} \\ S & \stackrel{\gamma}{\to} & G S }

First, we prove that ff equalizes the pair δR,Gβ:GRGGR\delta R, G\beta: G R \to G G R. We have a series of equations

GGkGβf=GγGkf=Gγγg=δSγg=δSGkf=GGkδRfG G k \circ G\beta \circ f = G\gamma \circ G k \circ f = G\gamma \circ \gamma \circ g = \delta S \circ \gamma \circ g = \delta S \circ G k \circ f = G G k \circ \delta R \circ f

using, in turn, (5) (and functoriality of GG), (6), the coassociativity on the coalgebra SS, (6) again, and naturality of δ\delta. Comparing the first and last terms of this series, and using the fact that GGkG G k is monic (because kk is monic and GG preserves pullbacks), we obtain δRf=Gβf\delta R \circ f = G\beta \circ f.

Thus ff factors through the equalizer β:RGR\beta: R \to G R of δR,Gβ:GRGGR\delta R, G \beta: G R \stackrel{\to}{\to} G G R. Write f=βpf = \beta \circ p. We then have

γkp=(5)Gkβp=Gkf=(6)γg\gamma \circ k \circ p \stackrel{(5)}{=} G k \circ \beta \circ p = G k \circ f \stackrel{(6)}{=} \gamma \circ g

and since γ\gamma is monic (because, e.g., εS\varepsilon S retracts it), we infer that kp=gk \circ p = g. Moreover, this pp is unique since kk is monic. Thus, we have shown that any square (6) factors uniquely, via the map p:XRp: X \to R, through the square (5); this completes the proof.

We return to the proof of Theorem ; it remains to show that the equalizer j:P G(Y)GPYj: P_G(Y) \to G P Y of the pair of maps in (2) is the power object of the coalgebra (Y,η:YGY)(Y, \eta: Y \to G Y). Let (X,θ:XGX)(X, \theta: X \to G X) be a coalgebra, and f:XP G(Y)f: X \to P_G(Y) a coalgebra map. Then g=jf:XGPYg = j f: X \to G P Y equalizes the maps in (2). Moreover, since GPYG P Y is cofree, we have that to each GG-coalgebra map g:XGPYg: X \to G P Y, there exists a unique h:XPYh: X \to P Y in the topos E\mathbf{E} such that

g=(XθGXGhGPY)g = (X \stackrel{\theta}{\to} G X \stackrel{G h}{\to} G P Y)

Thus g:XGPYg: X \to G P Y in E G\mathbf{E}_G or h:XPYh: X \to P Y in E\mathbf{E} corresponds to a subobject i:RX×Yi: R \hookrightarrow X \times Y in E\mathbf{E}. We need to check that the condition that gg equalizes the pair of maps in (2) corresponds precisely to the condition that RR is a subcoalgebra of X× G1YX \times_{G 1} Y (the product of XX and YY in E G\mathbf{E}_G). Or, taking into account Lemmas and : precisely to the condition that there is a pullback diagram in E\mathbf{E} of the form

(7)R β GR k Gk X× G1Y γ G(X× G1Y)\array{ R & \stackrel{\beta}{\to} & G R \\ \mathllap{k} \downarrow & & \downarrow \mathrlap{G k} \\ X \times_{G 1} Y & \stackrel{\gamma}{\to} & G(X \times_{G 1} Y) }

where γ\gamma is the coalgebra structure on X× G1YX \times_{G 1} Y, and kk is monic.

Proof of Theorem 1

That g:XGPYg: X \to G P Y equalizes the pair of maps in (2) translates to a condition on h:XPYh: X \to P Y, that the following diagram commutes:

X h PY θ g Pη GX Gh GPY ϕ PGY\array{ & & X & \stackrel{h}{\to} & P Y \\ & \theta \swarrow & \downarrow g & & \uparrow P\eta \\ G X & \stackrel{G h}{\to} & G P Y & \stackrel{\phi}{\to} & P G Y }

Or, that the subobject i:RX×Yi: R \hookrightarrow X \times Y equals the subobject classified by

XθGXGhGPYϕPGYPηPY.X \stackrel{\theta}{\to} G X \stackrel{G h}{\to} G P Y \stackrel{\phi}{\to} P G Y \stackrel{P\eta}{\to} P Y.

First, remembering how ϕ\phi was defined, the subobject classified by ϕGh:GXPGY\phi \circ G h: G X \to P G Y appears as the left-hand column in a composite pullback

GR G( Y) Gi G(X×Y) G(h×1) G(PY×Y) G(π 1),G(π 2) G(π 1),G(π 2) GX×GY Gh×1 GPY×GY\array{ G R & \to & G(\ni_Y) \\ \mathllap{G i} \downarrow & & \downarrow \\ G(X \times Y) & \stackrel{G(h \times 1)}{\to} & G(P Y \times Y) \\ \mathllap{\langle G(\pi_1), G(\pi_2) \rangle} \downarrow & & \downarrow \mathrlap{\langle G(\pi_1), G(\pi_2) \rangle} \\ G X \times G Y & \stackrel{G h \times 1}{\to} & G P Y \times G Y }

Then (postcomposing ϕGh\phi \circ G h by PηP\eta and precomposing with θ\theta), the condition is that the subobject of X×YX \times Y classified by PηϕGhθP\eta \circ \phi \circ G h \circ \theta, given by the left-hand column in the large pullback diagram below, matches the subobject i:RX×Yi: R \hookrightarrow X \times Y:

R GR G( Y) k Gi X× G1Y θ× G1η GX× G1GY G(PY×Y) X×Y θ×η GX×GY Gh×1 GPY×GY\array{ R & \to & G R & \to & G(\ni_Y) \\ \mathllap{k} \downarrow & & G i \downarrow & & \downarrow \\ X \times_{G 1} Y & \stackrel{\theta \times_{G 1} \eta}{\to} & G X \times_{G 1} G Y & \to & G(P Y \times Y) \\ \downarrow & & \downarrow & & \downarrow \\ X \times Y & \stackrel{\theta \times \eta}{\to} & G X \times G Y & \stackrel{G h \times 1}{\to} & G P Y \times G Y }

This condition amounts to saying that the top left square is a pullback (and that i:RX×Yi: R \to X \times Y factors through the inclusion X× G1YX×YX \times_{G 1} Y \hookrightarrow X \times Y), since the other three squares are obviously pullbacks already if GG preserves pullbacks. That the top left square is a pullback is tantamount to having a pullback (through which the top left square factors)

R GR k Gk X× G1Y γ G(X× G1Y)\array{ R & \to & G R \\ \mathllap{k} \downarrow & & \downarrow \mathrlap{G k} \\ X \times_{G 1} Y & \stackrel{\gamma}{\to} & G(X \times_{G 1} Y) }

precisely as in (7). Thus the proof of Theorem is complete.


This theorem guarantees that slices of toposes are toposes. It also guarantees that categories of internal presheaves of a topos are toposes, which is another level of variability. To outline the way this works: an internal category is identified with a monad in a bicategory of internal spans, and internal presheaves over that category with algebras of that monad. This monad has a right adjoint, and such carries a structure of left exact comonad, whose coalgebras are naturally identified with algebras of the monad. Thus in summary, the category of presheaves E C op\mathbf{E}^{C^{op}} of an internal category CC is comonadic over the slice E/C 0×C 0\mathbf{E}/C_0 \times C_0, and the theorem above applies. Of course all of this is well covered in the standard texts of topos theory (or at least in the good ones!), so we won’t go into it further here.


This theorem can be generalized still further, in such way to specialize also to the result that the category of algebras of a lex idempotent monad on a topos is again a topos. This covers a further level of generality, by allowing passage from toposes of presheaves to categories of sheaves, which form such a category of algebras. Thus this omnibus theorem covers in one fell swoop three major theorems of topos theory, covering the passage from toposes to slices, from slices to presheaves, and from presheaves to sheaves. Such an omnibus theorem may be developed either by using Toby Kenney’s theory of diads on a topos, or the present author’s “modal operators” on a topos.

The hard part is over. Much simpler is the observation that in the slice topos case, the right adjoint X *:EE/XX^\ast: \mathbf{E} \to \mathbf{E}/X to the forgetful functor Σ X:E/XE\Sigma_X: \mathbf{E}/X \to \mathbf{E} is a logical functor, i.e., preserves finite limits and power objects.


Of course it preserves finite limits, being a right adjoint. The other claim is that for any object YY of E\mathbf{E}, the power object of X *YX^\ast Y, i.e. of the object π 1:X×YX\pi_1: X \times Y \to X in E/X\mathbf{E}/X, is π 1:X×PYX\pi_1: X \times P Y \to X. This can be checked directly: if f:AXf: A \to X is any object in E/X\mathbf{E}/X, then the product of f:AXf: A \to X and π 1:X×YX\pi_1: X \times Y \to X is the composite A×Yπ 1AfXA \times Y \stackrel{\pi_1}{\to} A \stackrel{f}{\to} X, and any map into the claimed power object,

A f,χ X×PY f π 1 X,\array{ A & \stackrel{\langle f, \chi \rangle}{\to} & X \times P Y \\ & \mathllap{f} \searrow & \downarrow \mathrlap{\pi_1} \\ & & X, }

has as its essential datum a map χ:APY\chi: A \to P Y in E\mathbf{E}, corresponding to subobjects of A×YA \times Y (in E\mathbf{E}, but which we may view as lying over XX as above). Thus such maps are in natural bijection with subobjects of A×YA \times Y as an object in E/X\mathbf{E}/X, as was to be shown.

Consequences of the slice theorem

We see now that for a topos E\mathbf{E}, each slice E/X\mathbf{E}/X is cartesian closed (and of course finitely complete). We say that E\mathbf{E} is locally cartesian closed.


If E\mathbf{E} is finitely complete and cartesian closed, then the functor X *:EE/XX^\ast: \mathbf{E} \to \mathbf{E}/X has a right adjoint Π X:E/XE\Pi_X: \mathbf{E}/X \to \mathbf{E}.


For an object f:YXf: Y \to X of E/X\mathbf{E}/X, let Π X(f)\Pi_X (f) be the object of sections of ff, i.e., the equalizer of the diagram

Y X f X X X ! [1 X] 1\array{ Y^X & \stackrel{f^X}{\to} & X^X \\ & \mathllap{!} \searrow & \uparrow \mathrlap{[1_X]} \\ & & 1 }

Maps AΠ X(f)A \to \Pi_X(f) in E\mathbf{E} are in natural bijection with maps h:AY Xh: A \to Y^X such that f Xh=[1 X]!hf^X \circ h = [1_X] \circ ! \circ h. Using the adjunction X×() XX \times - \dashv (-)^X, such maps are in natural bijection with maps h^:X×AY\hat{h}: X \times A \to Y such that

Y f X h^ π 1 1 X X×A X×! X×1 X,\array{ Y & \stackrel{f}{\to} & X & & \\ \mathllap{\hat{h}} \uparrow & \nearrow \mathrlap{\pi_1} & & \nwarrow \mathrlap{1_X} \\ X \times A & \underset{X \times !}{\to} & X \times 1 & \stackrel{\sim}{\to} & X, }

or in other words, maps X *AfX^\ast A \to f in E/X\mathbf{E}/X.

Therefore X *:EE/XX^\ast: \mathbf{E} \to \mathbf{E}/X, in addition to being a logical functor between toposes, is also a left adjoint and a right adjoint.


This may be relativized further, by observing that if f:YXf: Y \to X is an object of the topos E/X\mathbf{E}/X, then

(E/X)/fE/Y(\mathbf{E}/X)/f \simeq \mathbf{E}/Y

and the functor f *:E/XE/Yf^\ast: \mathbf{E}/X \to \mathbf{E}/Y, which may be described as pulling back along ff, is also a logical functor and has adjoints on each side, Σ ff *Π f\Sigma_f \dashv f^\ast \dashv \Pi_f. A powerful consequence is that any colimits in E/X\mathbf{E}/X, which are computed as in E\mathbf{E} by Remark , are preserved by pulling back along morphisms f:YXf: Y \to X. We say that colimits are stable under pullback.

For instance, once we prove that coproducts X+YX + Y exist, with monic and disjoint coproduct inclusions XX+YX \to X + Y and YX+YY \to X + Y, then armed with the fact that coproducts are stable under pullback, we obtain the result that toposes E\mathbf{E} are extensive categories.

Similarly, once we show that coequalizers exist, we infer (from their stability under pullback) that toposes are regular categories. As part of this, we will exhibit the (regular epi)-mono factorization of a map f:XYf: X \to Y.

Internal logic

One respect which sets topos theory apart from ordinary set theory is that a topos has its own form of “internal logic”. In the most familiar topos SetSet, this logic will be the one we are all familiar with: good old-fashioned logic with two truth values truetrue and falsefalse, formulas which are manipulated according to the laws of Boolean algebra, and quantification with the familiar meanings of “all” and “some”.

However, in working with a general topos, some of these expectations must be adjusted. In general there may be many truth values (which are points 1Ω1 \to \Omega), and some of the cozy comforts of classical logic (the law of excluded middle for example) must be abandoned in general – not out of some whim or philosophical attraction to intuitionism, but because that’s simply the way things turn out to be in the inner world of a general topos.

For people whose experience with formal logic has been in terms of truth tables and the laws of Boolean algebra, and various reductions which say that the operations ¬,,\neg, \wedge, \forall are enough, there is inevitably a learning curve here. Readers of standard texts may be assisted by various formal devices which give rules for reasoning in a general topos, such as the Mitchell-Bénabou language and Kripke-Joyal semantics. All that is fine, but what I want to emphasize in these notes is what logic (broadly conceived) is really, to use the words of Lawvere: an interlocking system of certain types of adjoint functors. The sooner one thinks of the logical operations as no longer described by truth table assignments or Venn diagrams and the like but directly in terms of universal properties, the quicker the process will be.

Thus, internal logic is simply a reflection of the categorical properties one can infer about posets of subobjects ordered by inclusion, and functors (such as pullback functors) between them. For example, logical conjunction reflects meets which are cartesian products in Sub(X)Sub(X); logical disjunction reflects joins which are coproducts in Sub(X)Sub(X). Logical implication reflects how Sub(X)Sub(X) is (as we shall see) cartesian closed. And quantification, according to one of the great lessons taught by Lawvere, is a reflection of there being an adjoint (on one side or the other) to pullback functors of the form f *:Sub(Y)Sub(X)f^\ast: Sub(Y) \to Sub(X). The rules of inference for logic in a general topos are just those we can derive from such adjoint relationships and general results such as the slice theorem, which guarantees that pullback functors f *:E/YE/Xf^\ast: \mathbf{E}/Y \to \mathbf{E}/X are logical functors with adjoints on both sides. Such rules will be developed as we go along.

Internal versus external

In developing of the internal logic of a topos E\mathbf{E}, a useful heuristic6 is that the subobject classifier Ω\Omega represents, more or less by definition, the contravariant functor Sub:E opSetSub: \mathbf{E}^{op} \to Set mapping an object XX to its set of subobjects Sub(X)Sub(X), and mapping a morphism f:XYf: X \to Y in E\mathbf{E} to the function Sub(f)=f *():Sub(Y)Sub(X)Sub(f) = f^\ast(-): Sub(Y) \to Sub(X) that sends a subobject ii of YY to its pullback (aka inverse image) along ff, as a subobject of XX:

f 1(T) T f *i pb i X f Y\array{ f^{-1}(T) & \to & T \\ \mathllap{f^\ast i} \downarrow & pb & \downarrow \mathrlap{i} \\ X & \underset{f}{\to} & Y }

In more detail, the heuristic is first to interpret familiar logical operations, for example conjunction, as natural operations on subobjects, for example the intersection operator

:Sub(X)×Sub(X)Sub(X)\cap: Sub(X) \times Sub(X) \to Sub(X)

which takes a pair of subobjects to their pullback. This is the external view.

Second, by the representability, this corresponds to a natural transformation

hom E(,Ω)×hom E(,Ω)hom E(,Ω)\hom_\mathbf{E}(-, \Omega) \times \hom_\mathbf{E}(-, \Omega) \to \hom_\mathbf{E}(-, \Omega)

or to a natural transformation hom E(,Ω×Ω)hom E(,Ω)\hom_\mathbf{E}(-, \Omega \times \Omega) \to \hom_\mathbf{E}(-, \Omega), which by the Yoneda lemma corresponds to a morphism in E\mathbf{E},

:Ω×ΩΩ,\wedge: \Omega \times \Omega \to \Omega,

called “internal conjunction” or “internal meet”. This is the internal view. Basically one builds up the internal logic of a topos by playing the external and internal views off of each other, building up a repertoire of internal logical operations.


Of course there is nothing that formally implies a topos is locally small or well-powered or anything else that justifies considering functors to SetSet and their representability; indeed, there should be no dependence on SetSet if topos theory and its offshoots can serve as an independent formal foundations of mathematics. However, the heuristic is useful as a guide toward writing down formal proofs, and as a pedagogical device.

Building up the logical operations proceeds in a certain order: in some sense the definition of topos is naturally “biased” towards negative types (finite limits, exponentials) and so it is natural to begin with the correspondingly negative logical operations: t=truet = true, \wedge, \Rightarrow, \forall, before tackling the positive operations: =false\bot = false, \vee, \exists.

Internal meet

Let us now work through this procedure to construct the internal meet in more detail. In the first place, the isomorphism hom E(,Ω)Sub()\hom_\mathbf{E}(-, \Omega) \stackrel{\sim}{\to} Sub(-) is given, à la Yoneda, by a representing or universal element in Sub(Ω)Sub(\Omega). This of course is the universal subobject t:1Ωt: 1 \hookrightarrow \Omega.

Now, the external operation :Sub(X)×Sub(X)Sub(X)\cap: Sub(X) \times Sub(X) \to Sub(X) is natural: given f:XYf: X \to Y, the diagram

Sub(Y)×Sub(Y) Sub(Y) f *×f * f * Sub(X)×Sub(X) Sub(X)\array{ Sub(Y) \times Sub(Y) & \stackrel{\cap}{\to} & Sub(Y) \\ \mathllap{f^\ast \times f^\ast} \downarrow & & \downarrow \mathrlap{f^\ast} \\ Sub(X) \times Sub(X) & \stackrel{\cap}{\to} & Sub(X) }

commutes because f *f^\ast preserves pullbacks. By representability and Yoneda, the natural transformation \cap is classified by a map Ω×ΩΩ\Omega \times \Omega \to \Omega produced by chasing an identity element as follows (now using E(a,b)\mathbf{E}(a, b) as abbreviation of hom E(a,b)\hom_\mathbf{E}(a, b)):

E(Ω×Ω,Ω×Ω) E(Ω×Ω,Ω)×E(Ω×Ω,Ω) Sub(Ω×Ω)×Sub(Ω×Ω) Sub(Ω×Ω) E(Ω×Ω,Ω) 1 Ω×Ω (Ω×Ωπ 1Ω,Ω×Ωπ 2Ω) (π 1 1(1tΩ),π 2 1(1tΩ) π 1 1(t)π 2 1(t) χ π 1 1(t)π 2 1(t).\array{ \mathbf{E}(\Omega \times \Omega, \Omega \times \Omega) & \stackrel{\sim}{\to} & \mathbf{E}(\Omega \times \Omega, \Omega) \times \mathbf{E}(\Omega \times \Omega, \Omega) & \stackrel{\sim}{\to} & Sub(\Omega \times \Omega) \times Sub(\Omega \times \Omega) & \stackrel{\cap}{\to} & Sub(\Omega \times \Omega) & \stackrel{\sim}{\to} & \mathbf{E}(\Omega \times \Omega, \Omega) \\ 1_{\Omega \times \Omega} & \mapsto & (\Omega \times \Omega \stackrel{\pi_1}{\to} \Omega, \Omega \times \Omega \stackrel{\pi_2}{\to} \Omega) & \mapsto & (\pi_1^{-1}(1 \stackrel{t}{\to} \Omega), \pi_2^{-1}(1 \stackrel{t}{\to} \Omega) & \mapsto & \pi_1^{-1}(t) \cap \pi_2^{-1}(t) & \mapsto & \chi_{\pi_1^{-1}(t) \cap \pi_2^{-1}(t)}. }

Here π 1 1(t)=(1×Ωt×1 ΩΩ×Ω)\pi_1^{-1}(t) = (1 \times \Omega \stackrel{t \times 1_{\Omega}}{\to} \Omega \times \Omega) and π 2 1(t)=(Ω×11 Ω×tΩ×Ω)\pi_2^{-1}(t) = (\Omega \times 1 \stackrel{1_{\Omega} \times t}{\to} \Omega \times \Omega); their intersection is 1×1t×tΩ×Ω1 \times 1 \stackrel{t \times t}{\to} \Omega \times \Omega. Thus the internal meet is =χ t×t:Ω×ΩΩ\wedge = \chi_{t \times t}: \Omega \times \Omega \to \Omega. The basic properties of \wedge (associativity, commutativity, idempotence) follow immediately from the corresponding properties of the external operation \cap (which in turn are argued by applying universal properties of pullback). The unit of \wedge is of course t:1Ωt: 1 \to \Omega, which is the maximal element (i.e., classifies the maximal subobject 1 Ω1_\Omega).

Internal implication

Now we construct an internal implication operator :Ω×ΩΩ\Rightarrow: \Omega \times \Omega \to \Omega, corresponding to a natural external operation X:Sub(X)×Sub(X)Sub(X)\Rightarrow_X: Sub(X) \times Sub(X) \to Sub(X) on subobjects that is appropriately adjoint to \wedge, i.e., satisfies

ABCiffAB XCA \cap B \hookrightarrow C \qquad iff \qquad A \hookrightarrow B \Rightarrow_X C

for subobjects A,B,CA, B, C of XX.

In this case it is not immediately obvious that there is such an external operation, but obviously it is unique if it exists, and one can try to work out what the corresponding internal operation would look like. We start by working our way to the external view: the internal operation \Rightarrow would be the classifying map of the subobject *(1tΩ)\Rightarrow^\ast(1 \stackrel{t}{\to} \Omega), and so we proceed to analyze this subobject.

A point 1u,vΩ×Ω1 \stackrel{\langle u, v \rangle}{\to} \Omega \times \Omega belongs to (i.e., factors through) the subobject 1(t)Ω×Ω\Rightarrow^{-1}(t) \hookrightarrow \Omega \times \Omega if and only if t=uvt = u \Rightarrow v. The same holds true for generalized points Xu,vΩ×ΩX \stackrel{\langle u, v \rangle}{\to} \Omega \times \Omega, where now “tt” is interpreted by pulling back along X1X \to 1, i.e., we define t X=X!1tΩt_X = X \stackrel{!}{\to} 1 \stackrel{t}{\to} \Omega: this is the universal subobject t:1Ωt: 1 \to \Omega when interpreted in the topos E/X\mathbf{E}/X. (Remember that the pulling-back functor X *:EE/XX^\ast: \mathbf{E} \to \mathbf{E}/X is logical.) In general, points u,v:1Ωu, v: 1 \to \Omega in E/X\mathbf{E}/X correspond to maps u,v:XΩu, v: X \to \Omega in E\mathbf{E}.


Here might be a good place to lay down a precept on interpreting set-theoretic language in topos theory. In set theory, we describe a set AA as consisting of points aAa \in A or, leaning a little closer to categorical notation, as points a:1Aa: 1 \to A. Basically we can say the same thing in topos theory, provided that we are willing to interpret points a:1Aa: 1 \to A consistently across all slices E/X\mathbf{E}/X. Such a point X *1X *AX^\ast 1 \to X^\ast A in the slice E/X\mathbf{E}/X is tantamount to a morphism XAX \to A in E\mathbf{E}, and it is reasonable to say that an object AA “consists of” (is determined by) such generalized points XAX \to A, in accordance with the Yoneda lemma. So, when we describe an subobject AA' of AA as consisting of points a:1Aa: 1 \to A such that (…), that’s what we mean.

Thus, working in a general slice E/X\mathbf{E}/X, let us interpret the condition t=uvt = u \Rightarrow v, passing now to the external picture. If [u][u] denotes the subobject of the terminal classified by the point uu, then the external condition [t]=[u][v][t] = [u] \Rightarrow [v] is equivalent to the condition [t][u][v][t] \hookrightarrow [u] \Rightarrow [v] (since [t][t] is maximal), which by the adjunction would be equivalent to [t][u][v][t] \cap [u] \hookrightarrow [v], or just [u][v][u] \hookrightarrow [v]. Let us write this as [u][v][u] \subseteq [v], in keeping with set-theoretic notation.

Thus we define :Ω×ΩΩ\Rightarrow: \Omega \times \Omega \to \Omega to be the classifying map of the subobject of Ω×Ω\Omega \times \Omega consisting of points (u,v)(u, v) such that [u][v][u] \subseteq [v] (uniformly over all slices). This last condition coincides with [u]=[u][v][u] = [u] \cap [v], and so to complete the construction we define the subobject ()Ω×Ω(\subseteq) \hookrightarrow \Omega \times \Omega to be the equalizer of the two maps

π 1,:Ω×ΩΩ.\pi_1, \wedge: \Omega \times \Omega \rightrightarrows \Omega.

It remains to check that this works as advertised: that for points u,v,w:1Ωu, v, w: 1 \to \Omega we have (externally)

[u][v][w]iff[u][vw][u] \cap [v] \subseteq [w] \qquad iff \qquad [u] \subseteq [v \Rightarrow w]

(again, as always now, this is meant to be interpreted over all slice toposes). The condition on the right means that any (generalized) element ee of [u][u] satisfies [v(ie)][w(ie)][v(i e)] \subseteq [w(i e)], where ii is the inclusion [u]1[u] \to 1. It is sufficient to consider the generic element where e=1 [u]e = 1_{[u]}, where we have a pullback

[v(ie)] [v] 1 t [u] ie 1 v Ω\array{ [v(i e)] & \to & [v] & \to & 1 \\ \downarrow & & \downarrow & & \downarrow \mathrlap{t} \\ [u] & \underset{i e}{\to} & 1 & \underset{v}{\to} & \Omega }

from which we see [v(ie)]=[u][v][v(i e)] = [u] \cap [v]. Thus the condition on the right is [u][v][u][w][u] \cap [v] \subseteq [u] \cap [w]; as this is equivalent to [u][v][w][u] \cap [v] \subseteq [w], we are done.

Internal universal quantification

So far we have dealt just with propositional logic, internalized in terms of operations Ω nΩ\Omega^n \to \Omega within the topos E\mathbf{E}. These operations pull back to any slice E/X\mathbf{E}/X, and externalize to operations on Sub(X)Sub(X) over each individual object (or “type”) XX.

In predicate logic, quantification involves operations that change the type: if ϕ:X×AΩ\phi: X \times A \to \Omega is a predicate of type X×AX \times A, then quantification over AA produces a predicate of type XX, e.g., Aϕ:XΩ\forall_A \phi: X \to \Omega. Externally, universal quantification is a natural operation Sub(X×A)Sub(X)Sub(X \times A) \to Sub(X). Internally, universal quantification is given by a map

A:PAP1\forall_A: P A \to P 1

which classifies a subobject A *(t)PA\forall_A^\ast(t) \hookrightarrow P A. An element ϕ:1PA\phi: 1 \to P A belongs to (factors through) A *(t)\forall_A^\ast(t) if t= Aϕt = \forall_A \phi. In ordinary logic terms, this should mean ϕ(a)=t\phi(a) = t for all elements aa of AA, i.e., that ϕ\phi is constantly true over all of AA: ϕ=t A=(A1tA)\phi = t_A = (A \to 1 \stackrel{t}{\to} A). This leads us to conclude that A *(t)=t A:1PA\forall_A^\ast(t) = t_A: 1 \hookrightarrow P A, and therefore that

A=χ t A:PAP1.\forall_A = \chi_{t_A}: P A \to P 1.

The fundamental property of universal quantification, originally observed by Lawvere, is as follows.


Universal quantification is right adjoint to pulling back. At the external level: for f:XPYf: X \to P Y, let [f][f] denote the subobject of X×YX \times Y classified by ff. Then for predicates ϕ:XPA\phi: X \to P A and ψ:XP1\psi: X \to P 1, we have

A *[ψ][ϕ]iff[ψ][ Aϕ].A^\ast [\psi] \hookrightarrow [\phi] \qquad iff \qquad [\psi] \hookrightarrow [\forall_A \phi].

At this point we probably have a variety of possible proofs, but in elementary category terms we have by definition a pullback

[ Aϕ] 1 1 t A t X ϕ PA A P1\array{ [\forall_A \phi] & \to & 1 & \to & 1 \\ \downarrow & & \downarrow \mathrlap{t_A} & & \downarrow \mathrlap{t} \\ X & \underset{\phi}{\to} & P A & \underset{\forall_A}{\to} & P 1 }

and so the condition [ψ][ Aϕ][\psi] \hookrightarrow [\forall_A \phi] is equivalent to commutativity of the diagram

[ψ] 1 t A X ϕ Ω A\array{ [\psi] & \to & 1 \\ \downarrow & & \downarrow \mathrlap{t_A} \\ X & \underset{\phi}{\to} & \Omega^A }

By the adjunction ×A() A- \times A \dashv (-)^A, such a diagram transforms to a commutative diagram

[ψ]×A proj A 1 t X×A Ω\array{ [\psi] \times A & \stackrel{proj}{\to} & A & \to & 1 \\ \downarrow & & & \swarrow \mathrlap{t} & \\ X \times A & \to & \Omega }

and this is clearly equivalent to the condition A *[ψ][ϕ]A^\ast [\psi] \hookrightarrow [\phi].

The other half of Lawvere’s insight is that existential quantification is left adjoint to pulling back. We will come to this later.

One import of this insight is that quantification has a wider meaning than is found in traditional logic: whereas the traditional quantification a:Aϕ(x,a)\forall_{a: A} \phi(x, a) has to do with a right adjoint to pulling back along a projection map π 1:X×AX\pi_1: X \times A \to X (in SetSet), in principle we can consider a right adjoint to pulling back along any map f:YXf: Y \to X. One way to consider this is topos theory is that, working in a slice E/X\mathbf{E}/X and considering an object f:YXf: Y \to X ….

Internal intersection of definable families

Now we put some of the logic we have developed to use, defining an internal intersection operator

:PPAPA\bigcap: P P A \to P A

which externally takes a family FPAF \hookrightarrow P A to its intersection FA\bigcap F \hookrightarrow A.

To do this, we simply write down the appropriate logical formula for the relation RPPA×AR \hookrightarrow P P A \times A to be classified by \bigcap: it consists of points (F,a): S:PA(FS)(Sa)(F, a): \forall_{S: P A} (F \ni S) \Rightarrow (S \ni a). We remember here the precept that this formula is to be interpreted uniformly over all slices E/X\mathbf{E}/X.

Internal union of definable families

Images and existential quantification

Construction of coproducts

We construct coproducts in an elementary topos one does in naive set theory: as disjoint unions. Letting X,YX, Y be sets (objects in an ETCS category), a disjoint union of XX and YY is a pair of monos

i:XZj:YZi: X \hookrightarrow Z \qquad j: Y \hookrightarrow Z

whose intersection in Sub(Z)Sub(Z) is empty (i.e., initial), and whose union or join in Sub(Z)Sub(Z) is all of ZZ. We will show that disjoint unions exist and are essentially unique, and that they satisfy the universal property for coproducts. We use the notation X+YX + Y for a disjoint union.

Construction of initial object

We first construct an initial object 00 by taking the intersection of all subobjects of 11, i.e., as the subobject 010 \hookrightarrow 1 classified by the composite

1t P1PP1P1.1 \stackrel{t_{P 1}}{\to} P P 1 \stackrel{\bigcap}{\to} P 1.

00 is the minimal subobject of 11.


Let U1U \subseteq 1 be any subobject, classified by u=χ U:1P1u = \chi_U: 1 \to P 1. The singleton σu:1PP1\sigma u: 1 \to P P 1 satisfies σut P1\sigma u \leq t_{P 1} since t P1t_{P 1} is the internal maximum, and so χ 0=t P1σu=u\chi_0 = \bigcap t_{P 1} \leq \bigcap \sigma u = u.


00 is initial.


If XX is any object, then for arrows f,g:0Xf, g: 0 \rightrightarrows X the equalizer E0E \to 0 is a subobject of 00 and therefore equals 00 as a minimal subobject of 11, so f=gf = g. This proves uniqueness of arrows 0X0 \to X. For existence, the pullback of the mono σ X:XPX\sigma_X: X \to P X in the diagram

X σ X 0 1 t X PX\array{ & & & & X \\ & & & & \downarrow \mathrlap{\sigma_X} \\ 0 & \to & 1 & \stackrel{t_X}{\to} & P X }

gives a subobject of 00 which must be 00 again, so there is an arrow 0X0 \to X from the pullback, and we are done.


0×X00 \times X \cong 0.


The functor ×X- \times X is a left adjoint by cartesian closure, and hence preserves initial objects.


The initial object 00 is strict, i.e., any arrow f:X0f: X \to 0 must be an isomorphism. (Therefore, for any object YY, the arrow 0Y0 \to Y is vacuously monic, and gives the minimum element of Sub(Y)Sub(Y).)


Given f:X0f: X \to 0, the arrow f,1 X:X0×X\langle f, 1_X \rangle: X \to 0 \times X has a retraction given by projection π:0×XX\pi: 0 \times X \to X. So XX is a retract of 0×X0 \times X which is initial, making XX initial itself.

Incidentally, we may now define the negation operator:


Let χ 0:1Ω\bot \coloneqq \chi_0: 1 \to \Omega be the classifying map of 010 \hookrightarrow 1. Then negation is defined to be ¬χ :ΩΩ\neg \coloneqq \chi_\bot: \Omega \to \Omega. Alternatively, ¬\neg may be defined to be ()(-) \Rightarrow \bot.

Construction of disjoint unions


For any two objects X,YX, Y, a disjoint union of XX and YY exists.


It’s enough to embed X,YX, Y disjointly into some CC, since the union of the two monos in Sub(C)Sub(C) would then be the requisite ZZ. The idea now is that if a disjoint union or coproduct X+YX+Y exists, then there’s a canonical isomorphism P(X+Y)PX×PYP(X + Y) \cong P X \times P Y. Since the singleton map

σ:X+YP(X+Y)PX×PY\sigma : X + Y \to P(X + Y) \cong P X \times P Y

is monic, one thus expects to be able to embed XX and YY disjointly into PX×PYP X \times P Y. Since we can easily work out how all this goes in naive structural set theory, we just write out the formulas and check it works.

In detail, define i X:XPX×PYi_X : X \to P X \times P Y to be

XX×1σ X×χ 0PX×PYX \cong X \times 1 \stackrel{\sigma_X \times \chi_0}{\to} P X \times P Y

where σ X\sigma_X is the singleton mapping and χ 0\chi_0 classifies 0Y0 \hookrightarrow Y; similarly, define i Y:YPX×PYi_Y :Y \to P X \times P Y to be

Y1×Yχ 0×σ YPX×PY.Y \cong 1 \times Y \stackrel{\chi_0 \times \sigma_Y}{\to} P X \times P Y.

Clearly i Xi_X and i Yi_Y are monic, so to show disjointness we just have to show that their pullback is initial. But their pullback is isomorphic to the cartesian product of the pullbacks of the diagrams

Xσ XPXχ 01,1χ 0PYσ YYX \stackrel{\sigma_X}{\to} P X \stackrel{\chi_0}{\leftarrow} 1, \qquad 1 \stackrel{\chi_0}{\to} P Y \stackrel{\sigma_Y}{\leftarrow} Y

so it would be enough to show that each (or just one) of these pullbacks is initial, say the first. Suppose given a map h:AXh: A \to X which makes the square

A 1 h χ 0 X σ X PX\array{ A & \to & 1 \\ \mathllap{h} \downarrow & & \downarrow \mathrlap{\chi_0} \\ X & \underset{\sigma_X}{\to} & P X }

commute. The composite A1χ 0PXA \to 1 \stackrel{\chi_0}{\to} P X classifies

A×0A×XA \times 0 \hookrightarrow A \times X

which is initial. This must be the same subobject as classified by σ Xh=χ δh\sigma_X h = \chi_\delta h (where δ:XX×X\delta: X \to X \times X is the diagonal), which is

(h×1 X) *(δ)A×X.(h \times 1_X)^\ast(\delta) \hookrightarrow A \times X.

An easy calculation shows this subobject to be the equalizer of the pair of maps

π 2,hπ 1:A×XX.\pi_2, h\pi_1: A \times X \rightrightarrows X.

So this equalizer EE is initial. But notice 1,h:AA×X\langle 1, h \rangle: A \to A \times X equalizes this pair of maps. Therefore we have a map AE0A \to E \cong 0; by Corollary , we infer A0A \cong 0. This applies to the case where AA is the pullback, so the pullback is initial, as was to be shown.

Disjoint unions are coproducts


Any two disjoint unions of XX and YY are canonically isomorphic.


Let us consider the disjoint union of XX and YY constructed in Theorem as the “standard” disjoint union, which we will denote as X+YX + Y. Suppose ZZ is another disjoint union of XX and YY, i.e., suppose i:XZi: X \to Z and j:YZj: Y \to Z are two disjoint embeddings such that their union in Sub(Z)Sub(Z) is all of ZZ. We construct a comparison isomorphism X+YZX + Y \cong Z; we begin by defining a map ϕ=ϕ 1,ϕ 2:ZPX×PY\phi = \langle \phi_1, \phi_2 \rangle: Z \to P X \times P Y, where ϕ 1:ZPX\phi_1: Z \to P X classifies i,1 X:XZ×X\langle i, 1_X \rangle: X \hookrightarrow Z \times X, and ϕ 2:ZPY\phi_2: Z \to P Y classifies j,1 Y:YZ×Y\langle j, 1_Y \rangle: Y \hookrightarrow Z \times Y. It is not hard to see that ϕi:XPX×PY\phi i: X \to P X \times P Y and ϕj:YPY\phi j: Y \to P Y coincide with the embeddings i X:XPXi_X: X \to P X and i Y:YPYi_Y: Y \to P Y constructed in Theorem , using the fact that i,ji, j are disjoint embeddings.

Our first task is to show ϕ\phi is monic. If we introduce a pullback

U p 1 Z p 2 ϕ=ϕ 1,ϕ 2 Z ϕ PX×PY\array{ U & \stackrel{p_1}{\to} & Z \\ \mathllap{p_2} \downarrow & & \downarrow \mathrlap{\phi = \langle \phi_1, \phi_2 \rangle} \\ Z & \underset{\phi}{\to} & P X \times P Y }

then it suffices to show the pullback projections are equal: p 1=p 2p_1 = p_2. Or equivalently, that if k:EUk: E \hookrightarrow U is the equalizer of p 1,p 2:UZp_1, p_2: U \rightrightarrows Z, then the subobject EE of UU is all of UU.

From ϕp 1=ϕp 2\phi p_1 = \phi p_2 we have ϕ 1p 1=ϕ 1p 2\phi_1 p_1 = \phi_1 p_2 and ϕ 2p 1=ϕ 2p 2\phi_2 p_1 = \phi_2 p_2. So for example, from the first equation we have (p 1×1 X) *i,1 X=(p 2×1 X) *i,1 X(p_1 \times 1_X)^\ast \langle i, 1_X \rangle = (p_2 \times 1_X)^\ast \langle i, 1_X \rangle as subobjects of U×XU \times X, say f,g:WU×X\langle f, g \rangle: W \to U \times X as in a pullback

W g X X f,g pb i,1 U×X p 1×1 Z×X ϕ 1×1 PX×X.\array{ W & \stackrel{g}{\to} & X & \to & \ni_X \\ \mathllap{\langle f, g \rangle} \downarrow & pb & \downarrow \mathrlap{\langle i, 1 \rangle} & & \downarrow \\ U \times X & \underset{p_1 \times 1}{\to} & Z \times X & \underset{\phi_1 \times 1}{\to} & P X \times X. }

Because those subobjects are the same f,g:WU×X\langle f, g \rangle: W \hookrightarrow U \times X, we deduce that p 1f=ig=p 2fp_1 f = i g = p_2 f. Composing the square marked ‘pb’ with the pullback

U×X p 1×1 Z×X π U π Z U p 1 Z\array{ U \times X & \stackrel{p_1 \times 1}{\to} & Z \times X \\ \mathllap{\pi_U} \downarrow & & \downarrow \mathrlap{\pi_Z} \\ U & \underset{p_1}{\to} & Z }

we have a pullback

W=p 1 *(X) X f i U p 1 Z\array{ W = p_1^\ast (X) & \to & X \\ \mathllap{f} \downarrow & & \downarrow \mathrlap{i} \\ U & \underset{p_1}{\to} & Z }

and from p 1j=p 2jp_1 j = p_2 j we deduce that p 1 *(X)Up_1^\ast(X) \hookrightarrow U factors through the equalizer k:EUk: E \hookrightarrow U. Similarly, arguing as above but replacing ϕ 1\phi_1 by ϕ 2\phi_2, we see p 1 *(Y)Up_1^\ast(Y) \hookrightarrow U factors through EUE \hookrightarrow U. So the union p 1 *(X)p 1 *(Y)p_1^\ast(X) \cup p_1^\ast(Y) is contained in the subobject EE of UU. However, the pullback function p 1 *:Sub(Z)Sub(U)p_1^\ast: Sub(Z) \to Sub(U) is a lattice map (it preserves the bottom element and finite unions because it has a right adjoint induced by Π p 1\Pi_{p_1}; cf. local cartesian closure), and so we have

U=p 1 *(Z)=p 1 *(XY)=p 1 *(X)p 1 *(Y)EU = p_1^\ast(Z) = p_1^\ast(X \cup Y) = p_1^\ast(X) \cup p_1^\ast(Y) \leq E

and hence E=UE = U as subobjects of UU, forcing p 1=p 2p_1 = p_2 as claimed, and so ϕ\phi defines a subobject ZPX×PYZ \hookrightarrow P X \times P Y.

The argument above shows that ϕ:ZPX×PY\phi : Z \hookrightarrow P X \times P Y is an upper bound of i X:XPX×PYi_X : X \to P X \times P Y and i Y:YPX×PYi_Y : Y \to P X \times P Y in Sub(PX×PY)Sub(P X \times P Y). It follows that the join X+YX + Y constructed in Theorem is contained in ϕ:ZPX×PY\phi : Z \hookrightarrow P X \times P Y, and hence can be regarded as the join of XX and YY in Sub(Z)Sub(Z). But ZZ is their join in Sub(Z)Sub(Z) by assumption of being a disjoint union, so the inclusion X+YZX + Y \leq Z must be an equality. The proof is now complete.


The inclusions i X:XX+Yi_X : X \to X + Y, i Y:YX+Yi_Y : Y \to X + Y exhibit X+YX + Y as the coproduct of XX and YY.


Let f:XBf : X \to B, g:YBg : Y \to B be given functions. Then we have monos

1 X,f:XX×B1 Y,g:YY×B(1)\langle 1_X, f \rangle : X \hookrightarrow X \times B \qquad \langle 1_Y, g \rangle : Y \hookrightarrow Y \times B \qquad (1)

Now the operation ×B:Sub(C)Sub(C×B)- \times B : Sub(C) \to Sub(C \times B) certainly preserves finite meets, and also preserves finite joins because it is left adjoint to B:Sub(C×B)Sub(C)\forall_B : Sub(C \times B) \to Sub(C). Therefore this operation preserves disjoint unions; we infer that the monos

X×Bi X×B(X+Y)×BY×Bi Y×B(X+Y)×B(2)X \times B \stackrel{i_X \times B}{\to} (X + Y) \times B \qquad Y \times B \stackrel{i_Y \times B}{\to} (X + Y) \times B \qquad (2)

exhibit (X+Y)×B(X + Y) \times B as a disjoint union of X×B,Y×BX \times B, Y \times B. Composing the monos of (1) and (2), we have disjoint embeddings of XX and YY in (X+Y)×B(X + Y) \times B. Using Theorem , X+YX + Y is isomorphic to the join of these embeddings; this means we have an inclusion

1,h:X+Y(X+Y)×B\langle 1, h \rangle : X + Y \hookrightarrow (X + Y) \times B

whose restriction to XX yields i X,f\langle i_X, f \rangle and whose restriction to YY yields i Y,g\langle i_Y, g \rangle. Hence h:X+YBh : X + Y \to B extends ff and gg. It is the unique extension, for if there were two extensions h,hh, h', then their equalizer would be an upper bound of X,YX, Y in Sub((X+Y)×B)Sub((X + Y) \times B), contradicting the fact that X+YX + Y is the least upper bound. This completes the proof.


A topos is an extensive category.


Coproducts are disjoint by construction and are stable under pullback since pullback functors f *:E/BE/Af^\ast: \mathbf{E}/B \to \mathbf{E}/A preserve colimits by the remarks after Proposition .

Construction of coequalizers

In this section we prove that a topos admits coequalizers; in conjunction with Theorem , this shows that toposes are finitely cocomplete.

But the real motif of this section is the Barr exactness of a topos, which is effectively summarized by a Galois correspondence between equivalence relations on an object YY and regular epimorphisms out of YY, stable under pullback. We begin with a few preliminary observations, valid for any category E\mathbf{E}.

For an object YY, let Rel YRel_Y be the class of subobjects of p 1,p 2:AY×Y\langle p_1, p_2\rangle: A \hookrightarrow Y \times Y. As usual, this class is the partial order reflection of the preorder (thin category) given by the full subcategory of E/Y×Y\mathbf{E}/Y \times Y whose objects consist of monomorphisms into Y×YY \times Y.

Let Quot YQuot_Y be the class of quotients = co-subobjects of YY, that is to say equivalence classes of epis h:YZh: Y \twoheadrightarrow Z. This class is the partial order reflection of the preorder given by the full subcategory of the co-slice Y/EY/\mathbf{E} whose objects consist of epimorphisms out of YY (so (p:YP)(q:YQ)(p: Y \twoheadrightarrow P) \leq (q: Y \twoheadrightarrow Q) if there is a map f:PQf: P \to Q such that fp=qf \circ p = q; such an ff is unique and necessarily an epi).

There is a fundamental relation between Rel YRel_Y and Quot YQuot_Y, which we denote as \perp, where by definition p 1,p 2h\langle p_1, p_2\rangle \perp h iff hp 1=hp 2h p_1 = h p_2. For hQuot Yh \in Quot_Y, let h h^\wedge denote the class of RRel YR \in Rel_Y such that RhR \perp h. A kernel pair of hh, if it exists (as it certainly will in any category with pullbacks, for example a topos), is the maximal element of h h^\wedge. It is denoted ker(h)\ker(h).

Similarly, for R=p 1,p 2Rel YR = \langle p_1, p_2 \rangle \in Rel_Y, let R R^\vee denote the class of hQuot Yh \in Quot_Y such that RhR \perp h. A coequalizer of the pair (p 1,p 2)(p_1, p_2), if it exists, is the minimal element of R R^\vee. It is denoted coeq(p 1,p 2)coeq(p_1, p_2) or quot(R)quot(R).

If kernel pairs and coequalizers exist, then we get a monotone Galois connection between Rel YRel_Y and Quot YQuot_Y, where quot:Rel YQuot Yquot: Rel_Y \to Quot_Y is left adjoint to ker:Quot YRel Y\ker: Quot_Y \to Rel_Y. That is to say, quotp 1,p 2hquot \langle p_1, p_2 \rangle \leq h iff p 1,p 2ker(h)\langle p_1, p_2 \rangle \leq \ker(h), since both sides are equivalent to hp 1=hp 2h p_1 = h p_2.

Reduction to quotients

The goal of this section is to show that in order to prove that a topos has coequalizers, it suffices to construct quotients of equivalence relations.


Let f:XYf: X \to Y be a map with image im(f)Yim(f) \hookrightarrow Y. Then the map q:Xim(f)q: X \to im(f) through which ff factors is epic.


Suppose given g,h:im(f)Zg, h: im(f) \to Z such that gq=hqg q = h q. Then qq (and thus also ff) factors through their equalizer, a subobject of im(f)im(f). Since im(f)im(f) is the smallest subobject through which ff factors, the equalizer must be all of im(f)im(f), and therefore g=hg = h.

In combination with Corollary , we derive the following consequence.


Every morphism f:XYf: X \to Y has a unique epi-(regular) mono factorization, given by the image factorization Xim(f)YX \to im(f) \hookrightarrow Y.


For any parallel pair f,g:XYf, g: X \rightrightarrows Y, let p 1,p 2:RY×Y\langle p_1, p_2 \rangle: R \hookrightarrow Y \times Y be the image of f,g:XY×Y\langle f, g \rangle: X \to Y \times Y. Then for any h:YZh: Y \to Z, we have hf=hgh f = h g iff hp 1=hp 2h p_1 = h p_2.


The map f,g\langle f, g \rangle factors as an epi π:XR\pi: X \to R followed by the mono p 1,p 2\langle p_1, p_2 \rangle. Thus f=p 1πf = p_1 \pi and g=p 2πg = p_2 \pi, so obviously hp 1=hp 2h p_1 = h p_2 implies, hitting both sides with π\pi, that hf=hgh f = h g. Conversely, hf=hgh f = h g implies hp 1π=hp 2πh p_1 \pi = h p_2 \pi which implies hp 1=hp 2h p_1 = h p_2 since π\pi is epic.


The coequalizer of the pair (f,g)(f, g) is the coequalizer of its image pair (p 1,p 2)(p_1, p_2), provided that either coequalizer exists.


The coequalizer of either would be the minimal element in the class of those hQuot Yh \in Quot_Y such that hf=hgh f = h g, equivalently hp 1=hp 2h p_1 = h p_2.

The next proposition provides a kind of dual form of the preceding proposition and its corollary.


If f:YWf: Y \to W has epi-mono factorization YqQiWY \stackrel{q}{\twoheadrightarrow} Q \stackrel{i}{\rightarrowtail} W, then ker(f)=ker(q)\ker(f) = \ker(q).


For R=p 1,p 2Rel YR = \langle p_1, p_2 \rangle \in Rel_Y, we have qp 1=qp 2q p_1 = q p_2 iff iqp 1=iqp 2i q p_1 = i q p_2 since ii is monic, so qp 1=qp 2q p_1 = q p_2 iff fp 1=fp 2f p_1 = f p_2. The kernel pair of either qq or ff would be the maximal such relation RR.


For a relation R=p 1,p 2Rel YR = \langle p_1, p_2 \rangle \in Rel_Y, let Eq(R)=π 1,π 2Eq(R) = \langle \pi_1, \pi_2 \rangle be the minimal equivalence relation containing RR, i.e., the intersection of all equivalence relations containing RR. Then the coequalizer of (p 1,p 2)(p_1, p_2) equals the coequalizer of (π 1,π 2)(\pi_1, \pi_2), assuming either exists.


Clearly, from REq(R)R \leq Eq(R) we have that hπ 1=hπ 2h\pi_1 = h\pi_2 implies hp 1=hp 2h p_1 = h p_2. In the other direction, from hp 1=hp 2h p_1 = h p_2 we have that ker(h)ker(h) is an equivalence relation containing RR; thus Eker(h)E \leq ker(h) since EE is the smallest equivalence relation containing RR. This immediately implies that hπ 1=hπ 2h \pi_1 = h \pi_2.

Therefore p 1,p 2 =π 1,π 2 \langle p_1, p_2 \rangle^\vee = \langle \pi_1, \pi_2 \rangle^\vee. A coequalizer would be the minimal element of either class.

This proposition shows that to construct coequalizers in a topos, it suffices to construct quotients of equivalence relations.

Equivalence relations are effective


If p 1,p 2:EY×Y\langle p_1, p_2 \rangle: E \to Y \times Y is an equivalence relation, with classifying map χ E:YPY\chi_E: Y \to P Y, then χ Ep 1=χ Ep 2\chi_E \circ p_1 = \chi_E \circ p_2.


This is equivalent to the statement that (p 1×1) *E=(p 2×1) *E(p_1 \times 1)^\ast E = (p_2 \times 1)^\ast E as subobjects of E×YE \times Y, as in the pullback

(p 1×1) *E E i E×Y p 1×1 Y×Y.\array{ (p_1 \times 1)^\ast E & \to & E \\ \downarrow & & \downarrow \mathrlap{i} \\ E \times Y & \underset{p_1 \times 1}{\to} & Y \times Y. }

Since this is a statement in finite limit logic, it can be proven conceptually by a Yoneda lemma argument where it suffices to check it in the case E=Set\mathbf{E} = Set, where (p 1×1) *E=(p 2×1) *E(p_1 \times 1)^\ast E = (p_2 \times 1)^\ast E boils down to checking

{(v,w,y)Y 3:(v,w)E(v,y)E}={(v,w,y)Y 3:(v,w)E(w,y)E}\{(v, w, y) \in Y^3: (v, w) \in E \wedge (v, y) \in E\} = \{(v, w, y) \in Y^3: (v, w) \in E \wedge (w, y) \in E\}

which of course follows easily from the axioms for equivalence relations.

If this isn’t already convincing, then the formal proof may be extracted just by chasing through the Yoneda lemma. In slightly greater detail: P 1(p 1×1) *EP_1 \coloneqq (p_1 \times1)^\ast E as a subobject of Y×Y×YY \times Y \times Y is the limit of the diagram

E Y×Y×Y E i π 13 π 12 i Y×Y Y×Y \array{ E & & & & Y \times Y \times Y & & & & E \\ & \mathllap{i} \searrow & & \mathllap{\pi_{13}} \swarrow & & \searrow \mathrlap{\pi_{12}} & & \swarrow \mathrlap{i} & \\ & & Y \times Y & & & & Y \times Y & & }

where we may use v,w,yv, w, y as generic symbols to label the limit cone component v,w,y:P 1Y×Y×Y\langle v, w, y \rangle: P_1 \to Y \times Y \times Y (the other components are then uniquely determined, by monicity of ii). The limit condition simply means v,w:P 1Y×Y\langle v, w \rangle: P_1 \to Y \times Y and v,y:P 1Y×Y\langle v, y \rangle: P_1 \to Y \times Y both factor through i:EY×Yi: E \hookrightarrow Y \times Y; we may abbreviate these conditions as “(v,w)E(v, w) \in E and (v,y)E(v, y) \in E”. Similarly, P 2(p 2×1) *EP_2 \coloneqq (p_2 \times 1)^\ast E as a subobject of Y×Y×YY \times Y \times Y is defined similarly as a limit just by replacing π 13\pi_{13} in the diagram above by π 23\pi_{23}; the corresponding limit condition abbreviates to “(v,w)E(v, w) \in E and (w,y)E(w, y) \in E”, and it becomes clear how the desired statement follows from the definition of internal equivalence relation (partial equivalence relations, dropping the reflexivity axiom, would be enough).

By this proposition, we have that Eker(χ E)E \leq ker(\chi_E) for equivalence relations EE.


Equivalence relations in a topos are effective.


The claim is that an equivalence relation i:EY×Yi: E \hookrightarrow Y \times Y coincides with the kernel pair of its classifying map. We have already seen from Proposition that Eker(χ E)E \leq \ker(\chi_E) in Sub(Y×Y)Sub(Y \times Y).

To get ker(χ E)E\ker(\chi_E) \leq E, we may again appeal to the heuristic of arguing in finite limit logic where it suffices to check the claim for E=Set\mathbf{E} = Set. The claim is that if χ E(u)=χ E(v)\chi_E(u) = \chi_E(v) for u,vu, v in YY, then (u,v)E(u, v) \in E. The hypothesis χ E(u)=χ(v)\chi_E(u) = \chi(v) says

{yY:(u,y)E}={yY:(v,y)E}\{y \in Y: (u, y) \in E\} = \{y \in Y: (v, y) \in E\}

where by reflexivity, uu is a member of the right side, hence of the left side: (v,u)E(v, u) \in E. But then by symmetry, (u,v)E(u, v) \in E.

Again we may chase through the Yoneda lemma to extract a formal elementary proof. Let u,vu, v be the generic such pair, i.e., let u,v:PY×Y\langle u, v \rangle: P \hookrightarrow Y \times Y denote the kernel pair of χ E\chi_E as in the pullback

P u Y v χ E Y χ E PY\array{ P & \stackrel{u}{\to} & Y \\ \mathllap{v} \downarrow & & \downarrow \mathrlap{\chi_E} \\ Y & \underset{\chi_E}{\to} & P Y }

so that by commutativity of the diagram we have (u×1 Y) *E=(v×1) *E(u \times 1_Y)^\ast E = (v \times 1)^\ast E as in the pullback

Q E i P×Y u×1 Y×Y\array{ Q & \to & E \\ \downarrow & & \downarrow \mathrlap{i} \\ P \times Y & \underset{u \times 1}{\to} & Y \times Y }

Then u,u=(u×1)1,u:PY×Y\langle u, u \rangle = (u \times 1) \circ \langle 1, u \rangle: P \to Y \times Y factors through ii by reflexivity. Thus 1,u:PP×Y\langle 1, u \rangle: P \to P \times Y induces a map P(u×1) *E=(v×1) *EP \to (u \times 1)^\ast E = (v \times 1)^\ast E, meaning (v×1)1,u=v,u(v \times 1)\circ \langle 1, u \rangle = \langle v, u \rangle factors through i:EY×Yi: E \to Y \times Y. By symmetry, u,v:PY×Y\langle u, v \rangle: P \to Y \times Y factors through i:EY×Yi: E \to Y \times Y, i.e., ker(χ E)E\ker(\chi_E) \leq E.

Quotients of equivalence relations

In this subsection we construct quotients of equivalence relations.


Let f:YWf: Y \to W be a map, and let f *f^\ast denote the opposite relation f,1 Y:YW×Y\langle f, 1_Y \rangle: Y \hookrightarrow W \times Y, with characteristic map χ f *:WPY\chi_{f^\ast}: W \to P Y. Then the composite

YfWχ f *PYY \stackrel{f}{\to} W \stackrel{\chi_{f^\ast}}{\to} P Y

is the characteristic map of ker(f)Y×Yker(f) \hookrightarrow Y \times Y.


This follows immediately on examination of the composite pullback diagram

ker(f) Y Y f,1 Y Y×Y f×1 Y W×Y χ f *×1 Y PY×Y\array{ ker(f) & \to & Y & \to & \ni_Y \\ \downarrow & & \downarrow_\mathrlap{\langle f, 1_Y \rangle} & & \downarrow \\ Y \times Y & \underset{f \times 1_Y}{\to} & W \times Y & \underset{\chi_{f^\ast} \times 1_Y}{\to} & P Y \times Y }

If q:YQq: Y \to Q is epic, then χ q *:QPY\chi_{q^\ast}: Q \to P Y is monic.


Given f,ghom(A,Q)f, g \in \hom(A, Q), if we suppose that χ q *f=χ q *g\chi_{q^\ast} \circ f = \chi_{q^\ast} \circ g, then pulling back the subobject q,1 Y:YQ×Y\langle q, 1_Y \rangle: Y \to Q \times Y along either ff or qq results in the same subobject of A×YA \times Y, say

P p Y r,p q,1 Y A×Y f×1 Y Q×Y\array{ P & \stackrel{p}{\to} & Y \\ _\mathllap{\langle r, p \rangle} \downarrow & & \downarrow_\mathrlap{\langle q, 1_Y \rangle} \\ A \times Y & \underset{f \times 1_Y}{\to} & Q \times Y }

where the map r:PAr: P \to A is the pullback of q:YQq: Y \to Q along ff. Since qq is epic, so is rr (this uses the fact that pulling back is a left adjoint; cf. local cartesian closure). But now we see fr=qp=grf \circ r = q \circ p = g \circ r, which implies f=gf = g since rr is epic.


If p 1,p 2:EY×Y\langle p_1, p_2 \rangle: E \to Y \times Y is an equivalence relation, then the epimorphism q:Yim(χ E)q: Y \to im(\chi_E) in the image factorization of χ E:YPY\chi_E: Y \to P Y is the coequalizer of the pair (p 1,p 2)(p_1, p_2).


Put Q=im(χ E)Q = im(\chi_E), i.e., let YqQiP(Y)Y \stackrel{q}{\to} Q \stackrel{i}{\rightarrowtail} P(Y) be the unique epi-mono factorization of χ E\chi_E. Suppose h:YZh: Y \to Z satisfies hp 1=hp 2h p_1 = h p_2. We have E=ker(χ E)E = \ker(\chi_E) by Theorem (effectiveness of equivalence relations), and ker(χ E)=ker(q)\ker(\chi_E) = \ker(q) by Proposition . Therefore, for generalized elements y,y:UYy, y': U \to Y we have qy=qyq y = q y' iff y,y\langle y, y' \rangle factors through i:EY×Yi: E \hookrightarrow Y \times Y, which implies q,hy=q,hy\langle q, h \rangle y = \langle q, h \rangle y'. On the other hand, q,hy=q,hy\langle q, h \rangle y = \langle q, h \rangle y' trivially implies qy=qyq y = q y'. Thus E=ker(χ E)=ker(q)=ker(q,h)E = \ker(\chi_E) = \ker(q) = \ker(\langle q, h\rangle).

So, if f=q,h:YQ×Zf = \langle q, h \rangle: Y \to Q \times Z has epi-mono factorization YqQiQ×ZY \stackrel{q'}{\twoheadrightarrow} Q' \stackrel{i'}{\rightarrowtail} Q \times Z, then ker(f)=ker(q)ker(f) = \ker(q') (Proposition again). Now χ (q) *:QP(Y)\chi_{(q')^\ast}: Q' \to P(Y) is monic by Lemma , and χ E=χ ker(f)=χ ker(q)\chi_E = \chi_{\ker(f)} = \chi_{\ker(q')} has epi-mono factorization YqQχ (q) *PYY \stackrel{q'}{\twoheadrightarrow} Q' \stackrel{\chi_{(q')^\ast}}{\rightarrowtail} P Y by Lemma . By uniqueness of epi-mono factorizations, we see now that QQ equals QQ' as quotients of YY, and the map i:QQ×Zi': Q \to Q \times Z must be of the form 1 Q,r:QQ×Z\langle 1_Q, r \rangle: Q \to Q \times Z. The map r:QZr: Q \to Z is then the required mediator through which hh factors, i.e., h=rqh = r \circ q. There is of course just one such map because qq is epic.


A topos is a regular category. (It is therefore also Barr-exact, since by Theorem all equivalence relations are effective.)


It is more than enough to observe that a topos is finitely complete and has coequalizers, and pullback functors preserve coequalizers (by local cartesian closure). A coequalizer of a pair of maps f,g:XYf, g: X \rightrightarrows Y is constructed as quot(Eq(imf,g))quot(Eq(im \langle f, g \rangle)), using Corollary and Proposition and Theorem .


A topos is a Heyting pretopos.


As we have seen, a topos is an extensive Barr-exact category whose subobject lattices carry Heyting algebra structures that are preserved by pullback functors.

It follows from the general theory of pretoposes that all epimorphisms in a topos are regular. However, we can also prove this easily from the theory developed above.


Every epimorphism q:ABq: A \to B in a topos is a regular epimorphism, indeed an effective quotient.


Theorem together with Lemma and Lemma shows that qq is the coequalizer of its kernel pair ker(q):EA\ker(q): E \rightrightarrows A.



  1. Cf. the ridicule of so-called ‘categorical dys-foundations’ by various logicians at FOM (Foundations of Mathematics list, esp. circa 1997-1998; see e.g. here). This is a pity, since the developments surrounding topos theory have contributed genuine insights into the algebra of logic and set theory.

  2. Colimits in the basic examples of topos theory, such as Grothendieck toposes, are of course usually much easier to grasp.

  3. Developing the internal logic of a topos – which is not at all trivial! – provides an excellent way to appreciate what first-order logic is about from the standpoint of categorical algebra, being “characterized axiomatically by specific interlocking systems of adjoint functors” (Lawvere).

  4. None of this may be new to the real experts; in fact I think there’s a good chance that the development pursued here has occurred to them. Still, a published account of it is nonexistent, as far as I know.

  5. This is reminiscent of the “stack semantics” of a topos, pioneered by Mike Shulman.

  6. ‘Heuristic’ in the original sense of “discovery method” (faithful to its Greek etymology), i.e., a general method which guides the user to discover solutions.

Revised on September 23, 2019 at 12:02:30 by Todd Trimble