nLab Trimble on ETCS III

ETCS : Building joins and coproducts




The basis of it all

 Set theory

set theory

Foundational axioms

foundational axioms

Removing axioms

This is Part III of an exposition by Todd Trimble on ETCS.

See also

ETCS : Building joins and coproducts

After a long hiatus, I’d like to renew the discussion of axiomatic categorical set theory, more specifically the Elementary Theory of the Category of Sets (ETCS). Last time I blogged about this, I made some initial forays into “internalizing logic” in ETCS, and described in broad brushstrokes how to use that internal logic to derive a certain amount of the structure one associates with a category of sets. Today I’d like to begin applying some of the results obtained there to the problem of constructing colimits in a category satisfying the ETCS axioms (an ETCS category, for short).

(If you’re just joining us now, and you already know some of the jargon, an ETCS category is a well-pointed topos that satisfies the axiom of choice and with a natural numbers object. We are trying to build up some of the elementary theory of such categories from scratch, with a view toward foundations of mathematics.)

But let’s see — where were we? Since it’s been a while, I was tempted to review the philosophy behind this undertaking (why one would go to all the trouble of setting up a categories-based alternative to ZFC, when time-tested ZFC is able to express virtually all of present-day mathematics on the basis of a reasonably short list of axioms?). But in the interest of time and space, I’ll confine myself to a few remarks.

As we said, a chief difference between ZFC and ETCS resides in how ETCS treats the issue of membership. In ZFC, membership is a global binary relation: we can take any two “sets” A,BA, B and ask whether ABA \in B. Whereas in ETCS, membership is a relation between entities of different sorts: we have “sets” on one side and “elements” on another, and the two are not mixed ( e.g., elements are not themselves considered sets).

Further, and far more radical: in ETCS the membership relation xAx \in A is a function, that is, an element xx “belongs” to only one set AA at a time. We can think of this as “declaring” how we are thinking of an element, that is, declaring which set (or which type) an element is being considered as belonging to. (In the jargon, ETCS is a typed theory.) This reflects a general and useful philosophic principle: that elements in isolation are considered inessential, that what counts are the aggregates or contexts in which elements are organized and interrelated. For instance, the numeral “2” in isolation has no meaning; what counts is the context in which we think of it ( qua rational number or qua complex number, etc.). Similarly the set of real numbers has no real sense in isolation; what counts is which category we view it in.

I believe it is reasonable to grant this principle a foundational status, but: rigorous adherence to this principle completely changes the face of what set theory looks like. If elements “belong” only to one set at a time, how then do we even define such basic concepts as subsets and intersections? These are some of the issues we discussed last time.

There are other significant differences between ZFC and ETCS: stylistically, or in terms of presentation, ZFC is more “top-down” and ETCS is more “bottom-up”. For example, in ZFC, one can pretty much define a subset {xX:P}\{ x \in X : P \} by writing down a first-order formula PP in the language; the comprehension (or separation) axiom scheme is a mighty sledgehammer that takes care of the rest. In the axioms of ETCS, there is no such sledgehammer: the closest thing one has to a comprehension scheme in the ETCS axioms is the power set axiom (a single axiom, not an axiom scheme). However, in the formal development of ETCS, one derives a comprehension scheme as one manually constructs the internal logic, in stages, using the simple tools of adjunctions and universal properties. We started doing some of that in our last post. So: with ZFC it’s more as if you can just hop in the car and go; with ETCS you build the car engine from smaller parts with your bare hands, but in the process you become an expert mechanic, and are not so rigidly attached to a particular make and model ( e.g., much of the theory is built just on the axioms of a topos, which allows a lot more semantic leeway than one has with ZF).

But, in all fairness, that is perhaps the biggest obstacle to learning ETCS: at the outset, the tools available [mainly, the idea of a universal property] are quite simple but parsimonious, and one has to learn how to build some set-theoretic and logical concepts normally taken as “obvious” from the ground up. (Talk about “foundations”!) On the plus side, by building big logical machines from scratch, one gains a great deal of insight into the inner workings of logic, with a corresponding gain in precision and control and modularity when one would like to use these developments to design, say, automated deduction systems (where there tend to be strong advantages to using type-theoretic frameworks).

Enough philosophy for now; readers may refer to my earlier posts for more. Let’s get to work, shall we? Our last post was about the structure of (and relationships between) posets of subobjects Sub(X)Sub(X) relative to objects XX, and now we want to exploit the results there to build some absolute constructions, in particular finite coproducts and coequalizers. In this post we will focus on coproducts.

Note to the experts. Most textbook treatments of the formal development of topos theory (as for example Mac Lane–Moerdijk) are efficient but highly technical, involving for instance the slice theorem for toposes and, in the construction of colimits, recourse to Beck’s theorem in monad theory applied to the double power-set monad [following the elegant construction of Par]. The very abstract nature of this style of argumentation (which in the application of Beck’s theorem expresses ideas of fourth-order set theory and higher) is no doubt partly responsible for the somewhat fearsome reputation of topos theory.

In these notes I take a much less efficient but much more elementary approach, based on an arrangement of ideas which I hope can be seen as “natural” from the point of view of naive set theory. I learned of this approach from Myles Tierney, who was my PhD supervisor, and who with Bill Lawvere co-founded elementary topos theory, but I am not aware of any place where the details of this approach have been written up before now. I should also mention that the approach taken here is not as “purist” as many topos theorists might want; for example, here and there I take advantage of the strong extensionality axiom of ETCS to simplify some arguments.

The Empty Set and Two-Valued Logic

We begin with the easy observation that a terminal category, i.e., a category 1\mathbf{1} with just one object and one morphism (the identity), satisfies all the ETCS axioms. Ditto for any category CC equivalent to 1\mathbf{1} (where every object is terminal). Such boring ETCS categories are called degenerate; obviously our interest is in the structure of nondegenerate ETCS categories.

Let E\mathbf{E} be an ETCS category (see here for the ETCS axioms). Objects of E\mathbf{E} are generally called “sets”, and morphisms are generally called “functions” or “maps”.

Proposition 0. If an ETCS category E\mathbf{E} is a preorder, then E\mathbf{E} is degenerate.

Proof. Recall that a preorder is a category in which there is at most one morphism ABA \to B for any two objects A,BA, B. Every morphism in a preorder is vacuously monic. If there is a nonterminal set AA, then the monic A1A \to 1 to any terminal set defines a subset A1A \subseteq 1 distinct from the subset defined by 111 \to 1, thus giving (in an ETCS category) distinct classifying maps χ A,t:1P(1)\chi_A, t : 1 \to P(1), contradicting the preorder assumption. Therefore all objects AA are terminal. \Box

Assume from now on that E\mathbf{E} is a nondegenerate ETCS category.

Proposition 1. There are at least two truth values, i.e., two elements 1P(1)1 \to P(1), in E\mathbf{E}.

Proof. By proposition 0, there exist sets X,YX, Y and two distinct functions f,g:XYf, g : X \to Y. By the axiom of strong extensionality, there exists x:1Xx : 1 \to X such that fxgxf x \neq g x. The equalizer E1E \to 1 of the pair fx,gx:1Yf x, g x : 1 \to Y is then a proper subset of 11, and therefore there are at least two distinct elements χ E,χ 1:1P(1)\chi_E, \chi_1 : 1 \to P(1). \Box

Proposition 2. There are at most two truth values 1P(1)1 \to P(1); equivalently, there are at most two subsets of 11.

Proof. If U,V1U, V \subseteq 1 are distinct subsets of 11, then either UUVU \neq U \cap V or VUVV \neq U \cap V, say the former. Then 1 U:UU1_U : U \subseteq U and UVUU \cap V \subseteq U are distinct subsets, with distinct classifying maps χ 1 U,χ UV:UP(1)\chi_{1_U}, \chi_{U \cap V} : U \to P(1). By strong extensionality, there exists x:1Ux : 1 \to U distinguishing these classifying maps. Because 11 is terminal, we then infer 1U1 \subseteq U and U1U \subseteq 1, so U=1U = 1 as subsets of 11, and in that case only VV can be a proper subset of 11. \Box

By Propositions 1 and 2, there is a unique proper subset of the terminal object 11. Let 010 \to 1 denote this subset. Its domain may be called an “empty set”; by the preceding proposition, it has no proper subsets. The classifying map 1P11 \to P1 of 010 \subseteq 1 is the truth value we call “false”.

Proposition 3. 0 is an initial object, i.e., for any XX there exists a unique function 0X0 \to X.

Proof. Uniqueness: if f,g:0Xf, g : 0 \to X are maps, then their equalizer x:E0x : E \to 0, which is monic, must be an isomorphism since 0 has no proper subsets. Therefore f=gf = g. Existence: there are monos

01t XPX\displaystyle 0 \to 1 \stackrel{t_X}{\to} P X
XσPX\displaystyle X \stackrel{\sigma}{\to} P X

where t Xt_X is “global truth” (classifying the subset XXX \subseteq X) on XX and σ\sigma is the “singleton mapping x{x}x \mapsto \{ x \}” on XX, defined as the classifying map of the diagonal map δ:XX×X\delta : X \subseteq X \times X (last time we saw σ\sigma is monic). Take their pullback. The component of the pullback parallel to σ\sigma is a mono P0P \to 0 which again is an isomorphism, whence we get a map 0PX0 \cong P \to X using the other component of the pullback. \Box

Remark. For the “purists”, an alternative construction of the initial set 0 that avoids use of the strong extensionality axiom is to define the subset 010 \subseteq 1 to be “the intersection all subsets of 11”. Formally, one takes the extension [ϕ]1\left[\phi\right] \subseteq 1 of the map

ϕ=(1t P1PP1P1);\displaystyle \phi = (1 \stackrel{t_{P1}}{\to} PP1 \stackrel{\bigcap}{\to} P1);

where the first arrow represents the class of all subsets of P1P1, and the second is the internal intersection operator defined at the end of our last post. Using formal properties of intersection developed later, this intersection 010 \subseteq 1 has no proper subsets, and then the proof of proposition 3 carries over verbatim. \Box

Corollary 1. For any XX, the set 0×X0 \times X is initial.

Proof. By cartesian closure, maps 0×XY0 \times X \to Y are in bijection with maps of the form 0Y X0 \to Y^X, and there is exactly one of these since 0 is initial. \Box

Corollary 2. If there exists f:X0f : X \to 0, then XX is initial.

Proof. The composite of f,1 X:X0×X\langle f, 1_X \rangle : X \to 0 \times X followed by π 2:0×XX\pi_2 : 0 \times X \to X is 1 X1_X, and π 2\pi_2 followed by f,1 X:X0×X\langle f, 1_X \rangle : X \to 0 \times X is also an identity since 0×X0 \times X is initial by Corollary 1. Hence XX is isomorphic to an initial object 0×X0 \times X. \Box

By Corollary 2, for any object YY the arrow 0Y0 \to Y is vacuously monic, hence defines a subset.

Proposition 4. If X¬0X \not\cong 0, then there exists an element x:1Xx : 1 \to X.

Proof. Under the assumption, XX has at least two distinct subsets: 0X0 \subseteq X and 1 X:XX1_X : X \subseteq X. By strong extensionality, their classifying maps χ 0,χ 1:XP1\chi_0, \chi_1 : X \to P1 are distinguished by some element x:1Xx : 1 \to X.

External Unions and Internal Joins

One of the major goals in this post is to construct finite coproducts in an ETCS category. As in ordinary set theory, we will construct these as disjoint unions. This means we need to discuss unions first; as should be expected by now, in ETCS unions are considered locally, i.e., we take unions of subsets of a given set. So, let A,BXA, B \subseteq X be subsets.

To define the union ABXA \cup B \subseteq X, the idea is to take the intersection of all subsets containing AA and BB. That is, we apply the internal intersection operator (constructed last time ),

:PPXPX,\displaystyle \bigcap : P P X \to P X,
to the element 1PPX1 \to P P X that represents the set of all subsets of XX containing AA and BB; the resulting element 1PX1 \to P X represents ABA \cup B. The element 1PPX1 \to P P X corresponds to the intersection of two subsets {CPX:AC}{CPX:BC}PX\{ C \in P X : A \subseteq C \} \:\cap\: \{ C \in P X : B \subseteq C \} \subseteq P X.

Remark. Remember that in ETCS we are using generalized elements: CPXC \in P X really means a function C:UPXC : U \to P X over some domain UU, which in turn classifies a subset [C]U×X\left[C\right] \subseteq U \times X. On the other hand, the AA here is a subset AXA \subseteq X. How then do we interpret the condition “ACA \subseteq C”? We first pull back χ A:1PX\chi_A : 1 \to P X over to the domain UU (that is, we form the composite U!1χ APX\displaystyle U \stackrel{!}{\to} 1 \stackrel{\chi_A}{\to} P X) and consider the condition that this is bounded above by C:UPXC : U \to P X. (We will write χ AC\chi_A \leq C, thinking of the left side as constant over UU.) Externally, in terms of subsets, this corresponds to the condition U×A[C]U \times A \subseteq \left[C\right].

We need to construct the subsets {CPX:AC},{CPX:BC}\{ C \in P X : A \subseteq C \}, \{ C \in P X : B \subseteq C \}. In ZFC, we could construct those subsets by applying the comprehension axiom scheme, but the axioms of ETCS have no such blanket axiom scheme. (In fact, as we said earlier, much of the work on “internalizing logic” goes to show that in ETCS, we instead derive a comprehension scheme!) However, one way of defining subsets in ETCS is by taking loci of equations; here, we express the condition ACA \subseteq C, more pedantically A[C]A \subseteq \left[C\right] or χ AC\chi_A \leq C, as the equation

(χ AC)=t X(\chi_A \Rightarrow C) = t_X

where the right side is the predicate “true over XX”.

Thus we construct the subset {CPX:AC}\{C \in P X: A \subseteq C\} of PXP X via the pullback:

{C:AC} 1 t X PX χ A PX \begin{matrix} \{ C : A \le C \} & \longrightarrow & 1 \\ \downarrow & & \downarrow \mathrlap{\scriptsize{t_X}} \\ P X & \underset{\chi_A \Rightarrow -}{\longrightarrow} & P X \end{matrix}

Let me take a moment to examine what this diagram means exactly. Last time we constructed an internal implication operator

:P1×P1P1\Rightarrow : P1 \times P1 \to P1

and now, in the pullback diagram above, what we are implicitly doing is lifting this to an operator

X:PX×PXPX\Rightarrow_X : P X \times PX \to P X

The easy and cheap way of doing this is to remember the isomorphism PXP1 XP X \cong P1^X we used last time to uncover the cartesian closed structure, and apply this to

P1 X×P1 X(P1×P1) X XP1 X\displaystyle P1^X \times P1^X \cong (P1 \times P1)^X \stackrel{\Rightarrow^X}{\to} P1^X

to define X:PX×PXPX\Rightarrow_X : P X \times P X \to P X. This map classifies a certain subset of X×PX×PXX \times P X \times P X, which I’ll just write down (leaving it as an exercise which involves just chasing the relevant definitions):

{(x,T,S)X×PX×PX:x XTx XS}\{ (x, T, S) \in X \times P X \times P X : x \in_X T \Rightarrow x \in_X S \}

Remark. Similarly we can define a meet operator X:PX×PXPX\wedge_X : P X \times P X \to P X by exponentiating the internal meet P1×P1P1P1 \times P 1 \to P 1. It is important to know that the general Heyting algebra identities which we established last time for P1P1 lift to the corresponding identities for the operators X, X\wedge_X, \Rightarrow_X on PXP X. Ultimately this rests on the fact that the functor () X(-)^X, being a right adjoint, preserves products, and therefore preserves any algebraic identity which can be expressed as a commutative diagram of operations between such products.

Hence, for the fixed subset AXA \subseteq X (classified by χ A:1PX\chi_A : 1 \to P X), the operator

χ A:PXPX\chi_A \Rightarrow - : P X \to P X

classifies the subset

{(x,S):x XAx XS}X×PX\{ (x, S) : x \in_X A \Rightarrow x \in_X S \} \hookrightarrow X \times P X

Finally, in the pullback diagram above, we are pulling back the operator χ A\chi_A \Rightarrow - against t Xt_X. But, from last time, that was exactly the method we used to construct universal quantification. That is, given a subset

RX×YR \subseteq X \times Y

we defined YRX\forall_Y R \subseteq X to be the pullback of t Y:1PYt_Y : 1 \hookrightarrow PY along χ R:XPY\chi_R : X \to PY.

Putting all this together, the pullback diagram above expresses the definition

{CPX:AC}:={CPX: xXx XAx XC}\{ C \in P X : A \subseteq C \} := \{ C \in P X : \forall_{x \in X} \: x \in_X A \Rightarrow x \in_X C \}

that one would expect “naively”.

Now that all the relevant constructions are in place, we show that ABA \cup B is the join of AA and BB in the poset Sub(X)Sub(X). There is nothing intrinsically difficult about this, but as we are still in the midst of constructing the internal logic, we will have to hunker down and prove some logic things normally taken for granted or zipped through without much thought. For example, the internal intersection operator was defined with the help of internal universal quantification, and we will need to establish some formal properties of that.

Here is a useful general principle for doing internal logic calculations. Let χ R:XPY\chi_R : X \to PY be the classifying map of a subset RX×YR \subseteq X \times Y, and let f:UXf : U \to X be a function. Then the composite χ Rf:UPY\chi_R f : U \to PY classifies the subset

(f×1 Y) 1(R)U×Y(f \times 1_Y)^{-1}(R) \subseteq U \times Y

so that one has the general identity χ Rf=χ (f×1) 1(R)\chi_R f = \chi_{(f \times 1)^{-1}(R)}. In passing back and forth between the external and internal viewpoints, the general principle is to try to render “complicated” functions UPYU \to PY into a form χ Rf\chi_R f which one can more easily recognize. For lack of a better term, I’ll call this the “pullback principle”.

Lemma 1. Given a relation RX×YR \hookrightarrow X \times Y and a constant c:1Yc : 1 \to Y, there is an inclusion

YR(1 X×c) 1(R)\forall_Y R \subseteq (1_X \times c)^{-1}(R)

as subsets of XX. (In traditional logical syntax, this says that for any element c:1Yc : 1 \to Y,

y:YR(x,y)impliesR(x,c)\forall_{y : Y} R(x, y) \quad implies \quad R(x, c)

as predicates over elements xXx \in X. This is the type of thing that ordinarily “goes without saying”, but which we actually have to prove here!)

Proof. As we recalled above, YRX\forall_Y R \subseteq X was defined to be χ R 1(t Y)\chi_R^{-1}(t_Y), the pullback of global truth t Y:1PYt_Y : 1 \to PY along the classifying map χ R:XPY\chi_R : X \to PY. Hold that thought.


Pc:PYP1Pc : PY \to P1

be the map which classifies the subset {SPY:c YS}\{ S \in PY : c \in_Y S\}. Equivalently, this is the map

P1 c:P1 YP1 1P1^c: P1^Y \to P1^1

under the canonical isomorphisms PYP1 YPY \cong P1^Y, P1 1P1P1^1 \cong P1. Intuitively, this maps ff(c)f \mapsto f(c), i.e., plugs an element cYc \in Y into an element fP1 Yf \in P1^Y.

Using the adjunction (×Y)() Y(- \times Y) \dashv (-)^Y of cartesian closure, the composite

Xχ RPYPcP1\displaystyle X \stackrel{\chi_R}{\to} PY \stackrel{Pc}{\to} P1

transforms to the composite

X×11 X×cX×Yχ RP1X \times 1 \stackrel{1_X \times c}{\to} X \times Y \stackrel{\chi_{R}}{\to} P1

so by the pullback principle, (Pc)χ R:X×1P1(Pc)\chi_R : X \times 1 \to P1 classifies (1 X×c) 1(R)X×1X(1_X \times c)^{-1}(R) \subseteq X \times 1 \cong X.


(1 X×c) 1(R)=χ R 1(Pc) 1(t)(1)(1_X \times c)^{-1}(R) = \chi_{R}^{-1} (Pc)^{-1}(t) \qquad (1)

Also, as subsets of PYPY, we have the inclusion

(t Y:1PY)(Pc) 1(t)(2)(t_Y : 1 \hookrightarrow PY) \subseteq (Pc)^{-1}(t) \qquad (2)

[this just says that belongs to the subset classified by , or equivalently that is in the subset ]. Applying the pullback operation χ R 1\chi_{R}^{-1} to (2), and comparing to (1), Lemma 1 follows. \Box

Lemma 2. If RSR \subseteq S as subsets of X×YX \times Y, then YR YS\forall_Y R \subseteq \forall_Y S.

Proof. From the last post, we have an adjunction:

T×YSifandonlyifT YST \times Y \subseteq S \quad if and only if \quad T \subseteq \forall_Y S

for any subset of TXT \subseteq X. So it suffices to show YR×YS\forall_Y R \times Y \subseteq S. But

YR×YRS\forall_Y R \times Y \subseteq R \subseteq S

where the first inclusion follows from YR YR\forall_Y R \subseteq \forall_Y R. \Box

Next, recall from the last post that the internal intersection of F:1PPXF : 1 \to P P X was defined by interpreting the following formula on the right:

F= SPX(S PXF)(x XS)\displaystyle \bigcap F = \forall_{S \in P X} (S \in_{P X} F) \Rightarrow (x \in_X S)

Lemma 3. If FG:1PPXF \leq G : 1 \to P P X, then GF:1PX\displaystyle \bigcap G \leq \bigcap F : 1 \to P X.

Proof. F:1PPXF : 1 \to P P X classifies the subset {SPX:S PXF}\{ S \in P X : S \in_{P X} F \}, i.e., FF is identified with the predicate S PXFS \in_{P X} F in the argument SS, so by hypothesis (S PXF)(S PXG)(S \in_{P X} F) \leq (S \in_{P X} G) as predicates on SS. Internal implication aba \Rightarrow b is contravariant in the argument aa [see the following remark], so

[(SG)(xS)][(SF)(xS)]\left[(S \in G) \Rightarrow (x \in S)\right] \leq \left[(S \in F) \Rightarrow (x \in S)\right]
Now apply Lemma 2 to complete the proof. \Box

Remark. The contravariance of b- \Rightarrow b, that is, the fact that xyx \leq y implies (yb)(xb)(y \Rightarrow b) \leq (x \Rightarrow b), is a routine exercise using the adjunction [discussed last time] acba \wedge c \leq b if and only if c(ab).c \leq (a \Rightarrow b).

Indeed, we have

x(yb)y(yb)b(*)x \wedge (y \Rightarrow b) \leq y \wedge (y \Rightarrow b) \leq b \qquad (*)

where the first inequality follows from the hypothesis xyx \leq y, and the second follows from ybyby \Rightarrow b \leq y \Rightarrow b. By the adjunction, the inequality (*)(*) implies (yb)(xb)(y \Rightarrow b) \leq (x \Rightarrow b). \Box

Theorem 1. For subsets A,BA, B of XX, the subset ABA \cup B is an upper bound of AA and BB, i.e., A,BABA, B \subseteq A \cup B.

Proof. It suffices to prove that A={CPX:AC}\displaystyle A = \bigcap \{C \in P X : A \subseteq C\}, since then we need only apply Lemma 3 to the trivially true inclusion

{CPX:AC}{CPX:BC}{CPX:AC}\{ C \in P X : A \subseteq C \} \:\cap\: \{ C \in P X : B \subseteq C \} \subseteq \{ C \in P X : A \subseteq C \}

to infer AABA \subseteq A \cup B, and similarly BABB \subseteq A \cup B. (Actually, we need only show A{CPX:AC}\displaystyle A \subseteq \bigcap \{ C \in P X : A \subseteq C \}. We’ll do that first, and then show full equality.)

The condition we want,

A{xX: SPX(AS)(x XS)},A \subseteq \{ x \in X : \forall_{S \in P X} (A \subseteq S) \Rightarrow (x \in_X S) \},

is, by the adjunction (×PX) PX:Sub(X×PX)Sub(X)(- \times P X) \dashv \forall_{P X} : Sub(X \times P X) \to Sub(X), equivalent to

A×PX{(x,S):AS(x XS)}A \times P X \subseteq \{ (x, S) : A \subseteq S \Rightarrow (x \in_X S) \}

which, by a \wedge-\Rightarrow adjunction, is equivalent to

(A×PX)(X×{SPX:AS}) X(1)(A \times P X) \cap (X \times \{ S \in P X : A \subseteq S \}) \subseteq \: \in_X \qquad (1)

as subsets of X×PXX \times P X. So we just have to prove (1). At this point we recall, from our earlier analysis, that

{SPX:AS}={SPX: xXx XAx XS}\{ S \in P X : A \subseteq S \} = \{ S \in P X : \forall_{x \in X} x \in_X A \Rightarrow x \in_X S \}

Using the adjunction (X×) X(X \times -) \dashv \forall_X, as in the proof of Lemma 2, we have

X×{SPX: xXx XAx XS}X \times \{ S \in P X : \forall_{x \in X} x \in_X A \Rightarrow x \in_X S \}
{(x,S):x XAx XS}:=(A×PX) X,\subseteq \{ (x, S) : x \in_X A \Rightarrow x \in_X S \} := (A \times P X) \Rightarrow \in_X,

which shows that the left side of (1) is contained in

(A×PX)((A×PX) X) X,(A \times P X) \cap ((A \times P X) \Rightarrow \in_X) \subseteq \: \in_X,

where the last inclusion uses another \wedge-\Rightarrow adjunction. Thus we have established (1) and therefore also the inclusion

A{SPX:AS}\displaystyle A \subseteq \bigcap \{S \in P X : A \subseteq S\}

Now we prove the opposite inclusion

{SPX:AS}A,\displaystyle \bigcap \{ S \in P X : A \subseteq S \} \subseteq A,

that is to say

{xX: SPXASx XS}A(**)\{ x \in X : \forall_{S \in P X} A \subseteq S \Rightarrow x \in_X S \} \subseteq A \qquad (**)

Here we just use Lemma 1, applied to the particular element χ A:1PX\chi_A : 1 \to P X: we see that the left side of (**)(**) is contained in

{xX:A[χ A]x XA}\{ x \in X : A \subseteq \left[\chi_A\right] \Rightarrow x \in_X A\}

which collapses to {xX:x XA}=A\{ x \in X : x \in_X A \} = A, since A=[χ A]A = \left[\chi_A\right]. This completes the proof. \Box

Theorem 2. ABA \cup B is the least upper bound of A,BA, B, i.e., if CXC \subseteq X is a subset containing both AXA \subseteq X and BXB \subseteq X, then ABCA \cup B \subseteq C.

Proof. We are required to show that

{xX: SPX(ASBS)x XS}C.\{ x \in X : \forall_{S \in P X} \: (A \subseteq S \wedge B \subseteq S) \Rightarrow x \in_X S \} \subseteq C.

Again, we just apply Lemma 1 to the particular element χ C:1PX\chi_C : 1 \to P X: the left-hand side of the claimed inclusion is contained in

{xX:(ACBC)x XC}\{ x \in X : (A \subseteq C \wedge B \subseteq C) \Rightarrow x \in_X C \}

but since (ACBC)(A \subseteq C \wedge B \subseteq C) is true by hypothesis (is globally true as a predicate on the implicit variable xXx \in X), this last subset collapses to

{xX:t Xx XC}={xX:x XC}=C\{ x \in X : t_X \Rightarrow x \in_X C \} = \{ x \in X : x \in_X C \} = C

which completes the proof. \Box

Theorems 1 and 2 show that for any set XX, the external poset Sub(X)Sub(X) admits joins. One may go on to show (just on the basis of the topos axioms) that as in the case of meets, the global external operation of taking joins is natural in XX, so that by the Yoneda principle, it is classified by an internal join operation

:P1×P1P1,\vee : P1 \times P1 \to P1,

namely, the map which classifies the union of the subsets

[π 1]=P1×11×tP1×P1\displaystyle \left[\pi_1\right] = P1 \times 1 \stackrel{1 \times t}{\hookrightarrow} P1 \times P1
[π 2]=1×P1t×1P1×P1\displaystyle \left[\pi_2\right] = 1 \times P1 \stackrel{t \times 1}{\hookrightarrow} P1 \times P1

and this operation satisfies all the expected identities. In short, P1P 1 carries an internal Heyting algebra structure, as does PXP1 XP X \cong P 1^X for any set XX.

We will come back to this point later, when we show (as a consequence of strong extensionality) that P1P1 is actually an internal Boolean algebra.

Construction of Coproducts

Next, we construct coproducts just as we do in ordinary set theory: as disjoint unions. Letting X,YX, Y be sets (objects in an ETCS category), a disjoint union of XX and YY is a pair of monos

i:XZj:YZi : X \hookrightarrow Z \qquad j : Y \hookrightarrow Z

whose intersection is empty, and whose union or join in Sub(Z)Sub(Z) is all of ZZ. We will show that disjoint unions exist and are essentially unique, and that they satisfy the universal property for coproducts. We will use the notation X+YX + Y for a disjoint union.

Theorem 3. A disjoint union of XX and YY exists.


It’s enough to embed X,YX, Y disjointly into some set CC, since the union of the two monos in Sub(C)Sub(C) would then be the requisite ZZ. The idea now is that if a disjoint union or coproduct X+YX+Y exists, then there’s a canonical isomorphism P(X+Y)PX×PYP(X + Y) \cong P X \times P Y. Since the singleton map

σ:X+YP(X+Y)PX×PY\sigma : X + Y \to P(X + Y) \cong P X \times P Y

is monic, one thus expects to be able to embed XX and YY disjointly into PX×PYP X \times P Y. Since we can easily work out how all this goes in ordinary naive set theory, we just write out the formulas and hope it works out in ETCS.

In detail, define i X:XPX×PYi_X : X \to P X \times P Y to be

XX×1σ X×χ 0PX×PY\displaystyle X \cong X \times 1 \stackrel{\sigma_X \times \chi_0}{\to} P X \times P Y

where σ X\sigma_X is the singleton mapping and χ 0\chi_0 classifies 0Y0 \hookrightarrow Y; similarly, define i Y:YPX×PYi_Y :Y \to P X \times P Y to be

Y1×Yχ 0×σ YPX×PY.\displaystyle Y \cong 1 \times Y \stackrel{\chi_0 \times \sigma_Y}{\to} P X \times P Y.

Clearly i Xi_X and i Yi_Y are monic, so to show disjointness we just have to show that their pullback is empty. But their pullback is isomorphic to the cartesian product of the pullbacks of the diagrams

Xσ XPXχ 011χ 0PYσ YY\displaystyle X \stackrel{\sigma_X}{\to} P X \stackrel{\chi_0}{\leftarrow} 1 \qquad 1 \stackrel{\chi_0}{\to} P Y \stackrel{\sigma_Y}{\leftarrow} Y

so it would be enough to show that each (or just one) of these two pullbacks is empty, let’s say the first.

Suppose given a map h:AXh: A \to X which makes the square

A 1 h χ 0 X σ X PX\array{ \arrayopts{\rowalign{center axis center}} \A & \longrightarrow & 1 \\ \mathllap{\scriptsize{h}} \downarrow & & \downarrow \mathrlap{\scriptsize{\chi_0}} \\ X & \underset{\dlap{\sigma_X}}{\longrightarrow} & P X }

commute. Using the pullback principle, the map A1χ 0PXA \to 1 \stackrel{\chi_0}{\to} P X classifies

0×AX×A0 \times A \hookrightarrow X \times A

which is just the empty subset. This must be the same subset as classified by σ Xh=χ δh\sigma_X h = \chi_{\delta} h (where δ:XX×X\delta : X \hookrightarrow X \times X is the diagonal), which by the pullback principle is

(1 X×h) 1(δ)X×A.(1_X \times h)^{-1}(\delta) \hookrightarrow X \times A.

An elementary calculation shows this to be the equalizer of the pair of maps

π 1,hπ 2:X×AX.\pi_1, h\pi_2 : X \times A \stackrel{\to}{\to} X.

So this equalizer EE is empty. But notice that h,1:AX×A\langle h, 1 \rangle : A \to X \times A equalizes this pair of maps. Therefore we have a map AE0A \to E \cong 0. By Corollary 2 above, we infer A0A \cong 0. This applies to the case where AA is the pullback, so the pullback is empty, as was to be shown.

Theorem 4. Any two disjoint unions of X,YX, Y are canonically isomorphic.


Suppose i:XZY:ji : X \rightarrow Z \leftarrow Y : j is a disjoint union. Define a map

ϕ=ϕ 1,ϕ 2:ZPX×PY\phi = \langle \phi_1, \phi_2 \rangle : Z \to P X \times P Y

where ϕ 1:ZPX\phi_1 : Z \to P X classifies the subset 1 X,i:XX×Z\langle 1_X, i \rangle : X \to X \times Z, and ϕ 2:ZPY\phi_2 : Z \to P Y classifies the subset 1 Y,j:YY×Z\langle 1_Y, j \rangle : Y \to Y \times Z. Applying the pullback principle, the composite ϕ 1i:XPX\phi_1 i : X \to P X classifies

(1 X×i) 1(1 X,i)X×X(1_X \times i)^{-1}(\langle 1_X, i \rangle) \hookrightarrow X \times X

which is easily seen to be the diagonal on XX. Hence ϕ 1i=σ X\phi_1 i = \sigma_X. On the other hand, ϕ 1j:YPX\phi_1 j : Y \to P X classifies the subset

(1 X×j) 1(1 X,i)X×Y(1_X \times j)^{-1}(\langle 1_X, i \rangle) \hookrightarrow X \times Y

which is empty because ii and jj are disjoint embeddings, so ϕ 1j=χ 0:YPX\phi_1 j = \chi_0 : Y \to P X. Similar calculations yield

ϕ 2i=χ 0:XPYϕ 2j=σ Y:YPY.\phi_2 i = \chi_0 : X \to P Y \qquad \phi_2 j = \sigma_Y : Y \to P Y.

Putting all this together, we conclude that ϕi=i X:XPX×PY\phi i = i_X : X \to P X \times P Y and ϕj=i Y:YPX×PY\phi j = i_Y : Y \to P X \times P Y, where i Xi_X and i Yi_Y were defined in the proof of Theorem 3.

Next, we show that ϕ\phi is monic. If not, then by strong extensionality, there exist distinct elements c,d:1Zc, d : 1 \to Z for which ϕ(c)=ϕ(d)\phi(c) = \phi(d); therefore, ϕ 1c=ϕ 1d:1PX\phi_1 c = \phi_1 d : 1 \to P X and ϕ 2c=ϕ 2d:1PY\phi_2 c = \phi_2 d : 1 \to PY. By the pullback principle, these equations say (respectively)

c 1(i)=d 1(i)1c 1(j)=d 1(j)1. c^{-1}(i) = d^{-1}(i) \hookrightarrow 1 \qquad c^{-1}(j) = d^{-1}(j) \hookrightarrow 1.

If c 1(i)=1c^{-1}(i) = 1, then both c,d:1Zc, d : 1 \to Z factor through the mono i:XZi : X \to Z. However, since ϕi=i X\phi i = i_X is monic, this would imply that ϕ(c)ϕ(d)\phi (c) \neq \phi (d), contradiction. Therefore c 1(i)=ci=0c^{-1}(i) = c \cap i = 0. By similar reasoning, cj=0c \cap j = 0. Therefore

i¬cj¬ci \subseteq \neg c \qquad j \subseteq \neg c

where ¬=(0):Sub(Z)Sub(Z)\neg = (- \Rightarrow 0) : Sub(Z) \to Sub(Z) is the negation operator. But then ij¬ci \cup j \subseteq \neg c. And since 1:ZZ1 : Z \hookrightarrow Z is the union iji \cup j by assumption, ¬c\neg c must be the top element Sub(Z)\top \in Sub(Z), whence c:1Zc : 1 \to Z is the bottom element 0. This contradicts the assumption that the topos is nondegenerate. Thus we have shown that ϕ\phi must be monic.

The argument above shows that ϕ:ZPX×PY\phi : Z \hookrightarrow P X \times P Y is an upper bound of i X:XPX×PYi_X : X \to P X \times P Y and i Y:YPX×PYi_Y : Y \to P X \times P Y in Sub(PX×PY)Sub(P X \times P Y). It follows that the join X+YX + Y constructed in Theorem 3 is contained in ϕ:ZPX×PY\phi : Z \to P X \times P Y, and hence can be regarded as the join of XX and YY in Sub(Z)Sub(Z). But ZZ is their join in Sub(Z)Sub(Z) by assumption of being a disjoint union, so the containment X+YZX + Y \subseteq Z must be an equality. The proof is now complete.

Theorem 5. The inclusions i X:XX+Yi_X : X \to X + Y, i Y:YX+Yi_Y : Y \to X + Y exhibit X+YX + Y as the coproduct of XX and YY.


Let f:XBf : X \to B, g:YBg : Y \to B be given functions. Then we have monos

1 X,f:XX×B1 Y,g:YY×B(1)\langle 1_X, f \rangle : X \hookrightarrow X \times B \qquad \langle 1_Y, g \rangle : Y \hookrightarrow Y \times B \qquad (1)

Now the operation ×B:Sub(C)Sub(C×B)- \times B : Sub(C) \to Sub(C \times B) certainly preserves finite meets, and also preserves finite joins because it is left adjoint to B:Sub(C×B)Sub(C)\forall_B : Sub(C \times B) \to Sub(C). Therefore this operation preserves disjoint unions; we infer that the monos

X×Bi X×B(X+Y)×BY×Bi Y×B(X+Y)×B(2)X \times B \stackrel{i_X \times B}{\to} (X + Y) \times B \qquad Y \times B \stackrel{i_Y \times B}{\to} (X + Y) \times B \qquad (2)

exhibit (X+Y)×B(X + Y) \times B as a disjoint union of X×B,Y×BX \times B, Y \times B. Composing the monos of (1) and (2), we have disjoint embeddings of XX and YY in (X+Y)×B(X + Y) \times B. Using Theorem 4, X+YX + Y is isomorphic to the join of these embeddings; this means we have an inclusion

1,h:X+Y(X+Y)×B\langle 1, h \rangle : X + Y \hookrightarrow (X + Y) \times B

whose restriction to XX yields i X,f\langle i_X, f \rangle and whose restriction to YY yields i Y,g\langle i_Y, g \rangle. Hence h:X+YBh : X + Y \to B extends ff and gg. It is the unique extension, for if there were two extensions h,hh, h', then their equalizer would be an upper bound of X,YX, Y in Sub((X+Y)×B)Sub((X + Y) \times B), contradicting the fact that X+YX + Y is the least upper bound. This completes the proof.

I think that’s enough for one day. I will continue to explore the categorical structure and logic of ETCS next time.

Last revised on December 12, 2021 at 05:07:11. See the history of this page for a list of all contributions to it.