nLab
Cantor-Schroeder-Bernstein theorem

Contents

Statement

The Cantor–Schroeder–Bernstein theorem says that the usual order relation on cardinalities of sets is antisymmetric. In other words, define an order on sets by XYX \leq Y if there exists a monomorphism f:XYf\colon X \to Y. Then, if both XYX \leq Y and YXY \leq X, there exists an isomorphism of sets XYX \cong Y.

The result is really only interesting in the absence of the axiom of choice (ACAC). With ACAC, it is a trivial corollary of the well-ordering theorem. However, the theorem actually requires only excluded middle, although it does not hold in constructive mathematics.

Proof

We prove that the Cantor–Schroeder–Bernstein theorem holds in a Boolean topos. The theorem is not however intuitionistically valid, in that it fails in some toposes, such as the topos Set Set^{\bullet \to \bullet} (the arrow category of SetSet); see Example 2 below.

Throughout we use ordinary set-theoretic reasoning which can be translated into the formal theory of toposes. (This can be formalized via the Mitchell–Benabou language, for instance.)

First, let’s try a little pedagogy. Somehow functions h:XY,h 1:YXh: X \to Y, h^{-1}: Y \to X are to be cooked up from injections f:XYf: X \to Y and g:YXg: Y \to X, so we might guess hh is to be defined as ff at least part of the time, and h 1h^{-1} as gg another part of the time. An ideal situation would be to have a set-up

f:ABf: A \stackrel{\sim}{\longrightarrow} B
\,
CD:gC \stackrel{\sim}{\longleftarrow} D: g

where A,CA, C are complementary subsets in XX and B,DB, D are complementary subsets in YY; then hh could be defined as ff on AA and as g 1g^{-1} on CC, and everything works out fine. How can we achieve this?

If we have this situation, then apparently B=f(A)B = f(A) (the direct image of AA), and D=¬B=¬f(A)D = \neg B = \neg f(A) (the complement of f(A)f(A)), and then C=g(D)=g(¬f(A))C = g(D) = g(\neg f(A)), and finally A,CA, C are required to be complementary, so we would need

A=¬g(¬f(A)).A = \neg g(\neg f(A)).

In other words, AA would be a fixed point of a suitable operation built from direct image and complementation operators. In fact, if we find such a fixed point AA, then the plan above would work without a hitch.

Perhaps the simplest fixed-point theorem for this purpose on the market is

Lemma (Knaster–Tarski fixed-point theorem)

Let ϕ:PXPX\phi\colon P X \to P X be an order-preserving map. Then there exists AA in PXP X for which ϕ(A)=A\phi(A) = A.

Proof

Let AA be the (internal) intersection of U={TPX:ϕ(T)T}U = \{T \in P X : \phi(T) \leq T\}. Since ATA \leq T for every TT in UU, we have ϕ(A)ϕ(T)T\phi(A) \leq \phi(T) \leq T for every TT in UU. Hence ϕ(A)A\phi(A) \leq A by definition of AA. Applying ϕ\phi again, we get ϕϕ(A)ϕ(A)\phi \phi(A) \leq \phi(A). Hence ϕ(A)\phi(A) belongs to UU. But then Aϕ(A)A \leq \phi(A) by definition of AA.

Remark

The preceding proof is valid in any topos (and so holds for SetSet even intuitionistically). It can be seen as a specialization to posets of a result of Lambek on the initial algebra of an endofunctor, saying that the structure maps of such initial algebras are necessarily isomorphisms. Here the initial algebra AA is (by construction) an initial fixed point.

Proof of Cantor–Schroeder–Bernstein

Suppose given two monos f:XYf: X \to Y, g:YXg: Y \to X. Let f:PXPY\exists_f: P X \to P Y denote direct image or existential quantification along ff, and let ¬ X:PXPX\neg_X: P X \to P X denote negation. Then the composite

ϕ=¬ X g¬ Y f:PXPX\phi = \neg_X \exists_g \neg_Y \exists_f: P X \to P X

is order-preserving, and so has a fixed point AA by the Knaster-Tarski lemma. Now define h:XYh: X \to Y by the rule

h(x) = f(x) if xA h(x) = g 1(x) if xA\array{ h(x) & = & f(x) & \text{ if }\; x \in A \\ h(x) & = & g^{-1}(x) & \text{ if }\; x \notin A }

(the multi-line definition is where we use the Boolean condition). The second line makes sense because ¬A\neg A is in the image of gg. The inverse of hh is

j(y) = f 1(y) ify f(A) j(y) = g(y) ify f(A)\array{ j(y) & = & f^{-1}(y) & \; \text{if} \; y \in \exists_f(A) \\ j(y) & = & g(y) & \; \text{if} \; y \notin \exists_f(A) }

That jj is inverse to hh uses the fact that ¬S= g¬ f(S)\neg S = \exists_g \neg \exists_f(S). The rest is obvious.

This classic proof is substantially the proof given in Johnstone’s Elephant, D4.1.11. The Boolean condition is not strictly speaking necessary, i.e., the principle of excluded middle (EMEM) does not logically follow from the Cantor–Schroeder–Bernstein statement since, for example, the latter holds vacuously (every mono is an iso) in the non-Boolean topos

FinSet CFinSet^C

where CC is any nontrivial finite category. But EMEM is certainly the most natural supposition to make.

Alternative construction of a fixed point

In some schools of thought, the proof using the Knaster-Tarski lemma would be criticized because that lemma makes use of an impredicative construction. However, the application made of it in the proof of the CSB theorem is only to ensure that the operator ¬ X g¬ Y f:PXPX\neg_X \exists_g \neg_Y \exists_f: P X \to P X has a fixed point. This objection can be countered by shopping around for a different fixed-point theorem, one which is predicatively and constructively valid.

A time-honored way of constructing a fixed point of an operator ϕ\phi is by taking a limit of a sequence of iterates of ϕ\phi that converges, provided that ϕ\phi preserves the limit. To this end, we find that specializing Adámek’s theorem (see initial algebra of an endofunctor) suits our purposes perfectly.

Lemma

If g:YXg: Y \to X is monic, then the operator f:PYPX\exists_f: P Y \to P X preserves limits of inverse chains ω opPY\omega^{op} \to P Y (i.e. intersections of decreasing sequences).

Proof

More generally, g\exists_g preserves connected limits, because it lifts through the inclusion i:PX g(1)PXi: P X \downarrow \exists_g(1) \hookrightarrow P X to an isomorphism PYPX g(1)P Y \stackrel{\sim}{\to} P X \downarrow \exists_g(1) (here 11 denotes the top element of PYP Y, aka YY), and ii preserves connected limits.

In more detail: by Frobenius reciprocity, we have gTS= g(Tg *S)\exists_g T \wedge S = \exists_g(T \wedge g^\ast S) for elements SS of PXP X and TT of PYP Y. Putting T=1T = 1, we get g1S= gg *S\exists_g 1 \wedge S = \exists_g g^\ast S, and so the composite

PX g1g *PY gPY g1P X \downarrow \exists_g 1 \stackrel{g^\ast}{\to} P Y \stackrel{\exists_g}{\to} P Y \downarrow \exists_g 1

is the identity. But since gg is monic, g * g:PYPYg^\ast \exists_g: P Y \to P Y is also the identity, which completes the proof.

Corollary

g=¬ X g¬ Y:PYPX\forall_g = \neg_X \exists_g \neg_Y: P Y \to P X preserves colimits of ω\omega-chains.

Naturally the left adjoint f\exists_f also preserves such colimits. So by the corollary, the composite ¬ X g¬ Y f:PXPX\neg_X \exists_g \neg_Y \exists_f: P X \to P X preserves colimits of ω\omega-chains.

Putting now A 0=0A_0 = 0 (the bottom element of PXP X), A 1=¬ X g¬ Y f(0)A_1 = \neg_X \exists_g \neg_Y \exists_f(0), and generally

(1)A n=(¬ X g¬ Y f) n(0), A_n = (\neg_X \exists_g \neg_Y \exists_f)^n(0),

we have A nA n+1A_n \subseteq A_{n+1} (apply the monotone operator (¬ X g¬ Y f) n(\neg_X \exists_g \neg_Y \exists_f)^n to the inclusion A 0=0A 1A_0 = 0 \subseteq A_1), and so ¬ X g¬ Y f\neg_X \exists_g \neg_Y \exists_f preserves the union of the chain A 0A 1A_0 \subseteq A_1 \subseteq \ldots,

(2)A= n0(¬ X g¬ Y f) n(0), A = \bigcup_{n \geq 0} (\neg_X \exists_g \neg_Y \exists_f)^n(0),

which implies that AA is a fixed point of ¬ X g¬ Y f\neg_X \exists_g \neg_Y \exists_f, as desired. In fact this AA is the minimal fixed point, just as in the conclusion of the Knaster-Tarski lemma. (Cf. initial algebra of an endofunctor, especially Adámek’s theorem.)

Beta reduced proof

The preceding proofs are sometimes considered too abstract to easily visualize, but this is slightly misleading: the second proof, involving the construction of a minimal fixed point as a countable limit, can be “beta-reduced” to produce one of the standard “concrete” proofs.

In a nutshell, the minimal fixed point of the operator ¬ g¬ f:PXPX\neg \exists_g \neg \exists_f: P X \to P X can be expressed as an alternating series of iterated direct images:

(3)XgY+gfXgfgY+ X - g Y + g f X - g f g Y + \ldots

where - stands for set-theoretic difference \setminus and ++ stands for the union \cup. The meaning of the infinite series is that we have a increasing sequence of those finite alternating sums with an even number of terms, starting with the empty sum (which is 00, the empty set):

00
\,
XgY,X - g Y,
\,
XgY+gfXgfgY,X - g Y + g f X - g f g Y,

etc., and the infinite series is interpreted as the countable union of this increasing sequence. Note that we have to be careful about the order of the appearance of ++ and -, but alternatively, letting \oplus be the addition in the Boolean ring PXP X (symmetric difference), we could write also the series as XgYgfXX \oplus g Y \oplus g f X \oplus \ldots in the ring, where we do not need to be fussy about order.

Most of this is a routine calculation, which for the most part boils down to the following observation:

Lemma

If B,DB, D are elements of PYP Y, with DBD \leq B, then g(BD)= gB gD\exists_g(B - D) = \exists_g B - \exists_g D. (With a similar statement for f\exists_f.)

The proof is left to the reader, but in brief, the injectivity of gg implies that g\exists_g preserves binary intersections and relative complements.

From here, if we write

¬ g¬ f(B)=Xg(YfB)=XgY+gfB=XgYgfB,\neg \exists_g \neg \exists_f(B) = X - g(Y - f B) = X - g Y + g f B = X \oplus g Y \oplus g f B,

then it is easily verified by induction that, referring to equation (1),

A n=(¬ g¬ f) n0= j=0 n(gf) jX j=0 ng(fg) jY.A_n = (\neg \exists_g \neg \exists_f)^n 0 = \bigoplus_{j=0}^n (g f)^j X \oplus \bigoplus_{j=0}^n g(f g)^j Y.

Thus, according to equation (2), the minimal fixed point AA is the union of the A nA_n which is how we are interpreting the series (3).

Now we set up a comparison with one of the standard proofs involving a back-and-forth argument?, say the one given in Wikipedia that is attributed to Julius König. The minimal fixed point is a union of finitary approximations

A 1=XgY,A_1 = X - g Y,
\,
A 2=XgY+gfXgfgY,A_2 = X - g Y + g f X - g f g Y,

etc. Elements in A 1A_1 are those which have no inverse images under gg. Elements in A 2A_2 are elements in XX to which (gf) 1(g f)^{-1} can be applied at most once before we hit an element of XX with no inverse image under g 1g^{-1}. Elements in the union A 1A 2A_1 \cup A_2 \cup \ldots are those which survive at most finitely many applications of (gf) 1(g f)^{-1} before hitting an element of XX with no inverse image under gg. In the terminology of the Wikipedia article, such elements xx in XX are called “XX-stoppers”, and these are exactly the elements for which h(x)h(x) (where recall hh is the bijection under construction) is defined to be f(x)f(x) in the Wikipedia article. For elements xx not in this fixed point AA (the non-XX-stoppers), our proof of CSB (via a minimal fixed point) defined h(x)h(x) to be g 1(x)g^{-1}(x), the same prescription that is used in the Wikipedia article.

Other prescriptions are possible. For example, one could dually construct a maximal fixed point of the operator ¬ g¬ f:PXPX\neg \exists_g \neg \exists_f: P X \to P X, using Lemma 2 to note that f\exists_f and the right adjoint g=¬ g¬\forall_g = \neg \exists_g \neg preserve limits of inverse ω\omega-chains, so that the maximal fixed point or terminal algebra of the endofunctor ¬ g¬ f\neg \exists_g \neg \exists_f could be constructed as an intersection n1(¬ g¬ f) n(1)\bigcap_{n \geq 1} (\neg \exists_g \neg \exists_f)^n(1). This could also be written as a series

XgY+gfXgfgY+gfgfXX - g Y + g f X - g f g Y + g f g f X - \ldots

except this time the series is interpreted as an intersection of a decreasing sequence whose partial sums have an odd number of terms:

1=X,1 = X,
\,
¬ g¬ f(1)=XgY+gfX,\neg \exists_g \neg \exists_f(1) = X - g Y + g f X,
\,
(¬ g¬ f) 2(1)=XgY+gfXgfgY+gfgfX,(\neg \exists_g \neg \exists_f)^2(1) = X - g Y + g f X - g f g Y + g f g f X,

etc. The difference between this and the minimal fixed point is the set n1(gf) n(X)\bigcap_{n \geq 1} (g f)^n(X), consisting of elements xx of XX that belong to a doubly infinite sequence or a cyclic sequence (in the terminology of the Wikipedia article). As remarked in that article, for such xx we have an option to define h(x)h(x) as f(x)f(x) or g 1(x)g^{-1}(x); in the present article we defined h(x)=f(x)h(x) = f(x) for all xx belonging to whichever fixed point AA is used, which includes points in doubly infinite sequences or cyclic sequences if AA is the maximal fixed point. In that case the remaining xx (belonging to the complement of the maximal fixed point) are mapped to g 1(x)g^{-1}(x).

Failure in toposes

Example

Counterexample 4 below shows that the CSB theorem fails in Brouwer's intuitionistic mathematics even for SetSet (since every function between the sets [0,1][0, 1] and \mathbb{R} must be continuous by Brouwer's continuity principle!). See also the discussion in Mac Lane-Moerdijk, VI.9, on toposes that realize Brouwer’s theorem.

Example

As mentioned above, the Cantor-Schroeder-Bernstein theorem fails in the arrow category Set Set^\to, whose objects are functions X 0X 1X_0 \to X_1 between sets and whose morphisms are commutative squares. For example, let XX be the object f:f: \mathbb{N} \to \mathbb{N} that takes nn \in \mathbb{N} to int(n/2)\mathrm{int}(n/2), where int(x)\mathrm{int}(x) is the greatest integer less than or equal to xx; let YY be the object g:g: \mathbb{N} \to \mathbb{N} that takes nn to Int((n+1)/2)\mathrm{Int}((n+1)/2), where Int(x)\mathrm{Int}(x) is the least integer greater than or equal to xx. Pretty clearly XX and YY are non-isomorphic, because g 1(0)g^{-1}(0) has cardinality 11 whereas all fibers of ff have cardinality 22. But, just by drawing pictures of these objects, it is easy to construct monomophisms i:XYi: X \to Y and j:YXj: Y \to X (e.g., define i 0(n)=n+1i_0(n) = n+1 and i 1(n)=n+1i_1(n) = n+1 for all nn, and define j 0(n)=n+1j_0(n) = n+1 for n>0n \gt 0, j 0(0)=0j_0(0) = 0, and j 1(n)=nj_1(n) = n for all nn).

Nor can one have internal existence of an isomorphism between XX and YY in this last example, since internal existence implies external existence as soon as the terminal object is (externally) projective.

In other categories

The CSB property holds in some other categories of interest (but arguably fails in many more). Some examples follow:

Example

The CSB property holds in the category of vector spaces and in the category of algebraically closed fields. See also this MO post by John Goodrick, where model-theoretic criteria come into play, sometimes under strengthenings of the notion of monomorphism (e.g., elementary embedding, split monomorphism) Some slides by Goodrick here go into more detail, giving connections between CSB and stable theories in the sense of Shelah.

Counterexample

On the other hand, the CSB property obviously fails in Top, since we have embeddings (0,1)[0,1]\mathbb{R} \cong (0,1) \to [0,1] \to \mathbb{R}, yet [0,1][0,1] \ncong \mathbb{R}. It fails in Grp (e.g., the free group on countably many generators embeds in the free group on two generators).

More examples and discussion can be found at this Secret Blogging Seminar post.

In a celebrated work, Timothy Gowers gave a negative solution in the case of Banach spaces.

Name

The CSB theorem was first stated by Georg Cantor, but his proof relied on the well-ordering theorem. The modern (choice-free) theorem was proved (independently) by Felix Bernstein? and Ernst Schröder?. It has been variously named after two or three of these in almost every possible combination, although Cantor (when mentioned at all) seems always to be mentioned first.

Wikipedia reports that Richard Dedekind had an (unpublished) proof in 1887, well before any announced proofs by Cantor, Schroeder, or Bernstein in 1895, 1896, 1897 respectively.

References

Revised on October 13, 2016 07:55:43 by Todd Trimble (67.81.95.215)