Publications of the nLab: HomePage (updated 2014-12-24T16:44:38Z, published 2011-01-02T03:15:12Z, Adeel Khan)

The Publications of the nLab is a web-based journal for peer-reviewed publication of original research and expository writing on topics in mathematics and mathematical physics that are usefully discussed from the point of view of category theory and homotopy theory/higher category theory.


(For Authors, For Referees, For Editors)



Published articles

Goals

The tools of category theory and higher category theory serve to organize other structures. There is a plethora of applications that have proven to be more transparent when employing the nPOV. Higher category theory has helped foster entire new fields of study that would have been difficult to conceive otherwise. The nLab is a place where researchers in this area are making notes about their field and their research, and writing up expositions, theorems, and proofs.

The goal of the Publications of the nLab is to feed such content – as well as other submissions that wouldn’t otherwise find their way into the nLab – through peer review, marking it explicitly as stable and reliable and thus making it citable.

Format

The Publications of the nLab appear in the same wiki-format as the nLab in a way that facilitates thorough cross-hyperlinking. This allows nPublications-articles to conveniently refer their readers to background material on the nLab and allows nLab-articles to point their readers to stable and peer-reviewed results and proofs in the nPublications.

Contributions to the nPublications may vary in scope and form. The traditional article format submitted to printed journals is welcome, but the underlying wiki-technology may eventually find its own most natural form. The fact that background material can easily be linked to on the nLab opens the possibility of publishing single peer-reviewed theorems in the nPublications. At the other end of the spectrum, since there are hardly any size restrictions, whole books can find their place here. What counts is the quality and reliability of the content and its useful interlinking with other content.

Editorial board

At the moment the Publications of the nLab is in the developmental phase and a formal editorial board is yet to be set up. For the time being the nLab steering committee is the closest formal body responsible for the nPublications.

The preliminary list of tentatively confirmed names on an editorial board to be formally set up is

Submission procedure

An author who wishes to submit any material for publication to the Publications of the nLab should go through the following steps:

  1. Create the material to be submitted in separate pages on the nLab in the way any nLab pages are created (see nLab HowTo for details).

  2. Notify the editorial board of the nPublications about which pages in which precise version (as given in the edit-history of the entry) are to be submitted.

    (The precise version datum is mandatory for a submission to peer-review, as nLab entries are subject to potential perpetual edits.)

It is also possible to submit material for review in the form of a PDF document or arXiv entry, with the understanding that upon acceptance, conversion to nLab page(s) is a precondition for publication.

The submitted material will go through a refereeing process as usual in mathematical journals: specialist referees will be chosen by the editors. If the material is accepted, publication of the material proceeds as follows:

  1. The accepted version of the nLab entries is copied over to the write-protected nPublications web. (The nLab version of the submitted material remains in place, but is subject to perpetual further edits, as is all material on the nLab.)

  2. Hyperlinks are added (ideally by nPublications-staff, to the extent that such exists) to the nLab version of the submitted article, pointing to the stable peer-reviewed version published in the nPublications. Conversely, the nPublications-version is equipped with a link back to the freely editable nLab version.

Differences to traditional publication

Publication on a wiki-journal such as the Publications of the nLab is meant to retain the purpose and advantages of traditional journal publishing, namely

  • peer-review

  • and official author recognition

but add to it additional value that serves the scientific purpose.

Transparent refereeing

Traditional refereeing often degenerates to a formal procedure that serves the academic machinery more than the scientific purpose. On the nPublications we allow – if desired by referees – transparent refereeing that adds genuine scientific value to the process and its outcome. A single submitted article may receive one or more of the following stamps of approval.

Every article published in the Publications of the nLab carries an indication of precisely which kind of refereeing it received. It may say:

  • This entry was refereed by $n$ anonymous referees chosen and assumed to be expert by the editorial board.

  • The following people say that they read the entry and think that it is okay: name1, name2.

  • Contributor name3 started refereeing the article but ended up reworking it and adding to it substantially. The resulting new version can be found at the following link…

That link may point to an unrefereed nLab article, or again to the nPublications, where it may appear with its own list of people who looked at it.

As mentioned previously, due to their presentation in a wiki format, papers in the Publications of the nLab can make use of hyperlinks in a way which is not possible for papers published in a paper journal. Of course, links to other sections, theorems, definitions, and references can be included, as is possible in PDF files using hypertex. But going beyond this, the material does not even have to be “projected in a totally ordered way onto the page axis” (in the immortal words of Serge Lang). It can be organized in a nonlinear way, enabling readers to choose their own path through it, “zooming in” by clicking on links to read more about those parts that interest them most.

Iterated resubmission

One advantage of a wiki such as the nLab over more static forms of presentation and publication is that it admits and encourages iterative improvement. No text is ever perfect and up-to-date, but on a wiki it can at least approach such a state asymptotically.

In order to have the write-protected Publications of the nLab take part in this perpetual improvement, iterated resubmissions are encouraged:

when the perpetual editing process of the nLab-version of an article published in the nPublications (by the original author or by other contributors) is recognized to have produced a significant improvement of the previous version (be it significant addition of new material or improvement of content or presentation of the original material), the original author or any other contributor is encouraged to resubmit a newer version of the page to the peer-review of the nPublications. It will be fed through the review process as usual, and if review is successful in that the new version is deemed by the referee(s) to be a genuine improvement on the former version, it replaces the former version on the nPublications. (Notice that no material on the wiki is ever deleted; all previous versions of any page are retained in the entry’s edit-history, accessible via a link at the bottom of every page.)

Publication of resubmissions is handled in a way that does not affect the stability of citations. If a resubmission does not affect any previous citations to its content (for instance in the case that just an additional theorem is added) then it replaces the original page. If it does affect previous citations (for instance in the case that assumptions of theorems are being changed, maybe for fine-tuning the presentation) then the resubmission is published in parallel to its earlier version.

Massively collaborative and third-party publishing

For science and for humanity, what counts is not the authorship of and fame and credit for a result. What counts is the result.

By its very nature, content on a wiki such as the nLab is potentially massively multi-authored and massively inter-linked with other pages to an extent that the resulting content may not be attributable to a single or even a handful of authors (even though every single edit is precisely attributable by the automatic edit history, the proverbial whole is typically more than the sum of these pieces). The result of this process is potentially of higher value than what any single contributing author could have achieved, and the nPublications is intended to make use of this.

Therefore the Publications of the nLab allows author-independent submission: if at any time a user of the nLab finds that any given version of any given nLab-page deserves formal peer-review, the user may submit that version to the nPublications as above.

The editorial board will decide if the submitted version indeed justifies feeding it through formal peer-review. It may suggest to the submitting user to wait with this until further improvements have been implemented.

Republication in other Journals

Publication in the Publications of the nLab does not preclude publication elsewhere. Initially, at least, we expect that articles in the Publications of the nLab will also be published elsewhere (either before or after their appearance in the nPublications), and we rely on the authors to count them as only “one publication” on their CV with both places of publication listed. However, it will not be possible, in any case, to withdraw a paper once it has been published in the nPublications. The advantages to the author of also publishing in the Publications of the nLab include transparent refereeing, community input and potential branching, wide exposure, and availability of hypertext. Eventually, we hope that the Publications of the nLab will become respected enough that articles published there will be included in the main mathematics publications databases without the need for republication elsewhere.

FiorenzaMartinengo2012 (updated 2012-09-17T12:24:48Z, published 2012-08-23T19:08:25Z, Urs Schreiber)


Domenico Fiorenza, Elena Martinengo, A short note on ∞-groupoids and the period map for projective manifolds, Publications of the nLab vol. 2 no. 1 (2012) arXiv:0911.3845

A short note on ∞-groupoids and the period map for projective manifolds

Domenico Fiorenza and Elena Martinengo

Dipartimento di Matematica - Sapienza, Università di Roma; P.le Aldo Moro 5, I-00185 Roma Italy - fiorenza@mat.uniroma1.it

Institut für Mathematik und Informatik, Freie Universität Berlin, Arnimallee 3, 14195 Berlin, Germany - elenamartinengo@gmail.com

Abstract

We show how several classical results on the infinitesimal behaviour of the period map for smooth projective manifolds can be read in a natural and unified way within the framework of ∞-categories.

Introduction

A common criticism of ∞-categories in algebraic geometry is that they are an extremely technical subject, so abstract as to be useless in everyday mathematics. The aim of this note is to show in a classical example that quite the converse is true: even a naïve intuition of what an ∞-groupoid should be clarifies several aspects of the infinitesimal behaviour of the period map of a projective manifold. In particular, the notion of Cartan homotopy turns out to be completely natural from this perspective, and so classical results such as Griffiths’ expression for the differential of the period map, the Kodaira principle on obstructions to deformations of projective manifolds, the Bogomolov-Tian-Todorov theorem, and the Goldman-Millson quasi-abelianity theorem are easily recovered.

The use of the language of ∞-categories should not be looked at as providing new proofs for these results; namely, up to a change in language, our proofs verbatim reproduce arguments from the recent literature on the subject, particularly from the work of Marco Manetti and collaborators on dglas in deformation theory. Rather, by this change of language we change our point of view on the classical theorems above: in the perspective of ∞-sheaves from Lu09a, all these theorems have a very simple local nature which can be naturally expressed in terms of ∞-groupoids (or, equivalently, of dglas); their classical global counterparts are then obtained by taking derived global sections. It is worth remarking that, if one prefers proofs which do not rely on the abstract machinery of ∞-categories, one can rework the arguments of this note in purely classical terms. Namely, once the abstract ∞-nonsense has suggested the “correct” local dglas, one can globalize them by means of an explicit model for the derived global sections, e.g., via resolutions by fine sheaves as in FM09, or by the Thom-Sullivan-Whitney model as in IM10.

Since most of the statements and constructions we recall in the paper are well known in the $(\infty,1)$-categorical folklore, despite our efforts in giving credit, it is not unlikely we may have misattributed a few of the results; we sincerely apologize for this. We thank the referee for accurate remarks which helped us a lot in improving the present paper, and Ezra Getzler, Donatella Iacono, Marco Manetti, Jonathan Pridham, Carlos Simpson, Jim Stasheff, Bruno Vallette, Gabriele Vezzosi, and the nLab for suggestions and several inspiring conversations on the subject of this paper.

Throughout the paper, $\mathbb{K}$ is a fixed field of characteristic zero, all algebras are defined over $\mathbb{K}$ and local algebras have $\mathbb{K}$ as residue field. In order to keep our account readable, we will gloss over many details, particularly where the use of higher category theory is required.

From dglas to ∞-groupoids and back again

To any nilpotent dgla $\mathfrak{g}$ one naturally associates the simplicial set

$$\operatorname{MC}(\mathfrak{g}\otimes \Omega_{\bullet}),$$

where $\operatorname{MC}$ stands for the Maurer-Cartan functor mapping a dgla to the set of its Maurer-Cartan elements, and $\Omega_{\bullet}$ is the simplicial differential graded commutative associative algebra of polynomial differential forms on algebraic $n$-simplexes, for $n\geq 0$. The importance of this construction, which can be dated back to Sullivan’s Su77, lies in the fact that, as shown by Hinich and Getzler Hi97, Ge09, the simplicial set $\operatorname{MC}(\mathfrak{g}\otimes \Omega_{\bullet})$ is a Kan complex, or, to use a more evocative name, an ∞-groupoid. A convenient way to think of ∞-groupoids is as homotopy types of topological spaces; namely, it is well known (see footnote 1) that any ∞-groupoid can be realized as the ∞-Poincaré groupoid, i.e., as the simplicial set of singular simplices, of a topological space, unique up to weak equivalence. Therefore, the reader who prefers to can substitute homotopy types of topological spaces for equivalence classes of ∞-groupoids. To stress this point of view, we’ll denote the $k$-truncation of an ∞-groupoid $\mathbf{X}$ by the symbol $\pi_{\leq k}\mathbf{X}$. More explicitly, $\pi_{\leq k}\mathbf{X}$ is the $k$-groupoid whose $j$-morphisms are the $j$-morphisms of $\mathbf{X}$ for $j \lt k$, and are homotopy classes of $j$-morphisms of $\mathbf{X}$ for $j=k$. In particular, if $\mathbf{X}$ is the ∞-Poincaré groupoid of a topological space $X$, then $\pi_{\leq 0}\mathbf{X}$ is the set $\pi_{0}(X)$ of path-connected components of $X$, and $\pi_{\leq 1}\mathbf{X}$ is the usual Poincaré groupoid of $X$.

The next step is to consider an $(\infty,1)$-category, i.e., an ∞-category whose hom-spaces are ∞-groupoids. This can be thought of as a formalization of the naïve idea of having objects, morphisms, homotopies between morphisms, homotopies between homotopies, et cetera. In this sense, endowing a category with a model structure should be thought of as a first step towards defining an $(\infty,1)$-category structure on it.
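For orientation, the Maurer-Cartan set used above can be written out explicitly. What follows is the standard definition, recalled as a reminder and not quoted from the text; the convention that the differential $d$ has degree $+1$ is an assumption, chosen to match the later use of $\mathfrak{g}^{1}$.

```latex
% Maurer-Cartan elements of a dgla (g, d, [ , ]) whose differential has degree +1
\[
  \operatorname{MC}(\mathfrak{g})
  = \Bigl\{\, x \in \mathfrak{g}^{1} \;:\; dx + \tfrac{1}{2}[x,x] = 0 \,\Bigr\}.
\]
% The simplicial set in the text has as its n-simplices
\[
  \operatorname{MC}(\mathfrak{g}\otimes\Omega_{\bullet})_{n}
  = \operatorname{MC}(\mathfrak{g}\otimes\Omega_{n}),
  \qquad n \geq 0 .
\]
```

Since $\Omega_{0}=\mathbb{K}$, the vertices of this simplicial set are just the Maurer-Cartan elements of $\mathfrak{g}$ itself; the higher simplices record homotopies between them.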

Turning back to dglas, an easy way to produce nilpotent dglas is the following: pick an arbitrary dgla $\mathfrak{g}$; then, for any differential graded local Artin algebra $A$, take the tensor product $\mathfrak{g}\otimes \mathfrak{m}_{A}$, where $\mathfrak{m}_{A}$ is the maximal ideal of $A$. Since both constructions

$$\begin{aligned} \mathbf{DGLA}\times \mathbf{dgArt}&\to \mathbf{nilpotent\,\, DGLA}\\ (\mathfrak{g},A)&\mapsto \mathfrak{g}\otimes \mathfrak{m}_{A} \end{aligned}$$

and

$$\begin{aligned} \mathbf{nilpotent\,\, DGLA}&\to \mathbf{\infty \text{-Grpd}}\\ \mathfrak{g}&\mapsto \operatorname{MC}(\mathfrak{g}\otimes \Omega_{\bullet}) \end{aligned}$$

are functorial, their composition defines a functor

$$\operatorname{Def}\colon \mathbf{DGLA} \to \mathbf{\text{Formal }\infty \text{-Grpd}},$$

where, by definition, a formal ∞-groupoid is a functor $\mathbf{dgArt}\to \mathbf{\infty \text{-Grpd}}$. Note that $\pi_{\leq 0}(\operatorname{Def}(\mathfrak{g}))$ is the usual set-valued deformation functor associated with $\mathfrak{g}$, i.e., the functor

$$A\mapsto \operatorname{MC}(\mathfrak{g}\otimes \mathfrak{m}_{A})/\text{gauge},$$

where the gauge equivalence of Maurer-Cartan elements is induced by the gauge action

$$e^{\alpha}*x=x+\sum_{n=0}^{\infty}\frac{(\operatorname{ad}_{\alpha})^{n}}{(n+1)!}\bigl([\alpha,x]-d\alpha\bigr)$$

of $\exp(\mathfrak{g}^{0}\otimes \mathfrak{m}_{A})$ on the subset $\operatorname{MC}(\mathfrak{g}\otimes \mathfrak{m}_{A})$ of $\mathfrak{g}^{1}\otimes \mathfrak{m}_{A}$. However, due to the presence of nontrivial irrelevant stabilizers, the groupoid $\pi_{\leq 1}(\operatorname{Def}(\mathfrak{g}))$ is not equivalent to the action groupoid $\operatorname{MC}(\mathfrak{g}\otimes \mathfrak{m}_{A})/\!/\exp(\mathfrak{g}^{0}\otimes \mathfrak{m}_{A})$, unless $\mathfrak{g}$ is concentrated in nonnegative degrees. We will come back to this later. Also note that the zero in $\mathfrak{g}^{1}\otimes \mathfrak{m}_{A}$ gives a natural distinguished element in $\pi_{\leq 0}(\operatorname{Def}(\mathfrak{g}))$: the isomorphism class of the trivial deformation. Since this marking is natural, we will use the same symbol $\pi_{\leq 0}(\operatorname{Def}(\mathfrak{g}))$ to denote both the set $\pi_{\leq 0}(\operatorname{Def}(\mathfrak{g}))$ and the pointed set $\pi_{\leq 0}(\operatorname{Def}(\mathfrak{g});0)$.

It is important to remark that the functors of the form $\operatorname{Def}(\mathfrak{g})$ are very special ones among all formal ∞-groupoids. To begin with, $\operatorname{Def}(\mathfrak{g})(\mathbb{K})=\{0\}$ and so, in particular, $\operatorname{Def}(\mathfrak{g})(\mathbb{K})$ is a homotopically trivial ∞-groupoid. Another characterizing property of the functors of the form $\operatorname{Def}(\mathfrak{g})$ among formal ∞-groupoids is that, under suitable assumptions, they commute with homotopy pullbacks; see Pr10, Lu11 for a precise statement. In other words, if we call “formal moduli problems” those formal ∞-groupoids which satisfy the two conditions we have just observed for $\operatorname{Def}(\mathfrak{g})$, what we are saying is that $\operatorname{Def}$ is actually a functor

$$\operatorname{Def}\colon \mathbf{DGLA} \to \mathbf{\text{Formal moduli problems}}.$$

And a very good reason for working with ∞-groupoid-valued deformation functors rather than with their apparently handier set-valued or groupoid-valued versions is the following remarkable result, which allows one to move homotopy constructions back and forth between dglas and formal moduli problems.

Theorem

(Pridham-Lurie) The functor $\operatorname{Def}\colon \mathbf{DGLA} \to \mathbf{\text{Formal moduli problems}}$ is an equivalence of $(\infty,1)$-categories.

Here the $(\infty,1)$-category structures involved are the most natural ones, and they are both induced by standard model category structures. Namely, on the category of dglas one takes surjective morphisms as fibrations and quasi-isomorphisms as weak equivalences, just as in the case of differential complexes, whereas the model category structure on the right-hand side is induced by the standard model category structure on Kan complexes as a subcategory of simplicial sets. A proof of the above equivalence can be found in Pr10, Lu11.

We will often identify a dgla $\mathfrak{g}$ with the functor $\mathbf{dgArt}\to \mathbf{nilpotent\,\, DGLA}$ it defines by the rule $A\mapsto \mathfrak{g}\otimes \mathfrak{m}_{A}$. With this in mind, we will occasionally apply constructions that generally only make sense for nilpotent dglas (such as $\exp$) to arbitrary dglas. What we mean in these cases is that the construction is applied not to a single dgla, but to the functor from $\mathbf{dgArt}$ to nilpotent dglas it defines. The same kind of consideration applies to our somewhat colloquial use of the expression “∞-groupoid” in the following sections; namely, by that we will occasionally mean “formal ∞-groupoid”, or even “formal stack in ∞-groupoids”. The precise meaning to be given to “∞-groupoid” will always be clear from the context.

Tangent spaces and obstructions

If $\mathbf{X}$ is a formal moduli problem, then the simplicial set $\mathbf{X}(\mathbb{K}[\epsilon]/(\epsilon^{2}))$ has a natural structure of simplicial vector space, and so, via the Dold-Kan correspondence, it is equivalent to the datum of a chain complex: the tangent complex $T\mathbf{X}$ of $\mathbf{X}$. Passing from $\mathbf{X}$ to the associated classical moduli problem $\pi_{\leq 0}\mathbf{X}$, the only datum we read off from the tangent complex is its homotopy class, i.e., since we are working over a field, its cohomology. In particular, we have a natural isomorphism

$$T\pi_{\leq 0}\operatorname{Def}\xrightarrow{\sim} H^{1}$$

of functors $\mathbf{DGLA}\to \mathbf{\text{Vector spaces}}$ between the tangent space to the classical moduli problem associated to a dgla and the first cohomology group of the dgla seen as a cochain complex. Let us rephrase this in a more explicit form. As we noticed in the previous section, $\pi_{\leq 0}\operatorname{Def}(\mathfrak{g})$ is the functor of Artin rings $A\mapsto \operatorname{MC}(\mathfrak{g}\otimes \mathfrak{m}_{A})/\text{gauge}$, hence

$$T\pi_{\leq 0}\operatorname{Def}(\mathfrak{g})=\operatorname{MC}\left(\mathfrak{g}\otimes (\epsilon)/(\epsilon^{2})\right)/\text{gauge}\cong H^{1}(\mathfrak{g}).$$

This isomorphism is natural. Namely, given a morphism $\varphi\colon \mathfrak{g}\to \mathfrak{h}$ of dglas, let us write $\Phi$ for the induced morphism of classical moduli problems,

$$\Phi=\pi_{\leq 0}\operatorname{Def}(\varphi)\colon \pi_{\leq 0}\operatorname{Def}(\mathfrak{g})\to \pi_{\leq 0}\operatorname{Def}(\mathfrak{h}).$$

Then the differential of $\Phi$,

$$d\Phi\colon T\pi_{\leq 0}\operatorname{Def}(\mathfrak{g})\to T\pi_{\leq 0}\operatorname{Def}(\mathfrak{h})$$

is naturally identified with

$$H^{1}(\varphi)\colon H^{1}(\mathfrak{g})\to H^{1}(\mathfrak{h}).$$
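The identification of the tangent space with $H^{1}(\mathfrak{g})$ can be checked by hand. The sketch below assumes the standard Maurer-Cartan equation $dx+\tfrac{1}{2}[x,x]=0$ (which the text does not spell out) and uses the gauge action formula from the previous section; elements of $\mathfrak{g}\otimes(\epsilon)/(\epsilon^{2})$ are written $\epsilon x$.

```latex
% A = K[eps]/(eps^2): every element of g^1 \otimes m_A is eps*x with x in g^1.
\begin{align*}
  d(\epsilon x) + \tfrac{1}{2}[\epsilon x,\epsilon x]
    &= \epsilon\, dx + \tfrac{1}{2}\epsilon^{2}[x,x]
     = \epsilon\, dx
    && (\epsilon^{2}=0),\\
  e^{\epsilon\alpha} * (\epsilon x)
    &= \epsilon x + \bigl([\epsilon\alpha,\epsilon x] - d(\epsilon\alpha)\bigr)
     = \epsilon\,(x - d\alpha)
    && (\alpha\in\mathfrak{g}^{0}).
\end{align*}
% In the gauge formula only the n = 0 summand survives: each further
% application of ad_{eps*alpha} multiplies by another factor of eps.
% Hence MC = Z^1(g), gauge orbits are the cosets of B^1(g),
% and the quotient is Z^1(g)/B^1(g) = H^1(g).
```

In words: to first order the Maurer-Cartan equation linearizes to the cocycle condition, and the gauge action linearizes to translation by coboundaries.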

The second cohomology group $H^{2}$ defines a natural obstruction theory for $\pi_{\leq 0}\operatorname{Def}$, i.e., obstructions for the classical moduli problem $\pi_{\leq 0}\operatorname{Def}(\mathfrak{g})$ are naturally identified with elements in $H^{2}(\mathfrak{g})$, see Ma02. Note that this does not mean that each element in $H^{2}(\mathfrak{g})$ represents an obstruction: one can have dglas with nontrivial $H^{2}$ governing unobstructed deformation problems. The naturality of the obstruction theory given by the second cohomology groups means that, if $\varphi\colon \mathfrak{g}\to \mathfrak{h}$ is a morphism of dglas, the induced morphism in cohomology,

$$H^{2}(\varphi)\colon H^{2}(\mathfrak{g})\to H^{2}(\mathfrak{h}),$$

maps obstructions for the classical moduli problem $\pi_{\leq 0}\operatorname{Def}(\mathfrak{g})$ to obstructions for the classical moduli problem $\pi_{\leq 0}\operatorname{Def}(\mathfrak{h})$. In particular, if the moduli problem $\pi_{\leq 0}\operatorname{Def}(\mathfrak{h})$ is unobstructed (e.g., if the functor $\pi_{\leq 0}\operatorname{Def}(\mathfrak{h})$ is smooth), then

$$\mathrm{Obstructions}\left(\pi_{\leq 0}\operatorname{Def}(\mathfrak{g})\right)\subseteq \ker\left(H^{2}(\varphi)\colon H^{2}(\mathfrak{g})\to H^{2}(\mathfrak{h})\right).$$
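To see where $H^{2}$ enters, here is the usual lifting argument in outline. It is a standard sketch and is not taken from the text: the small extension $0\to I\to B\to A\to 0$ of ordinary local Artin algebras with $\mathfrak{m}_{B}I=0$, the lift $\tilde{x}$ and the element $h$ are notation introduced only for this illustration, and the Maurer-Cartan equation $dx+\tfrac{1}{2}[x,x]=0$ is assumed. See Ma02 for the precise statement.

```latex
% Start with x in MC(g \otimes m_A) and choose any lift \tilde{x} in g^1 \otimes m_B.
\begin{align*}
  h &:= d\tilde{x} + \tfrac{1}{2}[\tilde{x},\tilde{x}]
      \;\in\; \mathfrak{g}^{2}\otimes I
      && \text{(its image in $\mathfrak{g}^{2}\otimes\mathfrak{m}_{A}$ vanishes)},\\
  dh &= [d\tilde{x},\tilde{x}]
      = [h,\tilde{x}] - \tfrac{1}{2}\bigl[[\tilde{x},\tilde{x}],\tilde{x}\bigr]
      = 0
      && \text{($\mathfrak{m}_{B}I=0$ and the Jacobi identity)}.
\end{align*}
% Changing the lift by y in g^1 \otimes I changes h by a coboundary:
\[
  \tilde{x}\mapsto \tilde{x}+y
  \quad\Longrightarrow\quad
  h\mapsto h+dy .
\]
% So the class [h] in H^2(g) \otimes I does not depend on the lift,
% and x lifts to MC(g \otimes m_B) exactly when [h] = 0.
```

Because $\varphi$ commutes with differentials and brackets, it sends the element $h$ built for $\mathfrak{g}$ to the one built for $\mathfrak{h}$, which is the naturality used in the inclusion above.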

Homotopy vs. gauge equivalent morphisms of dglas (with a detour into $L_\infty$-morphisms)

Let $\mathfrak{g}$ and $\mathfrak{h}$ be two dglas. The hom-space $\operatorname{Hom}_{\infty}(\mathfrak{g},\mathfrak{h})$ of morphisms between $\mathfrak{g}$ and $\mathfrak{h}$ in the $(\infty,1)$-category of dglas is conveniently modelled as the simplicial set $\operatorname{MC}(\underline{\operatorname{Hom}}(\mathfrak{g},\mathfrak{h})\otimes \Omega_{\bullet})$, where $\underline{\operatorname{Hom}}(\mathfrak{g},\mathfrak{h})$ is the Chevalley-Eilenberg-type dgla associated with the pair $(\mathfrak{g},\mathfrak{h})$. It is given as the total dgla of the bigraded dgla

$$\underline{\operatorname{Hom}}^{p,q}(\mathfrak{g},\mathfrak{h})=\operatorname{Hom}_{\mathbb{Z}\text{-Vect}}(\wedge^{q}\mathfrak{g},\mathfrak{h}[p])=\operatorname{Hom}^{p}(\wedge^{q}\mathfrak{g},\mathfrak{h}),$$

endowed with the Lie bracket

$$[\,,\,]_{\underline{\operatorname{Hom}}}\colon \underline{\operatorname{Hom}}^{p_{1},q_{1}}(\mathfrak{g},\mathfrak{h})\otimes \underline{\operatorname{Hom}}^{p_{2},q_{2}}(\mathfrak{g},\mathfrak{h})\to \underline{\operatorname{Hom}}^{p_{1}+p_{2},q_{1}+q_{2}}(\mathfrak{g},\mathfrak{h})$$

defined by

$$[f,g]_{\underline{\operatorname{Hom}}}(\gamma_{1}\wedge \cdots \wedge \gamma_{q_{1}+q_{2}})=\sum_{\sigma \in \operatorname{Sh}(q_{1},q_{2})}\pm \bigl[f(\gamma_{\sigma(1)}\wedge \cdots \wedge \gamma_{\sigma(q_{1})}),\, g(\gamma_{\sigma(q_{1}+1)}\wedge \cdots \wedge \gamma_{\sigma(q_{1}+q_{2})})\bigr]_{\mathfrak{h}},$$

with $\sigma$ ranging over the set of $(q_{1},q_{2})$-unshuffles and $\pm$ standing for the Koszul sign, and with the differentials

$$d_{1,0}\colon \underline{\operatorname{Hom}}^{p,q}(\mathfrak{g},\mathfrak{h})\to \underline{\operatorname{Hom}}^{p+1,q}(\mathfrak{g},\mathfrak{h})$$

and

$$d_{0,1}\colon \underline{\operatorname{Hom}}^{p,q}(\mathfrak{g},\mathfrak{h})\to \underline{\operatorname{Hom}}^{p,q+1}(\mathfrak{g},\mathfrak{h})$$

given by

$$(d_{1,0}f)(\gamma_{1}\wedge \cdots \wedge \gamma_{q})=d_{\mathfrak{h}}\bigl(f(\gamma_{1}\wedge \cdots \wedge \gamma_{q})\bigr)+\sum_{i}\pm f(\gamma_{1}\wedge \cdots \wedge d_{\mathfrak{g}}\gamma_{i}\wedge \cdots \wedge \gamma_{q})$$

and

$$(d_{0,1}f)(\gamma_{1}\wedge \cdots \wedge \gamma_{q+1})=\sum_{i\lt j}\pm f\bigl([\gamma_{i},\gamma_{j}]_{\mathfrak{g}}\wedge \gamma_{1}\wedge \cdots \wedge \widehat{\gamma_{i}}\wedge \cdots \wedge \widehat{\gamma_{j}}\wedge \cdots \wedge \gamma_{q+1}\bigr).$$

An explicit determination of the signs in the above formulas can be found, e.g., in LM95, Sc04. These operations are best seen pictorially:

(Figure: diagram of the bracket $[f,g]_{\underline{\operatorname{Hom}}}$.)

  1. At least in higher categories folklore.

2012 (updated 2012-09-17T12:24:15Z, published 2012-08-23T19:01:38Z, Urs Schreiber)

2011, 2012


Publications of the nLab, Volume 2 (2012)

Contents

For Authors (updated 2012-09-10T15:03:30Z, published 2012-08-28T18:35:07Z, Urs Schreiber)

(For Authors, For Referees, For Editors)


This page contains submission instructions and other information for authors who wish to submit, or have already submitted, work to the nPublications.


Submission procedure

An author who wishes to submit any material for publication to the Publications of the nLab should go through the following steps:

  1. Create the material to be submitted in separate pages on the nLab in the way any nLab pages are created (see nLab HowTo for details).

  2. Notify the editorial board of the nPublications about which pages in which precise version (as given in the edit-history of the entry) are to be submitted.

    (The precise version datum is mandatory for a submission to peer-review, as nLab entries are subject to potential perpetual edits.)

It is also possible to submit material for review in the form of a PDF document or arXiv entry, with the understanding that upon acceptance, conversion to nLab page(s) is a precondition for publication.

The submitted material will go through a refereeing process as usual in mathematical journals: specialist referees will be chosen by the editors. If the material is accepted, publication of the material proceeds as follows:

  1. The accepted version of the nLab entries is copied over to the write-protected nPublications web. (The nLab version of the submitted material remains in place, but is subject to perpetual further edits, as is all material on the nLab.)

  2. Hyperlinks are added (ideally by nPublications-staff, to the extent that such exists) to the nLab version of the submitted article, pointing to the stable peer-reviewed version published in the nPublications. Conversely, the nPublications-version is equipped with a link back to the freely editable nLab version.

Formatting for LaTeX submissions

If your submission is in the form of a LaTeX file instead of an nLab page, then, once accepted, this file will be run through a script (kindly provided by Andrew Stacey) which tries to construct an appropriate wiki-page from the source.

Of course this script is not a perfect converter, not yet at least. The output will typically need some further editing by humans (and in the present absence of any actual staff at the nJournal, this means in fact that the author himself or herself will mostly have to look after this, with nLab regulars usually offering help).

To smooth this process, it helps to follow these hints:

  • do not use the TeX commands

    \bf \it \rm 

    etc. The script cannot or does not want to handle these. Instead use

    \textbf \textit \textrm

    in text mode and

    \mathbf \mathit \mathrm

    in math mode.

  • Beware that the script cannot translate xypic-diagrams. For ways of producing diagrams in the wiki software see at nLab:HowTo – How to draw diagrams and pictures.
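As an illustration of the first hint, a source fragment written with the preferred commands might look as follows. This is only a sketch: the sample sentence and the choice of packages are made up for the example, and nothing here is claimed about what else the conversion script accepts.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

% Text mode: \textbf, \textit, \textrm
% (instead of the old switches {\bf ...}, {\it ...}, {\rm ...})
\textbf{Theorem.} \textit{Let $\mathfrak{g}$ be a dgla.}

% Math mode: \mathbf, \mathit, \mathrm
$\mathbf{X}(\mathbb{K}[\epsilon]/(\epsilon^{2}))$, \quad
$\mathrm{Obstructions}$, \quad
$\mathit{gauge}$

\end{document}
```

The argument-taking forms also scope the font change to their braces, which is presumably easier for an automatic converter to translate than a switch that stays in force until the enclosing group ends.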

(…)

Adding hyperlinks to the published article

A central point of nJournal publications is that all the technical keywords of the text are to be hyperlinked to the corresponding entries on the nLab.

There is a script, kindly provided by Andrew Stacey, that goes through the page, looks for words that match nLab page titles and turns them into hyperlinks.

It’s a Perl script, so it should run on any modern system (though if you have an old version of Perl it might not; my system uses 5.12.3).

You can download it: script and almost current list of nlab page names (includes redirects).

Call as:

wikilinks.pl -i <input file> -o <output file> [-n <page name file>] [-s <start line>]

The <page name file> defaults to nlab_pagenames in the current directory. Make sure that <output file> does not yet exist as it will be overwritten.

For each word the script finds, you have a choice: (a)ccept, (s)kip, (i)gnore name, (r)edo line, a(l)ways accept, or (q)uit. Hopefully these are all self-explanatory. It will always try to find the longest match, but matches have to be exact, so a space in a different place will throw it out. There are some obvious things you should ignore: and, the, and similar. There are also some less obvious ones: section, for example (you’ll see why on the first match). If you quit early, it will have saved everything up to that line, so you can resume later at that line by passing a line number to the script to start from. (But it will overwrite the output file, so you need to save the new bit into a new file and then concatenate them afterwards. Obviously, there are improvements that could be made to this script.)

FiorenzaMartinengo2012 - refereeing 2012-09-02T01:44:07Z 2012-08-31T01:00:51Z tag:ncatlab.org,2012-08-31:Publications,FiorenzaMartinengo2012+-+refereeing Urs Schreiber

This page documents the refereeing process that the publication FiorenzaMartinengo2012 went through.


Original article submission, July 30th, 2011

(submission letter, FiorenzaMartinengo2012Submission.tex, FiorenzaMartinengo2012Submission.pdf)


Steering committee selects and contacts anonymous expert referee, August 5, 2011


Submission of slightly revised article, December 24th, 2011

(re-submission letter, FiorenzaMartinengo2012SubmissionB.tex, FiorenzaMartinengo2012SubmissionB.pdf)


Referee is notified of revision, December 28th, 2011

(post)


May 2nd, 2012

Referee notifies steering committee that report will be further delayed due to circumstances beyond the referee’s control.


First referee report received June 28, 2012

Domenico Fiorenza, Elena Martinengo, A short note on infinity-groupoids and the period map for projective manifolds

In this note the authors explain how the modern homotopy-theoretic formulation of deformation theory sheds light on some classical results in the subject, particularly the period mapping relating deformations of complex structure on a projective manifold to variations of the Hodge structure on its cohomology.

The role of differential graded Lie algebras (dgla’s) in deformation theory was first emphasized by Nijenhuis and Richardson in the 1960’s, and was further elaborated upon by Deligne, Drinfeld, Kontsevich and others. Classically, deformation theory is given by a functor associating to a dgla a formal space. This formal space (more precisely, a functor from the category of Artinian algebras to Sets) describes the formal neighborhood of a point in the moduli space of structures encoded by the dgla as the solutions of the Maurer-Cartan equation. The modern approach extends this formulation in two directions:

(1) the deformation functor of a dgla should take values not in formal spaces but formal stacks (in higher-dimensional groupoids), i.e. functors from Artinian algebras to simplicial sets, rather than just sets. This captures the higher-order symmetry information encoded in the components of the dgla of low (negative) degrees.

(2) (not discussed in the present note) these formal stacks should be derived, i.e. their domain should be extended from Artinian algebras to their homotopical version, chain dg algebras (or, to go beyond characteristic 0, to simplicial or E-infinity algebras). This captures the information about the singularities of the Maurer-Cartan variety encoded in the high (positive) degree components of the dgla, and in particular sheds light on the obstruction theory.

In addition to capturing new information, this extended formulation often provides a conceptual explanation of the classical phenomena, which are recovered as the 0-truncation of the resulting homotopy types.

The authors’ approach is to first construct a certain map of sheaves of dgla’s on a manifold X arising in a simple way from the contraction of differential forms by vector fields (they first explain this in the abstract setting of “Cartan homotopies” of dgla’s). Taking the (extended) deformation functors turns this into a map of infinity-sheaves of formal stacks. Taking derived global sections and 0-truncating produces the classical period map.

Throughout the note the authors keep their presentation informal, steering clear of the technicalities of infinity-categories. Since they are not claiming any new results but merely offering a conceptual explanation of some old ones, this is acceptable. My only complaint is that the presentation sometimes gets a bit too informal, to the point of confusing the reader. I have provided some comments on the text below, which I feel should be addressed before the note is published.

  • the statement of the Theorem on p. 3 is incorrect. First, the category Art should be replaced by dgArt or equivalent, otherwise the degree >1 part of the dgla will stay invisible. Second, the essential image of Def is not all formal stacks but only those that can arise from formal moduli problems, and there are conditions describing those. The precise statement can be found in DAG X (http://www.math.harvard.edu/~lurie/papers/DAG-X.pdf), Thm 0.0.13 (see also Warning 0.0.12). In fact, it seems this result, although generally of fundamental importance, plays no role in this note, as all the deformation problems considered explicitly arise from dgla’s.

  • On p.4 the authors introduce the “internal hom” dgla, Hom(g,h) and claim that MC(Hom(g,h) tensored with differential forms on simplices) is homotopy equivalent to Hom_oo(g,h). But since the authors do not exhibit any other model for Hom_oo besides MC(Hom(g,h)…), the statement seems meaningless. The authors should state explicitly what other model they are comparing MC(Hom..) with.

  • It is not clear what role is played by L-infinity morphisms in the note, in view of the fact that a Cartan homotopy is defined by conditions describing when a generally L-infinity morphism is actually a strict map of dgla’s. In fact, it is also unclear why one needs the “Lie derivative” l to be a strict map rather than L-infinity.

  • the authors occasionally apply constructions that generally only make sense for nilpotent dgla’s (such as exp) to arbitrary dgla’s. The interpretation is probably that it is supposed to be applied not to a single dgla, but to the functor from Art to nilpotent dgla’s defined by this dgla. But it would be less confusing if this were spelled out explicitly.

  • Likewise, the authors often say “infinity-groupoid”, when they really mean a formal stack in infinity-groupoids, e.g. in footnote 4 on p.9.

  • It would help if the authors explained what they mean by “the differential of P” (since P is a map of formal spaces, its differential must be its value on the dual numbers modulo constants).


Reaction of the authors to the first referee report, June 29, 2012

Dear Referee,

Thanks a lot for your extremely careful report, and for the truly insightful remarks and suggestions it contains. We agree with all of them and are now planning to revise the note accordingly.

Below, we sketch the kind of revision we have in mind for each of your comments, so that, should you enjoy having a look at it, you can comment on this before we actually implement it in the new version of the note.

With our best regards,

Domenico and Elena

the statement of the Theorem on p. 3 is incorrect. First, the category Art should be replaced by dgArt or equivalent, otherwise the degree >1 part of the dgla will stay invisible. Second, the essential image of Def is not all formal stacks but only those that can arise from formal moduli problems, and there are conditions describing those. The precise statement can be found in DAG X (http://www.math.harvard.edu/~lurie/papers/DAG-X.pdf), Thm 0.0.13 (see also Warning 0.0.12).

We absolutely agree. Actually, the Artin algebras involved in the note are differential graded ones, but writing “(differential graded) local Artin algebra” instead of “differential graded local Artin algebra” on page 2 was definitely not a good idea, while denoting their category as Art instead of dgArt was so bad as to be evil.

Concerning the statement of the Theorem on page 3, again, we absolutely agree. At the time we wrote that in the first arXiv version of the note we were not aware of the rigorous description of formal moduli problems. Then, after Lurie’s ICM talk, we added a pointer to that but did not upgrade the statement from its “so informal to be false” version to the correct one. Luckily, we will be able to do this now.

In fact, it seems this result, although generally of fundamental importance, plays no role in this note, as all the deformation problems considered explicitly arise from dgla’s.

Right. We are accordingly planning to deemphasize the role played by this result in the note. Its original role was to make it not surprising for the reader that the hom-space of (L_oo) morphisms between dglas could be realized as the simplicial set of Maurer-Cartan elements of a suitable dgla. But this is precisely one of those points where our being informal ends up being confusing, so we are now planning to state the equivalence between dglas and formal moduli problems in its correct way, as a result of fundamental importance on its own, but not as something playing an actual specific role in the note.

On p.4 the authors introduce the “internal hom” dgla, Hom(g,h) and claim that MC(Hom(g,h) tensored with differential forms on simplices) is homotopy equivalent to Hom_oo(g,h). But since the authors do not exhibit any other model for Hom_oo besides MC(Hom(g,h)…), the statement seems meaningless. The authors should state explicitly what other model they are comparing MC(Hom..) with.

Again we agree. The correct way to express what we had in mind is to define Hom_oo(g,h) via its simplicial model MC(Hom(g,h)…) with no reference to other models. Namely, we are planning to reduce the whole first part of Section 2 to a single sentence like “The hom-space Hom_oo(g,h) of morphisms between g and h in the (oo,1)-category of dglas is conveniently modelled as the simplicial set MC(Hom(g,h)…), where Hom(g,h) is the Chevalley-Eilenberg-type dgla defined as follows…”, and revise accordingly the rest of the section.

It is not clear what role is played by L-infinity morphisms in the note, in view of the fact that a Cartan homotopy is defined by conditions describing when a generally L-infinity morphism is actually a strict map of dgla’s. In fact, it is also unclear why one needs the “Lie derivative” l to be a strict map rather than L-infinity.

Indeed, having so much space for L_oo morphisms in the note is a reminiscence of a time when we were less used to homotopy invariant constructions and were, on the other hand, used to dealing with explicit L_oo morphisms as tools for explicit computations. In the “transition towards homotopy” period in which the note was written, it was nice for us to see the definition of Cartan homotopy as a condition describing when a certain L_oo morphism was actually strict. But now that we don’t look at strict L_oo morphisms as something cool anymore (and indeed there is nothing intrinsic in them) we are going to deemphasize this. Namely, we are now planning to give the plain definition of Cartan homotopy at the beginning of the section as something motivated by classical Cartan identities and to show directly (or just say, since it is a one line computation) that the Lie derivative associated to a Cartan homotopy is a dgla morphism. And next just to observe the gauge equivalence e^{-i}0=l expressing the fact that l is a dgla morphism homotopy equivalent to zero.

the authors occasionally apply constructions that generally only make sense for nilpotent dgla’s (such as exp,) to arbitrary dgla’s. The interpretation is probably that it is supposed to be applied not to a single dgla, but to the functor from Art to nilpotent dgla’s defined by this dgla. But it would be less confusing if this were spelled out explicitly. -Likewise, the authors often say “infinity-groupoid”, when they really mean a formal stack in infinity-groupoids, e.g. in footnote 4 on p.9.

Again, absolutely right. We are going to spell these out explicitly.

It would help if the authors explained what they mean by “the differential of P” (since P is a map of formal spaces, its differential must be its value on the dual numbers modulo constants).

Here we could add a few lines to Section 1 as follows: after having mentioned formal moduli problems we could recall that the tangent space to a formal moduli problem P is P(k[epsilon]/(epsilon^2)) and that the differential of a morphism \phi: P -> P’ is \phi(k[epsilon]/(epsilon^2)): P(k[epsilon]/(epsilon^2)) -> P’(k[epsilon]/(epsilon^2)). By the way, we are working with unitary local Artin algebras but of such an algebra we actually take only the maximal ideal m_A: should we better work directly with nilpotent Artin algebras (and so define the tangent space to P as P((epsilon)/(epsilon^2)) instead)?


Reaction of the referee, June 30, 2012

There’s not much to say, really, except I’m glad we’re on the same page. I’m sorry for not noticing the parenthesized “(differential graded)” on page 2 and assuming the authors meant discrete Artinian algebras throughout. There’s also a question of what dgArt should really mean: dg algebras whose underlying superalgebra is Artinian, or any infinity-category equivalent to that, e.g. of dg algebras whose homology is Artinian (one could even take those algebras to be quasi-free). Regarding the last question, I guess it’s largely a matter of taste. I’m used to having all my algebras unital. But in this case they are also augmented, so it amounts to the same thing.


Submission of revised version by the authors, August 17, 2012: this is what is now the final version


Final referee reaction, August 23, 2012: The referee had no further comments and accepted the revised version.


For Editors 2012-08-28T18:38:16Z 2012-08-28T18:36:09Z tag:ncatlab.org,2012-08-28:Publications,For+Editors Urs Schreiber



This page contains instructions and other information for editors of the nPublications.

(…)

For Referees 2012-08-28T18:38:07Z 2012-08-28T18:35:54Z tag:ncatlab.org,2012-08-28:Publications,For+Referees Urs Schreiber



This page contains instructions and other information for referees of submissions to the nPublications.

(…)

Leinster2011 2012-08-23T19:12:58Z 2011-11-02T09:09:28Z tag:ncatlab.org,2011-11-02:Publications,Leinster2011 Urs Schreiber


Tom Leinster, An informal introduction to topos theory, Publications of the nLab vol. 1 no. 1 (2011)

An informal introduction to topos theory

Tom Leinster

School of Mathematics and Statistics, University of Glasgow, Glasgow G12 8QW, UK; Tom.Leinster@glasgow.ac.uk. Supported by an EPSRC Advanced Research Fellowship.


Introduction

This short text is for readers who are confident in basic category theory but know little or nothing about toposes. It is based on some impromptu talks given to a small group of category theorists. I am no expert on topos theory. These notes are for people even less expert than me.

In keeping with the spirit of the talks, what follows is light on both detail and references. For the reader wishing for more, almost everything here is presented in respectable form in Mac Lane and Moerdijk’s very pleasant introduction to topos theory (1994). Nothing here is new, not even the expository viewpoint (very loosely inspired by Johnstone (2003)).

As a rough indication of the level of knowledge assumed, I will take it that you are totally comfortable with the Yoneda Lemma and the concept of cartesian closed category, but I will not assume that you know the definition of subobject classifier or of topos.

Section 1 explains the definition of topos. The remaining three sections discuss some of the connections between topos theory and other subjects. There are many more such connections than I will mention; I hope it is abundantly clear that these notes are, by design, a quick sketch of a large subject.

Section 2 is on connections between topos theory and set theory. There are two themes here. One is that, using the language of toposes, we can write down an axiomatization of sets that sticks closely to how sets are actually used in mathematics. This provides an appealing alternative to ZFC. The other, related, theme is that

a topos is a generalized category of sets.

Section 3 is on connections with geometry (in a broad sense); there the thought is that

a topos is a generalized space.

Section 4 is on connections with universal algebra:

a topos is a generalized theory.

What this means is that there is one topos embodying the concept of ‘ring’, another embodying the concept of ‘field’, and so on. This is the story of classifying toposes.

Sections 2–4 can be read in any order, except that ideally §3 (geometry) should come before §4 (universal algebra). You can read §4 without having read §3, but the price to pay is that the notion of ‘geometric morphism’, defined in §3 and used in §4, might seem rather mysterious.

Algebraic geometers beware: the word ‘topos’ is used by mathematicians in two slightly different senses, according to circumstance and culture. There are elementary toposes and Grothendieck toposes. Category theorists tend to use ‘topos’ to mean ‘elementary topos’ by default, although Grothendieck toposes are also important in category theory. But when an algebraic geometer says ‘topos’, they almost certainly mean ‘Grothendieck topos’ (what else?).

Grothendieck toposes are categories of sheaves. Elementary toposes are slightly more general, and the definition is simpler. They are what I will emphasize here. Grothendieck toposes are the subject of Section 3, and appear fleetingly elsewhere; but if you only want to learn about categories of sheaves, this is probably not the text for you.

Acknowledgements

I thank Andrei Akhvlediani, Eugenia Cheng, Richard Garner, Nick Gurski, Ignacio Lopez Franco and Emily Riehl for their participation and encouragement. Aspects of Section 4 draw on a vaguely similar presentation of vaguely similar material by Richard Garner. I thank the organizers of Category Theory 2010 for making the talks possible, even though they did not mean to: Francesca Cagliari, Eugenio Moggi, Marco Grandis, Sandra Mantovani, Pino Rosolini, and Bob Walters. I thank Filip Bár, Jon Phillips, Urs Schreiber, Mike Shulman, Alex Simpson, Danny Stevenson and Todd Wilson for suggestions and corrections. I am especially grateful to Todd Trimble for carefully reading an earlier version and suggesting many improvements.

The definition of topos

The hardest part of the definition of topos is the concept of subobject classifier, so I will begin there. For motivation, I will speak of ‘the category of sets’ (and functions). What exactly this means will be discussed in Section 2, but for now we proceed informally.

In the category of sets, inverse images are a special case of pullbacks. That is, given a map $f\colon X \to Y$ of sets and a subset $B \subseteq Y$, we have a pullback square (the vertical maps are the inclusions)

$$\begin{matrix} f^{-1}B & \longrightarrow & B \\ \downarrow & & \downarrow \\ X & \underset{f}{\longrightarrow} & Y. \end{matrix}$$

In particular, this holds when $B$ is a 1-element subset $\{y\}$ of $Y$:

$$\begin{matrix} f^{-1}\{y\} & \longrightarrow & \{y\} \\ \downarrow & & \downarrow \\ X & \underset{f}{\longrightarrow} & Y. \end{matrix}$$

There is no virtue in distinguishing between one-element sets, so we might as well write $1$ instead of $\{y\}$; then the inclusion $\{y\} \hookrightarrow Y$ becomes the map $1 \to Y$ picking out $y \in Y$, and we have a pullback square

$$\begin{matrix} f^{-1}\{y\} & \overset{!}{\longrightarrow} & 1 \\ \downarrow & & \downarrow{\scriptstyle y} \\ X & \underset{f}{\longrightarrow} & Y. \end{matrix}$$
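For finite sets, the statement that inverse images are pullbacks can be checked directly. The following Python sketch is only an illustration under assumed data: the sets `X` and `B`, the map `f` and the helper names `pullback` and `preimage` are all chosen here and do not come from the text.

```python
def pullback(X, B, f, g):
    """Pullback of f: X -> Y along g: B -> Y, as the set of pairs (x, b) with f(x) == g(b)."""
    return {(x, b) for x in X for b in B if f(x) == g(b)}

def preimage(X, B, f):
    """Inverse image f^{-1}B of a subset B of the codomain."""
    return {x for x in X if f(x) in B}

# Assumed example data: f sends x to x mod 3, so Y = {0, 1, 2}.
X = {0, 1, 2, 3, 4, 5}
B = {0, 1}

def f(x):
    return x % 3

def inclusion(b):
    # The inclusion of the subset B into Y.
    return b

P = pullback(X, B, f, inclusion)
# Projecting the pullback onto X gives a bijection with the inverse image.
projected = {x for (x, _) in P}
print(sorted(projected), sorted(preimage(X, B, f)))
```

Because the map from $B$ is an inclusion, each $x$ occurs in at most one pair of the pullback, so the projection to $X$ is injective and its image is exactly $f^{-1}B$.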

Next consider characteristic functions of subsets. Fix a two-element set $2 = \{\mathsf{t}, \mathsf{f}\}$ (‘true’ and ‘false’). Then for any set $X$, the subsets of $X$ are in bijective correspondence with the functions $X \to 2$. In one direction, given a subset $A \subseteq X$, the corresponding function $\chi_{A}\colon X \to 2$ is defined by

$$\chi_{A}(x) = \begin{cases} \mathsf{t} & \text{if}\; x \in A \\ \mathsf{f} & \text{if}\; x \notin A \end{cases}$$

($x \in X$). In the other, given a function $\chi\colon X \to 2$, the corresponding subset of $X$ is $\chi^{-1}\{\mathsf{t}\}$. To say that this latter process $\chi \mapsto \chi^{-1}\{\mathsf{t}\}$ is a bijection is to say that for all $A \subseteq X$, there is a unique function $\chi\colon X \to 2$ such that $A = \chi^{-1}\{\mathsf{t}\}$. In other words: for all $A \subseteq X$, there is a unique function $\chi\colon X \to 2$ such that

$$\begin{matrix} A & \overset{!}{\longrightarrow} & 1 \\ \downarrow & & \downarrow{\scriptstyle \mathsf{t}} \\ X & \underset{\chi}{\longrightarrow} & 2 \end{matrix}$$

is a pullback square.

This property of sets can now be stated in purely categorical terms. We use $\rightarrowtail$ to indicate a mono (= monomorphism = monic).

Definition

Let $\mathcal{E}$ be a category with finite limits. A subobject classifier in $\mathcal{E}$ is an object $\Omega$ together with a map $\mathsf{t}\colon 1 \to \Omega$ such that for every mono $A \stackrel{m}{\rightarrowtail} X$ in $\mathcal{E}$, there exists a unique map $\chi \colon X \to \Omega$ such that

$$\begin{matrix} A & \overset{!}{\longrightarrow} & 1 \\ \mathllap{m} \downarrow & & \downarrow \mathrlap{\mathsf{t}} \\ X & \underset{\chi}{\longrightarrow} & \Omega \end{matrix}$$

is a pullback square.

So, we have just observed that $\mathbf{Set}$ has a subobject classifier, namely, the two-element set. In the general setting, we may write $\chi$ as $\chi_{A}$ (or properly, $\chi_{m}$) and call it the characteristic function of $A$ (or $m$).
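
The observation about $\mathbf{Set}$ can be checked by brute force on a small example. The following is a minimal sketch, assuming finite sets encoded as Python sets, functions as dicts, and the two truth values $\mathsf{t}$, $\mathsf{f}$ encoded as `True`, `False`; all function names are illustrative.

```python
from itertools import product

def characteristic(A, X):
    """chi_A: X -> 2, encoded as a dict with values True (t) and False (f)."""
    return {x: (x in A) for x in X}

def preimage_of_true(chi):
    """The subset chi^{-1}{t} of X classified by chi."""
    return frozenset(x for x, value in chi.items() if value)

def functions_to_two(X):
    """All functions X -> 2, each encoded as a dict."""
    xs = sorted(X)
    return [dict(zip(xs, values))
            for values in product([True, False], repeat=len(xs))]

X = {"a", "b", "c"}
all_chi = functions_to_two(X)
subsets = {preimage_of_true(chi) for chi in all_chi}

# chi |-> chi^{-1}{t} is injective: 8 functions give 8 distinct subsets,
# and X has exactly 2^3 = 8 subsets, so the assignment is a bijection.
print(len(all_chi), len(subsets))

# The inverse sends a subset A to its characteristic function.
print(all(characteristic(preimage_of_true(chi), X) == chi for chi in all_chi))
```

The round trip in the last line is the uniqueness clause of the definition in this special case: a function into $2$ is determined by the subset it classifies.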

To understand this further, we need two lemmas.

Lemma

In any category, the pullback of a mono is a mono. That is, if

$$\begin{matrix} \cdot & \longrightarrow & \cdot \\ \mathllap{m'} \downarrow & & \downarrow \mathrlap{m} \\ \cdot & \longrightarrow & \cdot \end{matrix}$$

is a pullback square and $m$ is a mono, then so is $m'$.

Lemma

In any category with a terminal object $1$, every map out of $1$ is a mono.

So, pulling $\mathsf{t}\colon 1 \to \Omega$ back along any map $X \to \Omega$ gives a mono into $X$.
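
The first lemma can be seen at work in finite sets, where monos are injections and a pullback can be computed as a set of matching pairs. This is a minimal sketch under those assumptions (maps as dicts, illustrative names), not a proof.

```python
def pullback(f, m):
    """Pullback of m: A -> Y along f: X -> Y in finite sets (maps are dicts).

    Returns the set P = {(x, a) : f(x) = m(a)} and the projection P -> X.
    """
    P = [(x, a) for x in f for a in m if f[x] == m[a]]
    projection = {p: p[0] for p in P}
    return P, projection

def is_injective(g):
    """A map of finite sets is a mono exactly when it is injective."""
    return len(set(g.values())) == len(g)

f = {1: "u", 2: "v", 3: "u", 4: "w"}      # f: X -> Y
m = {"a": "u", "b": "w"}                  # m: A -> Y, a mono
n = {"a": "u", "b": "u"}                  # n: A -> Y, not a mono

P, m_prime = pullback(f, m)
Q, n_prime = pullback(f, n)

# The pullback of the mono m is again a mono.
print(is_injective(m), is_injective(m_prime))
# In this particular example the pullback of the non-mono n fails to be a mono.
print(is_injective(n), is_injective(n_prime))
```

Here `m_prime` identifies its domain with the subset $\{1, 3, 4\}$ of $X$, which is the inverse image under $f$ of the image of $m$.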

It will also help to know the result of the following little exercise (Johnstone (2003), A1.6.1). It says, roughly, that in the definition of subobject classifier, the fact that $1$ is terminal comes for free.

Fact

Let $\mathcal{E}$ be a category and let $T \stackrel{\mathsf{t}}{\rightarrowtail} \Omega$ be a mono in $\mathcal{E}$. Suppose that for every mono $A \stackrel{m}{\rightarrowtail} X$ in $\mathcal{E}$, there is a unique map $\chi \colon X \to \Omega$ such that there is a pullback square

$$\begin{matrix} A & \longrightarrow & T \\ \mathllap{m} \downarrow & & \downarrow \mathrlap{\mathsf{t}} \\ X & \underset{\chi}{\longrightarrow} & \Omega. \end{matrix}$$

Then $T$ is terminal in $\mathcal{E}$.

This leads to a second description of subobject classifiers. Let $\mathbf{Mono}(\mathcal{E})$ be the category whose objects are monos in $\mathcal{E}$ and whose maps are pullback squares. Then a subobject classifier is exactly a terminal object of $\mathbf{Mono}(\mathcal{E})$.

Here is a third way of looking at subobject classifiers. Given a category $\mathcal{E}$ and an object $X$, a subobject of $X$ is officially an isomorphism class of monos $A \stackrel{m}{\rightarrowtail} X$ (where isomorphism is taken in the slice category $\mathcal{E}/X$). For example, when $\mathcal{E} = \mathbf{Set}$, two monos

$$A \stackrel{m}{\rightarrowtail} X, \quad A' \stackrel{m'}{\rightarrowtail} X$$

are isomorphic if and only if they have the same image; so subobjects of $X$ correspond one-to-one with subsets of $X$. I say ‘officially’ because half the time people use ‘subobject of $X$’ to mean simply ‘mono into $X$’, or slip between the two meanings without warning. It is a harmless abuse of language, which I will adopt.

For $X \in \mathcal{E}$, let $\mathrm{Sub}(X)$ be the class of subobjects (in the official sense) of $X$. Assume that $\mathcal{E}$ is well-powered, that is, each $\mathrm{Sub}(X)$ is a set rather than a proper class. Assume also that $\mathcal{E}$ has pullbacks. By Lemma 2 (the pullback of a mono is a mono), every map $X \stackrel{f}{\to} Y$ in $\mathcal{E}$ induces a map $\mathrm{Sub}(Y) \stackrel{f^{*}}{\to} \mathrm{Sub}(X)$ of sets, by pullback. This defines a functor $\mathrm{Sub}\colon \mathcal{E}^{\mathrm{op}} \to \mathbf{Set}$.

Third description: a subobject classifier is a representation of this functor $\mathrm{Sub}$.

This makes intuitive sense, since for $\mathrm{Sub}$ to be representable means that there is an object $\Omega \in \mathcal{E}$ satisfying

$$\mathrm{Sub}(X) \cong \mathcal{E}(X, \Omega)$$

naturally in $X \in \mathcal{E}$. In the motivating case of the category of sets, this directly captures the thought that subsets of a set $X$ correspond naturally to maps $X \to \{\mathsf{t}, \mathsf{f}\}$.
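
In the category of sets, the word ‘naturally’ has a concrete meaning: pulling a subset back along $f$ corresponds to precomposing its characteristic function with $f$. The following minimal sketch checks this for one map of finite sets, assuming functions are encoded as dicts and truth values as booleans; the names are illustrative.

```python
from itertools import combinations

def characteristic(B, Y):
    """chi_B: Y -> 2, with True for t and False for f."""
    return {y: (y in B) for y in Y}

def compose(g, f):
    """g after f, for maps encoded as dicts."""
    return {x: g[f[x]] for x in f}

def pull_back(f, B):
    """f^*(B): the inverse image of the subset B along f."""
    return frozenset(x for x in f if f[x] in B)

def subsets(Y):
    """All subsets of the finite set Y."""
    ys = sorted(Y)
    return [frozenset(c) for k in range(len(ys) + 1) for c in combinations(ys, k)]

X = {1, 2, 3, 4}
Y = {"u", "v", "w"}
f = {1: "u", 2: "v", 3: "u", 4: "w"}      # f: X -> Y

# Naturality: the characteristic function of f^*(B) is chi_B composed with f.
natural = all(
    characteristic(pull_back(f, B), X) == compose(characteristic(B, Y), f)
    for B in subsets(Y)
)
print(natural)
```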

Now we show that this is equivalent to the original definition. By the Yoneda Lemma, a representation of $\mathrm{Sub}\colon \mathcal{E}^{\mathrm{op}} \to \mathbf{Set}$ amounts to an object $\Omega \in \mathcal{E}$ together with an element $\mathsf{t} \in \mathrm{Sub}(\Omega)$ that is ‘generic’ in the following sense:

for every object $X \in \mathcal{E}$ and element $m \in \mathrm{Sub}(X)$, there is a unique map $\chi \colon X \to \Omega$ such that $\chi^{*}(\mathsf{t}) = m$.

In other words, a representation of $\mathrm{Sub}$ is a mono $T \stackrel{\mathsf{t}}{\rightarrowtail} \Omega$ in $\mathcal{E}$ satisfying the condition in Fact 4. That is, it is a subobject classifier.

Examples

 

  1. The primordial topos is $\mathbf{Set}$. It has special properties not shared by most other toposes. This is the subject of Section 2.

  2. For any set $I$, the category $\mathbf{Set}^{I}$ of $I$-indexed families of sets is a topos. Its subobject classifier is the constant family $(2)_{i \in I}$, where $2$ is a two-element set.

  3. For any group $G$, the category $\mathbf{Set}^{G}$ of left $G$-sets is a topos. Its subobject classifier is the set $2$ with trivial $G$-action.

  4. Encompassing all the previous examples, if $\mathbb{A}$ is any small category then the category $\widehat{\mathbb{A}} = \mathbf{Set}^{\mathbb{A}^{\mathrm{op}}}$ of presheaves on $\mathbb{A}$ is a topos. We can discover what its subobject classifier must be by a thought experiment: if $\Omega$ is a subobject classifier then by the Yoneda Lemma,

    $$\Omega(a) \cong \widehat{\mathbb{A}}(\mathbb{A}(-, a), \Omega) \cong \mathrm{Sub}(\mathbb{A}(-, a))$$

    for all $a \in \mathbb{A}$. So $\Omega(a)$ must be the set of subfunctors of $\mathbb{A}(-, a)$; and one can check that defining $\Omega(a)$ in this way does indeed give a subobject classifier. A subfunctor of $\mathbb{A}(-, a)$ is called a sieve on $a$; it is a collection of maps into $a$ satisfying a certain condition.

  5. For any topological space $S$, the category $\mathbf{Sh}(S)$ of sheaves on $S$ is a topos. This is the subject of Section 3. Modulo a small lie that I will come back to there, the space $S$ can be recovered from the topos $\mathbf{Sh}(S)$. Hence the class of spaces embeds into the class of toposes, and this is why toposes can be viewed as generalized spaces.

    Sheaves will be defined and explained in Section 3. To give a brief sketch: denote by $\mathbf{Open}(S)$ the poset of open subsets of $S$; then a presheaf on the space $S$ is a presheaf on the category $\mathbf{Open}(S)$, and a sheaf on $S$ is a presheaf with a further property. I will consistently use ‘sheaf’ to mean what some would call ‘sheaf of sets’. A sheaf of groups, rings, etc. is the same as an internal group, ring, etc. in $\mathbf{Sh}(S)$.

  6. The category $\mathbf{FinSet}$ of finite sets is a topos. Similarly, $\mathbf{Set}$ can be replaced by $\mathbf{FinSet}$ in all of the previous examples, giving toposes of finite $G$-sets, finite sheaves, etc.
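
To make the sieves of example 4 concrete, here is a minimal sketch for a hypothetical choice of $\mathbb{A}$: the divisors of $6$ ordered by divisibility, viewed as a category with exactly one map $b \to a$ whenever $b$ divides $a$. Under that assumption a map into $a$ is determined by its domain, so a sieve on $a$ is a set of divisors of $a$ closed under passing to further divisors. The names are illustrative.

```python
from itertools import combinations

# Hypothetical example category: divisors of 6, one map b -> a when b divides a.
objects = [1, 2, 3, 6]

def divides(b, a):
    return a % b == 0

def sieves(a):
    """All sieves on a: sets S of objects mapping into a such that
    b in S and c -> b together imply c in S."""
    below = [b for b in objects if divides(b, a)]
    found = []
    for k in range(len(below) + 1):
        for candidate in combinations(below, k):
            S = frozenset(candidate)
            if all(c in S for b in S for c in objects if divides(c, b)):
                found.append(S)
    return found

def restrict(S, b):
    """Omega(a) -> Omega(b) along b -> a: the members of S that map into b."""
    return frozenset(c for c in S if divides(c, b))

omega = {a: sieves(a) for a in objects}
print({a: len(omega[a]) for a in objects})
```

In this example $\Omega(6)$ has six elements, so the subobject classifier of a presheaf topos is typically larger than the two-element set; the maximal sieve on each object plays the role of $\mathsf{t}$.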

You might ask ‘why is the definition of topos what it is? Why that particular collection of axioms? What’s the motivation?’ I will not attempt to answer, except by explaining several ways in which the definition has been found useful. It is also worth noting that the topos axioms have many non-obvious consequences, giving toposes a far richer structure than most categories. For example, every map in a topos factorizes, essentially uniquely, as an epi followed by a mono. More spectacularly, the axioms imply that every topos has finite colimits. This can be proved by the following very elegant strategy, due to Paré (1974). For every topos $\mathcal{E}$, we have the contravariant power set functor $P = \Omega^{(-)}\colon \mathcal{E}^{\mathrm{op}} \to \mathcal{E}$. It can be shown that $P$ is monadic. But monadic functors create limits, and $\mathcal{E}$ has finite limits. Hence $\mathcal{E}^{\mathrm{op}}$ has finite limits; that is, $\mathcal{E}$ has finite colimits.
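
In the topos of sets, the epi-mono factorization is the familiar factorization of a function through its image. A minimal sketch, assuming functions of finite sets encoded as dicts and illustrative names:

```python
def compose(g, f):
    """g after f, for maps encoded as dicts."""
    return {x: g[f[x]] for x in f}

def epi_mono_factorization(f):
    """Factor f: X -> Y as a surjection X -> image(f) followed by
    the inclusion image(f) -> Y."""
    image = set(f.values())
    e = dict(f)                    # X -> image(f), surjective by construction
    m = {y: y for y in image}      # inclusion image(f) -> Y, injective
    return e, m

f = {1: "u", 2: "v", 3: "u", 4: "v"}   # the codomain Y may be larger than the image
e, m = epi_mono_factorization(f)

print(compose(m, e) == f)
```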

Toposes and set theory

Here I will describe what makes ‘the’ category of sets special among all toposes, and explain why I just put ‘the’ in quotation marks. This is the stuff of revolution: it can completely change your view of set theory. It also provides an invaluable insight into topos theory as a whole.

We begin by listing some special properties of the topos $\mathbf{Set}$, using only the most commonplace assumptions about how sets and functions behave.

  1. The terminal object $1$ is a separator (generator). That is, given maps $f, g \colon X \rightrightarrows Y$ in $\mathbf{Set}$, if $f \circ x = g \circ x$ for all $x\colon 1 \to X$ then $f = g$.

    It is worth dwelling on what this says. Maps $1 \to X$ correspond to elements of $X$, and we make no notational distinction between the two. Moreover, given an element $x \in X$ and a map $f\colon X \to Y$, we can compose the maps

    $$1 \stackrel{x}{\to} X \stackrel{f}{\to} Y$$

    to obtain a map $f \circ x\colon 1 \to Y$, and this is the map corresponding to the element $f(x) \in Y$. (We might harmlessly write both $f \circ x$ and $f(x)$ as $f x$.) Thus, elements are a special case of functions, and evaluation is a special case of composition.

    The property above says that if $f(x) = g(x)$ for all $x \in X$ then $f = g$. In other words, a function is determined by its effect on elements.

  2. Write $0$ for the initial object of $\mathbf{Set}$ (the empty set). Then $0 \ncong 1$. Equivalently, $\mathbf{Set}$ is not equivalent to the terminal category $\mathbb{1}$.

    A topos satisfying properties 1 and 2 is called well-pointed.

  3. This property says, informally, that there is a set consisting of the natural numbers.

    What are ‘the natural numbers’, though? One way to get at an answer is to use the principle that sequences can be defined recursively. That is, given a set $X$, an element $x \in X$, and a map $r\colon X \to X$, there is a unique sequence $(x_{n})_{n = 0}^{\infty}$ in $X$ such that

    $$(1) \qquad x_{0} = x, \quad x_{n + 1} = r(x_{n}) \quad (n \in \mathbb{N}).$$

    A sequence $(x_{n})_{n = 0}^{\infty}$ in $X$ is just a map $f\colon \mathbb{N} \to X$, and if we write $s\colon \mathbb{N} \to \mathbb{N}$ for the function $n \mapsto n + 1$ (‘successor’), then (1) says exactly that the diagram

    $$(2) \qquad \begin{matrix} 1 & \overset{0}{\longrightarrow} & \mathbb{N} & \overset{s}{\longrightarrow} & \mathbb{N} \\ & \mathllap{x} \searrow & \downarrow \mathrlap{f} & & \downarrow \mathrlap{f} \\ & & X & \underset{r}{\longrightarrow} & X \end{matrix}$$

    commutes.

    Definition

    Let $\mathcal{E}$ be a category with a terminal object, $1$. A natural numbers object in $\mathcal{E}$ is a triple $(N, 0, s)$, with $N \in \mathcal{E}$, $0\colon 1 \to N$, and $s\colon N \to N$, that is initial as such: for any triple $(X, x, r)$ of the same type, there is a unique map $f\colon N \to X$ such that (2) commutes (with $N$ in place of $\mathbb{N}$).

    Property 3 is, then, that $\mathbf{Set}$ has a natural numbers object.

  4. Epis split. That is, for any epimorphism (surjection) $e\colon X \to Y$ in $\mathbf{Set}$, there exists a map $m\colon Y \to X$ such that $e \circ m = 1_{Y}$. The splitting $m$ chooses for each $y \in Y$ an element of the nonempty set $e^{-1}\{y\}$. The existence of such splittings is precisely the Axiom of Choice. Generally, a category is said to satisfy the Axiom of Choice (or to ‘have Choice’) if epis split.
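
Properties 3 and 4 can both be sketched in code. The first function below builds the map $f$ of diagram (2) from $x$ and $r$; the second builds a splitting of a surjection of finite sets. This is a minimal illustration with hypothetical names, assuming Python integers stand in for $\mathbb{N}$ and dicts for finite functions. For finite sets no axiom is needed to split an epi, so the second function only shows what a splitting is.

```python
def recursive_sequence(x, r):
    """The map f: N -> X with f(0) = x and f(n + 1) = r(f(n))."""
    def f(n):
        value = x
        for _ in range(n):
            value = r(value)
        return value
    return f

def splitting(e):
    """A section m of a surjection e: X -> Y (a dict): e(m(y)) = y for all y.

    For each y we keep the first x encountered with e(x) = y.
    """
    m = {}
    for x, y in e.items():
        m.setdefault(y, x)
    return m

# Property 3 with X the integers, x = 1 and r = doubling, so that f(n) = 2^n.
def double(v):
    return 2 * v

f = recursive_sequence(1, double)
print([f(n) for n in range(6)])

# Property 4 for a surjection from {1, 2, 3, 4} onto {"a", "b"}.
e = {1: "a", 2: "b", 3: "a", 4: "b"}
m = splitting(e)
print(all(e[m[y]] == y for y in {"a", "b"}))
```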

In summary,

sets and functions form a well-pointed topos with natural numbers object and Choice.

The category of sets has many other elementary properties (such as the fact that the subobject classifier has exactly two elements), but they are all consequences of the properties just mentioned.

But what is this thing called ‘the category of sets’? What do we have to assume about sets in order to prove that these properties hold?

Many mathematicians do not like to be bothered with such questions, because they know that the standard answer will be something like ‘sets are anything satisfying the axioms of ZFC’—and they feel that ZFC is irrelevant to what they do, and prefer not to hear about it.

The standard answer is valid, in the sense that for every model of ZFC, there is a resulting category of sets satisfying the properties above. But it may seem irrelevant, because at no point in establishing the properties did it feel necessary to call on an axiom system: all the properties are suggested directly by the naive imagery of a set as a bag of dots.

There is, however, another type of answer—and this was Lawvere’s radical idea. It is this:

we take the properties above as our axioms on sets.

In other words, we do away with ZFC entirely, and ask instead that sets and functions form a well-pointed topos with natural numbers object and Choice. ‘The’ category of sets is any category satisfying these axioms. In fact we should say a category of sets, since there may be many different such categories, as we shall see.

This is Lawvere’s Elementary Theory of the Category of Sets (ETCS), stated in modern language. (See Lawvere (1964), or Lawvere and Rosebrugh (2003) for a good expository account.) It is nearly fifty years old, but still has not gained the currency it deserves, for reasons on which one can speculate.

Digression

You might be thinking that this is circular: that this axiomatization of sets depends on the notion of category, and the notion of category depends on some notion of collection or set. But in fact, ETCS does not depend on the general notion of category. It can be stated without using the word ‘category’ once.

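To see this, we need to back up a bit. The ZFC axiomatization of sets looks, informally, like this:

  • there are some things called ‘sets’

  • there is a binary relation on sets, called ‘membership’ ($\in$)

  • some axioms hold.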

People seeing this (or the formal version) often ask certain questions. What does ‘some things’ mean? Do you mean that there is a set of sets? (No.) What exactly is meant by ‘binary relation’? (It means that for each set $X$ and set $Y$, the statement ‘$X \in Y$’ is deemed to be either true or false.) What do you mean, ‘deemed’? Etc. This is not a logic course, and I will not attempt to answer the questions except to say that there is an assumed common understanding of these terms. To hide behind jargon, ZFC is a first-order theory.

The ETCS axiomatization of sets looks like this:

  • there are some things called ‘sets’

  • for each set $X$ and set $Y$, there are some things called ‘functions from $X$ to $Y$’

  • for each set $X$, set $Y$ and set $Z$, there is a binary operation assigning to each pair of functions

    $$f\colon X \to Y, \quad g\colon Y \to Z$$

    a function $g \circ f\colon X \to Z$

  • some axioms hold.

You can ask the same kind of logical questions as for ZFC—what exactly is meant by ‘binary operation’? etc.—which again I will not attempt to answer. The difficulties are no worse than for ZFC, and again, in the jargon, ETCS is a first-order theory.

Stated in this way, the ETCS axioms begin by saying that composition is associative and has identities (so that sets, functions and composition of functions define a category); then they say that binary products and equalizers of sets exist, and there is a terminal set (so that the category of sets has finite limits); and so on, until we have said that sets and functions form a well-pointed topos with natural numbers object and Choice. You can do it in about ten axioms.

Here ends the digression.

ZFC axiomatizes sets and membership, whereas ETCS axiomatizes sets and functions. Anything that can be expressed in one language can be expressed in the other: in the usual implementation of ZFC, a function $X \to Y$ is defined as a suitable subset of $X \times Y$, and in ETCS, an element of $X$ is defined as a function from the terminal set to $X$. But an advantage of the categorical approach is that it avoids the chains of elements of elements of elements that are so important in traditional set theory, yet seem so distant from most of mathematics.

ZFC is slightly stronger than ETCS. ‘Stronger’ means that everything that can be deduced about sets from the ETCS axioms can also be deduced in ZFC, but not vice versa. ‘Slightly’ is meant in a sociological sense. I believe it has been said that the mathematics in an ordinary undergraduate syllabus (excluding, naturally, any course in ZFC) makes no more assumptions about sets than are made by ETCS. If that is so, it must also be the case that for many mathematicians, nothing in their entire research career requires more than ETCS.

The technical relationship between ZFC and ETCS is well understood. It is known exactly which fragment of ZFC is equivalent to ETCS (namely, ‘bounded’ or ‘restricted’ Zermelo with Choice; see Mac Lane and Moerdijk (1994)). It is also known what needs to be added to ETCS in order to obtain a system of equal strength to ZFC. This extra ingredient is an axiom scheme (a countably infinite family of axioms) that set theorists in the traditional mould would call Replacement, and category theorists would call a form of cocompleteness. It says, informally, that given any set $I$ and family $(X_{i})_{i \in I}$ of sets specified by a first-order formula, the coproduct $\sum_{i \in I} X_{i}$ exists. The existence of this coproduct is expressed by saying that there exist a set $X$ and a map $p\colon X \to I$ (to be thought of as the projection $\sum_{i \in I} X_{i} \to I$) such that for each $i \in I$, the inverse image $p^{-1}\{i\}$ is isomorphic to $X_{i}$. See Section 8 of McLarty (2004) for details.
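
The way a coproduct is packaged as a single map $p\colon X \to I$ can be sketched for a finite family. This is only an illustration of the shape of the statement, with hypothetical names and a family given explicitly as a dict; for such a family Replacement is not needed.

```python
def coproduct_with_projection(family):
    """Given a family {i: X_i}, return the disjoint union X and p: X -> I.

    Elements are tagged with their index so that the summands stay disjoint.
    """
    X = {(i, x) for i, X_i in family.items() for x in X_i}
    p = {element: element[0] for element in X}
    return X, p

def fibre(p, i):
    """The inverse image p^{-1}{i}."""
    return {element for element in p if p[element] == i}

family = {0: {"a", "b"}, 1: set(), 2: {"a"}}
X, p = coproduct_with_projection(family)

# Forgetting the tag gives a bijection from each fibre p^{-1}{i} to X_i.
print(all({x for (_, x) in fibre(p, i)} == family[i] for i in family))
```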

Topos theory therefore provides a different viewpoint on set theory. Let us take a brief look from this new viewpoint at a famous theorem of set theory: that the Continuum Hypothesis is independent of the usual set-theoretic axioms, as proved by Gödel and Cohen.

Temporarily, let us say that a ‘category of sets’ is a well-pointed topos with natural numbers object and Choice, satisfying the axiom scheme of Replacement. A category of sets is said to satisfy the Continuum Hypothesis if for all objects $X$,

$$\begin{aligned} &\text{there exist monos}\; N \rightarrowtail X \rightarrowtail 2^{N} \\ \implies &X \cong N \;\text{or}\; X \cong 2^{N}. \end{aligned}$$

(As usual, $N$ denotes the natural numbers object; $2$ is the subobject classifier.) Stated categorically, the theorem is this: given any category of sets, you can build one that satisfies the Continuum Hypothesis and one that does not. This is only a rephrasing of the standard statement, but if you are more at home with the term ‘category’ than with ‘model of a first-order theory’, you might find it less mysterious.

So far we have seen the benefits of viewing the/a category of sets as a special topos. But the other way round, there are great benefits to viewing a topos as a generalized category of sets. For example, we might view $\mathbf{Set}^{\mathbb{N}}$ as the category of sets varying through (discrete) time. The set of human beings alive today is an object of $\mathbf{Set}^{\mathbb{N}}$: as the meaning of ‘today’ changes, the set changes. A sheaf can similarly be understood as a set varying through space.

People (especially Lawvere) sometimes refer to the category of sets as the (or a) topos of constant sets, to contrast it with toposes of variable sets. There are also toposes whose objects can informally be thought of as ‘cohesive’ sets, which means the following. In an ordinary set, the points have no relation or attachment to each other: they do not ‘cohere’. But a cohesive set carries something like a topology or smooth structure, so that the points are in some sense stuck together. For example, there are toposes of smooth spaces, which are the setting for synthetic differential geometry. From this point of view, the category of ordinary sets is extreme among all toposes: its objects are sets with no variation or cohesion at all.

Viewing the objects of a topos as generalized sets is much more than a useful mental technique. In fact, it is valid to use set-like language and reasoning in any topos, provided that we stick to certain rules. This language is called the ‘internal language’ of the topos.

Many of the central ideas of topos theory are simple, but that simplicity can easily be obscured by the richness of structure available in a topos. Such is the case for the internal language. I will therefore describe the idea in a much more basic setting.

First let $\mathcal{E}$ be any category whatsoever, and let $X$ be an object of $\mathcal{E}$. A generalized element of $X$ is simply a map in $\mathcal{E}$ with codomain $X$. A generalized element $x\colon S \to X$ may be said to be of shape $S$, or to be an $S$-element of $X$. In the special case that $S$ is terminal, $S$-elements are called global elements. (See Example 9(3) for a hint on the reason for the name.) In the category of sets, the global elements are the ordinary elements, but in other categories, the global elements might be very uninteresting: consider the category of groups, for instance.

Given a map $f\colon X \to Y$ in $\mathcal{E}$, any generalized element $x$ of $X$ gives rise to a generalized element $f x$ of $Y$. This is the composite $f \circ x$, but can also be thought of as ‘$f(x)$’: see the remarks on property 1 at the beginning of this section. For maps $X \underoverset{g}{f}{\rightrightarrows} Y$, we have

$$f = g \iff f x = g x \;\text{for all generalized elements}\; x \;\text{of}\; X.$$

(Proof of $\Leftarrow$: take $x = 1_{X}$.) This is emphatically not true if we replace ‘generalized’ by ‘global’: again, consider groups.
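To make the remark about groups concrete, here is a small worked example, written as a LaTeX sketch. The notation $\mathbf{Grp}$ for the category of groups, the groups $G$ and $H$, and the choice of $\mathbb{Z}$ as a shape are illustrative assumptions rather than notation from the text.

```latex
% Assumes amsmath and amssymb. Grp is the category of groups (illustrative notation).
% Global elements: the terminal group 1 is trivial, so a homomorphism 1 -> G
% can only pick out the identity element of G.
\[
  \mathbf{Grp}(1, G) = \{\, 1 \mapsto e_G \,\}
  \qquad \text{(exactly one global element).}
\]
% Hence any two homomorphisms f, g : G -> H agree on every global element:
\[
  f \circ (1 \to G) = (1 \to H) = g \circ (1 \to G),
  \qquad \text{even when } f \neq g .
\]
% Elements of shape Z: a homomorphism x : Z -> G is determined by x(1),
% so Z-elements of G correspond to ordinary elements of the underlying set |G|.
\[
  \mathbf{Grp}(\mathbb{Z}, G) \cong |G|, \qquad x \mapsto x(1),
\]
% and these do separate maps: f = g if and only if f x = g x for all x : Z -> G.
```

So in the category of groups, global elements see almost nothing, while elements of a single well-chosen shape already detect equality of maps.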

This language of generalized elements is the internal language of the category. It fits well with ordinary categorical terminology and notation. For example, let $\mathcal{E}$ be a category with finite products. In the internal language, the definition of product reads, informally: an $S$-element of $X \times Y$ consists of an $S$-element of $X$ together with an $S$-element of $Y$. Apart from the ‘$S$-’ prefixes, this is identical to the ordinary description of the cartesian product of sets $X$ and $Y$. And in standard categorical notation, the map $S \to X \times Y$ with components $x\colon S \to X$ and $y\colon S \to Y$ is denoted by $(x, y)$, thus extending the set-theoretic notation for a (global) element of a cartesian product.

To see why the internal language is useful, consider, for instance, internal groups in a finite product category $\mathcal{E}$. A group in $\mathcal{E}$ is an object $X$ together with maps

$$m\colon X \times X \to X, \quad i\colon X \to X, \quad e\colon 1 \to X$$

satisfying some axioms. Those axioms are usually expressed as commutative diagrams, which have been obtained by translating the classical axioms into diagrammatic form. But there is no need to translate them: the classical axioms can simply be repeated verbatim and interpreted as statements about generalized elements. This is equivalent. For example, it is easy to show that the commutative diagram for associativity is equivalent to the statement that

$$m(m(x, y), z) = m(x, m(y, z)) \qquad (3)$$

for all generalized elements $x, y, z$ of $X$ of the same shape. (They have to be the same shape in order for expressions such as $(x, y)$ to make sense.) And just as for ordinary elements in $\mathbf{Set}$, there is no harm in writing $x y$ instead of $m(x, y)$, and similarly $x^{-1}$ instead of $i(x)$.

More valuably still, proofs written down in the classical set-theoretic scenario will actually be valid in an arbitrary finite product category $\mathcal{E}$, as long as whatever was said about elements in $\mathbf{Set}$ is also true for generalized elements in $\mathcal{E}$. For example, whenever $X$ is a group in $\mathbf{Set}$ and $x, y, a \in X$, we have

$$x a = y a \implies x = y. \qquad (4)$$

Proof:

$$\begin{aligned} x a = y a & \implies (x a)a^{-1} = (y a)a^{-1} \implies x(a a^{-1}) = y(a a^{-1}) \\ & \implies x e = y e \implies x = y. \end{aligned}$$

We can immediately conclude that the implication (4) holds whenever $X$ is a group in an arbitrary finite product category $\mathcal{E}$ and $x, y, a$ are generalized elements of $X$ of the same shape. Indeed, each step in the proof is an application of an axiom such as (3) valid in the general setting.

The internal language is a massively labour-saving device. To prove that an equation valid in ordinary groups is also valid for internal groups, you merely need to cast an eye over the proof and convince yourself that it holds for generalized elements too. In contrast, try proving the internal version of the equation

$$y^{-1} x^{-1} = (x y)^{-1} \qquad (5)$$

by diagrammatic methods. First it has to be stated diagrammatically. It says that the diagram

$$\begin{matrix} X \times X & \overset{\text{sym}}{\longrightarrow}\; X \times X\; \overset{i \times i}{\longrightarrow} & X \times X \\ \mathllap{m}\downarrow && \downarrow\mathrlap{m} \\ X & \underset{i}{\longrightarrow} & X \end{matrix}$$

commutes. Then it has to be proved, by filling the inside of this diagram with instances of the diagrams encoding the group axioms. (It seems to need at least ten or so inner diagrams.) But once you have an elementwise proof, all this effort is unnecessary. And the example (5) chosen was very simple: for more complex statements, the benefits of the internal language become clearer still.
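For contrast, the elementwise argument for (5) takes only a few lines. The following LaTeX sketch is one standard way to write it; it is not taken from the text, and the convention that $e$ abbreviates the composite $S \to 1 \to X$ for elements of shape $S$ is an assumption made explicit here.

```latex
% Assumes amsmath. x, y are generalized elements of X of the same shape S;
% e abbreviates the composite S -> 1 -> X (an assumption stated for this sketch).
% Step 1: y^{-1} x^{-1} is a right inverse of xy.
\[
\begin{aligned}
  (x y)(y^{-1} x^{-1})
    &= x\bigl((y y^{-1}) x^{-1}\bigr) && \text{associativity (3), twice} \\
    &= x (e\, x^{-1})                 && \text{inverse axiom} \\
    &= x x^{-1} = e                   && \text{unit and inverse axioms.}
\end{aligned}
\]
% Step 2: a right inverse of xy must equal (xy)^{-1}.
\[
\begin{aligned}
  (x y)^{-1}
    &= (x y)^{-1} e
     = (x y)^{-1}\bigl((x y)(y^{-1} x^{-1})\bigr) \\
    &= \bigl((x y)^{-1}(x y)\bigr)(y^{-1} x^{-1})
     = e\,(y^{-1} x^{-1})
     = y^{-1} x^{-1}.
\end{aligned}
\]
```

Each line uses only an axiom of the same kind as (3), so the same lines can be read as a proof for a group in any finite product category.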

The internal language of toposes is similar to that of finite product categories, but much richer. As well as being able to form pairs $(x, y)$ of generalized elements, we can take generalized elements of exponentials $Y^{X}$ (to be thought of as families of maps $X \to Y$), form subobjects such as

$$\{ x \in X \:|\: f x = g x \}$$

(the equalizer of $X \underoverset{g}{f}{\rightrightarrows} Y$), and so on. Almost anything that can be expressed or proved in the category of sets can be reproduced in an arbitrary topos. The only sticking points are the law of the excluded middle and the axiom of choice. Any proof that avoids those (that is, any constructive proof, in a sense that can be made precise) generalizes to an arbitrary topos.

Phrases with more or less the same meaning as ‘internal language’ are ‘Mitchell–Bénabou language’ and ‘internal logic’. See, for instance, Mac Lane and Moerdijk (1994) or Johnstone (2003). There you can also find more spectacular applications of topos theory to set theory, including topics such as forcing.

Toposes and geometry

This section covers concepts such as sheaf, geometric morphism (map of toposes), Grothendieck topos, and locale. But the most important thing I want to explain is how and why geometry has inspired so much of topos theory.

Sheaves

Let $X$ be a topological space. (Following tradition, I will switch from my previous convention of using $X$ to denote an object of a topos.) Write $\mathbf{Open}(X)$ for its poset of open subsets. A presheaf on $X$ is a functor $F\colon \mathbf{Open}(X)^{\mathrm{op}}\to \mathbf{Set}$. It assigns to each open subset $U$ a set $F(U)$, whose elements are called sections over $U$ (for reasons to be explained). It also assigns to each open $V \subseteq U$ a function $F(U) \to F(V)$, called restriction from $U$ to $V$ and denoted by $s \mapsto s\vert_{V}$. I will write $\mathbf{Psh}(X)$ for the category of presheaves on $X$.

Examples

 

  1. Let $F(U) = \{ \text{continuous functions}\; U \to \mathbb{R}\}$; restriction is restriction.

  2. The same, but with ‘bounded’ in place of ‘continuous’.

Examples 1 and 2 are qualitatively different: continuity is a local property, but boundedness is not. This difference can be captured by asking the following question. Let $(U_{i})_{i \in I}$ be a family of open subsets of $X$, and take, for each $i \in I$, a section $s_{i} \in F(U_{i})$. Might there be some $s \in F(\bigcup_{i \in I} U_{i})$ such that $s\vert_{U_{i}} = s_{i}$ for all $i$?

For this to stand a chance of being true, functoriality demands that the sections $s_{i}$ must satisfy a ‘matching condition’: $s_{i}\vert_{U_{i} \cap U_{j}} = s_{j}\vert_{U_{i} \cap U_{j}}$ for all $i$ and $j$. A sheaf is a presheaf such that for every family $(U_{i})_{i \in I}$ of open sets and every matching family $(s_{i})_{i \in I}$, there is a unique $s \in F(\bigcup_{i \in I} U_{i})$ such that $s\vert_{U_{i}} = s_{i}$ for all $i \in I$.

Examples

 

  1. The first example above, with continuous functions, is a sheaf. The proof can be split into two parts. Given $(U_{i})$ and $(s_{i})$, there is certainly a unique function $s\colon \bigcup U_{i} \to \mathbb{R}$ (continuous or not) such that $s\vert_{U_{i}} = s_{i}$ for all $i$. The question now is whether $s$ is continuous; and because continuity is a local property, it is.

  2. The second example above, with bounded functions, is not a sheaf (for a general space $X$). This is because boundedness is not a local property.

  3. The sheaf of continuous real-valued functions is rather floppy, in the sense that there are usually many ways to extend a continuous function from a smaller set to a larger one. Often people consider sheaves made up of holomorphic or rational functions, which are much more rigid: there are typically few or no ways to extend. It is quite normal for there to be no global sections (sections over $X$) at all.

  4. Take any continuous map $Y \stackrel{p}{\to} X$ of topological spaces (which can be thought of as a kind of bundle over $X$). Then there arises a sheaf $F$ on $X$, in which $F(U)$ is the set of continuous maps $s\colon U \to Y$ such that the triangle on the left commutes:

    $$\begin{matrix} && Y \\ & {}^{s}\!\nearrow & \downarrow\mathrlap{p} \\ U & \hookrightarrow & X \end{matrix}$$

    [Figure on the right: a picture with labels $U$, $X$, $p$, $Y$, $s$.]

    Such an …
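To see concretely why the presheaf of bounded functions in Example 2 can fail the sheaf condition, here is a worked counterexample as a LaTeX sketch. The choice $X = \mathbb{R}$ and the cover by intervals $U_n = (-n, n)$ are illustrative assumptions, not taken from the text.

```latex
% Assumes amsmath and amssymb.
% F(U) = { bounded functions U -> R }, with X = R (illustrative choice).
\[
  U_n = (-n, n) \subseteq \mathbb{R}, \qquad
  s_n \in F(U_n), \quad s_n(t) = t \qquad (n = 1, 2, 3, \ldots).
\]
% Each s_n is bounded on U_n, since |s_n(t)| < n there, and the family matches:
\[
  s_n\vert_{U_n \cap U_m} = s_m\vert_{U_n \cap U_m}
  \qquad \text{(both are } t \mapsto t \text{ on } U_{\min(n, m)}\text{).}
\]
% The only function on the union restricting to every s_n is the identity,
\[
  \bigcup_{n \geq 1} U_n = \mathbb{R}, \qquad s(t) = t,
\]
% which is unbounded, so s is not an element of F(R): no gluing exists in F.
```

The same family glues without trouble in the sheaf of continuous functions, which is the contrast between local and non-local properties that the examples describe.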

There is also an abstract categorical explanation of where the concept of sheaf comes from. Fix a space $X$. We have a functor

$$I\colon \mathbf{Open}(X) \to \mathbf{TopSp}/X$$

where $\mathbf{TopSp}$ is the category of topological spaces, $\mathbf{TopSp}/X$ is the slice category, and $I(U) = (U \hookrightarrow X)$. This functor $I$ embodies the simple thought that an open subset of a topological space can be treated as a space in its own right. We now apply to $I$ two very general categorical constructions, from which the sheaf concept will appear automatically.

First, purely because the domain of $I$ is small and the codomain has small colimits, there is an induced adjunction

$$\mathbf{Psh}(X) = \mathbf{Set}^{\mathbf{Open}(X)^{\mathrm{op}}} \underoverset{\mathrm{Hom}(I,-)}{-\otimes I}{\underoverset{\leftarrow }{\rightarrow }{\quad \perp \quad }} \mathbf{TopSp}/X.$$

The right adjoint is given by

$$(\mathrm{Hom}(I, Y))(U) = \mathbf{TopSp}/X \, (I(U), Y)$$

where $Y = \left( \begin{matrix} Y\\ \downarrow \mathrlap{p}\\ X\end{matrix}\;\right) \in \mathbf{TopSp}/X$ and $U \in \mathbf{Open}(X)$. This is, in fact, the process described in Example 9(4): the sheaf $F$ defined there is $\mathrm{Hom}(I, Y)$. The left adjoint can be described as a coend or colimit: for $F \in \mathbf{Psh}(X)$,

$$F \otimes I = \int^{U} F(U) \times I(U) = \Bigl( \bigl( \displaystyle \lim_{\rightarrow U, s}\, U \bigr) \to X \Bigr)$$

where the colimit is over all $U \in \mathbf{Open}(X)$ and $s \in F(U)$, and the map from the colimit to $X$ is the canonical one.

Second, every adjunction restricts canonically to an equivalence between full subcategories: one consists of the objects at which the unit of the adjunction is an isomorphism, and the other of the objects at which the counit is an isomorphism. Write the equivalence obtained from the adjunction above as

$$\mathbf{Sh}(X) \underoverset{\leftarrow }{\rightarrow }{\quad \simeq \quad } \mathbf{Et}(X).$$

It can be shown that this $\mathbf{Sh}(X)$ is the same category of sheaves as before. In this way, the notion of sheaf arises canonically from the very simple functor $I\colon \mathbf{Open}(X) \to \mathbf{TopSp}/X$. The notion of étale bundle also arises canonically: étale bundles over $X$ are (by definition, if you like) the objects of $\mathbf{Et}(X)$. Among other things, this equivalence shows that every sheaf is of the form described in Example 9(4). See Mac Lane and Moerdijk (1994) for details.

One way or another, we have the category $\mathbf{Sh}(X)$ of sheaves on $X$. It is a topos. Its subobject classifier $\Omega$ is given by

$$\Omega(U) = \{ \text{open subsets of } U \}.$$

The crucial fact about $\mathbf{Sh}(X)$ is that, modulo a small lie that I will repair later,

$X$ can be recovered from $\mathbf{Sh}(X)$.

So the class of topological spaces embeds into the class of toposes. We can think of toposes as generalized spaces.

A common technique in topos theory is to take a concept from topology or geometry and extend it to toposes. For example, suppose you hear someone talking about ‘connected toposes’. You may have no idea what one is, but you can bet that the definition has been obtained by determining what property of the topos $\mathbf{Sh}(X)$ corresponds to connectedness of the space $X$, then taking that as the definition of connectedness for all toposes.

The next few subsections are all examples of this generalization process.

Geometric morphisms

So far I have said nothing about maps between toposes. There is an obvious candidate for what a map of toposes should be: a functor preserving finite limits, exponentials, and subobject classifiers. Such a functor is called a logical morphism. They have a part to play, but there is another notion of map of toposes that has been found much more useful. It can be derived by generalizing from topology.

Every map $f\colon X \to Y$ in $\mathbf{TopSp}$ induces an adjunction

$$\mathbf{Sh}(X) \underoverset{f_{*}}{f^{*}}{\underoverset{\rightarrow }{\leftarrow }{\quad \bot \quad }} \mathbf{Sh}(Y). \qquad (6)$$

This is not obvious. The right adjoint $f_{*}$ is easy to construct:

$$(f_{*} F)(V) = F(f^{-1} V)$$

(for $F \in \mathbf{Sh}(X)$ and $V \in \mathbf{Open}(Y)$). The left adjoint $f^{*}$ is harder. It can be made easy by invoking the equivalence between sheaves and étale bundles; but I will not go into that, or give any other description of $f^{*}$.
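As a sanity check on the formula for $f_{*}$, here is a small worked instance as a LaTeX sketch. The choice of $f$ as the unique map from $X$ to the one-point space, written $1$ here, is an illustrative assumption, as is the identification of a sheaf on $1$ with its set of sections over $1$.

```latex
% Assumes amsmath. f : X -> 1 is the unique map to the one-point space (illustrative choice).
% Open(1) has exactly two elements, the empty set and 1 itself, with
\[
  f^{-1}(\emptyset) = \emptyset, \qquad f^{-1}(1) = X .
\]
% So the direct image of a sheaf F on X is determined by
\[
  (f_{*} F)(1) = F(X), \qquad (f_{*} F)(\emptyset) = F(\emptyset) .
\]
% Identifying a sheaf on 1 with its set of sections over 1,
% f_* sends F to its set of global sections:
\[
  f_{*} F \;\longleftrightarrow\; F(X) .
\]
```

In this special case the direct image functor is just ‘take global sections’.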

It is a fact that $f^{*}$ preserves finite limits. It is also a fact (modulo the usual small lie) that there is a natural correspondence between continuous maps $X \to Y$ and adjunctions (6) in which the left adjoint preserves finite limits. So now we know what continuous maps look like in topos-theoretic terms. We duly generalize:

Definition

Let $\mathcal{E}$ and $\mathcal{F}$ be toposes. A geometric morphism $f\colon \mathcal{E}\to \mathcal{F}$ is an adjunction

$$\mathcal{E} \underoverset{f_{*}}{f^{*}}{\underoverset{\rightarrow }{\leftarrow }{\quad \bot \quad }} \mathcal{F}$$

in which the left adjoint $f^{*}$ preserves finite limits. (People often say ‘left exact left adjoint’.) The right adjoint $f_{*}$ is called the direct image part of $f$, and $f^{*}$ is the inverse image part.

I will write $\mathbf{Topos}$ for the category of toposes and geometric morphisms. (Really it’s a 2-category, in an obvious way.) By construction, we have a functor

$$\mathbf{Sh}\colon \mathbf{TopSp}\to \mathbf{Topos}$$

which is (2-categorically) full and faithful, modulo the usual small lie.

Examples

 

  1. Every functor $f\colon \mathbb{C} \to \mathbb{D}$ induces a string of adjoint functors

    $$\widehat{\mathbb{C}} \, \underoverset{\underoverset{f_{*}}{\perp }{\rightarrow }}{ \underoverset{\perp }{f_{!}}{\rightarrow }}{ \longleftarrow {\scriptsize f^{*}} - } \, \widehat{\mathbb{D}}$$

    between presheaf categories. Here $f^{*} = -\circ f$, and $f_{!}$ and $f_{*}$ are left and right Kan extension along $f$, respectively. Since $f^{*}$ has a left adjoint, it preserves limits. Hence $(f^{*}, f_{*})$ is a geometric morphism $\widehat{\mathbb{C}} \to \widehat{\mathbb{D}}$.

  2. It turns out that, for any topological space $X$, the inclusion $\mathbf{Sh}(X) \hookrightarrow \mathbf{Psh}(X)$ has a finite-limit-preserving left adjoint. It is called sheafification or the associated sheaf functor. So the inclusion of sheaves into presheaves is a geometric morphism.

    Since $\mathbf{Sh}(X)$ is a full subcategory, the inclusion is full and faithful; and for totally general reasons, this is equivalent to the counit of the adjunction being an isomorphism. In other words, sheafifying a sheaf does not change it.

Points

Let us generalize another concept of topology. The points of a topological space $X$ correspond to the maps $1 \to X$ (where $1$ is the one-point space), which correspond to the geometric morphisms $\mathbf{Sh}(1) \to \mathbf{Sh}(X)$. But $\mathbf{Sh}(1) = \mathbf{Psh}(1) = \mathbf{Set}$, so we make the following definition.

Definition

A point of a topos $\mathcal{E}$ is a geometric morphism $\mathbf{Set}\to \mathcal{E}$.

Embeddings and Grothendieck toposes

For any subspace $Y$ of a space $X$, the inclusion $Y \hookrightarrow X$ is an embedding, that is, a homeomorphism to its image. It can be shown that a map $f\colon Y \to X$ of spaces is an embedding if and only if the direct image part $f_{*}$ of the corresponding geometric morphism $f\colon \mathbf{Sh}(Y) \to \mathbf{Sh}(X)$ is full and faithful. So, as usual, we generalize:

Definition

A geometric morphism $f\colon \mathcal{F}\to \mathcal{E}$ is an embedding (or inclusion) if the direct image functor $f_{*}$ is full and faithful.

We then say that $\mathcal{F}$ is a subtopos of $\mathcal{E}$. At least, this is the right thing to say up to equivalence. Perhaps we should reserve that word for when $\mathcal{F}$ is actually a (full) subcategory of $\mathcal{E}$ and $f_{*}$ is the inclusion $\mathcal{F}\hookrightarrow \mathcal{E}$, rather than allowing $f_{*}$ to be any old full and faithful functor. But a full and faithful functor induces an equivalence to its image, so it makes no real difference.

Probably the easiest toposes are the presheaf toposes: those equivalent to $\widehat{\mathbb{C}} = \mathbf{Set}^{\mathbb{C}^{\mathrm{op}}}$ for some small category $\mathbb{C}$. So maybe subtoposes of presheaf toposes are relatively easy too. They have a special name:

Definition

A topos is Grothendieck if it is (equivalent to) a subtopos of some presheaf topos.

For instance, we saw in Example 11(2) that $\mathbf{Sh}(X)$ is a subtopos of $\mathbf{Psh}(X) = \widehat{\mathbf{Open}(X)}$, for any topological space $X$. Hence $\mathbf{Sh}(X)$ is a Grothendieck topos.

Being Grothendieck is generally thought of as a mild condition on a topos. A Grothendieck topos has all small limits, which immediately disqualifies toposes such as $\mathbf{FinSet}$, $\mathbf{FinSet}^{\mathbb{C}^{\mathrm{op}}}$, etc. But other than toposes arising from finite sets (or sets subject to some other cardinality bound), most of the toposes that people have worked with are Grothendieck. A notable exception is the effective topos, the maps in which can be thought of as computable functions. Other non-Grothendieck toposes occur in the topos-theoretic approach to non-standard analysis.

There is a theorem of Giraud giving a list of conditions on a category equivalent to it being a Grothendieck topos. It includes non-elementary axioms such as ‘there is a small generating set’. (‘Non-elementary’ means that it refers to a pre-existing notion of set.) The Grothendieck toposes are sometimes regarded as the nice toposes, but perhaps the definition of Grothendieck topos is not as nice as the definition of elementary topos.

Definition 14 is not the definition of Grothendieck topos that you will find in most books. I will now give a brief indication of what the standard definition is and why it is equivalent to the one above.

Fix a small category $\mathbb{C}$. There is a one-to-one correspondence between the subtoposes of $\widehat{\mathbb{C}}$ and the Grothendieck topologies on $\mathbb{C}$. A Grothendieck topology is a kind of explicit, combinatorial structure; it specifies which diagrams

[Figure: maps from objects $\ldots, c_{i}, c_{j}, \ldots$ into a common object $c$.]

in $\mathbb{C}$ are to be thought of as ‘covering families’ and which are not. (There are axioms.) The motivating example is that given a topological space $X$, there is a canonical Grothendieck topology on $\mathbf{Open}(X)$: a family $(U_{i} \hookrightarrow U)_{i \in I}$ of subsets of $U \in \mathbf{Open}(X)$ is covering if and only if $U = \bigcup_{i \in I} U_{i}$.

The bijection

$$\{ \text{Grothendieck topologies on}\; \mathbb{C} \} \cong \{ \text{subtoposes of}\; \widehat{\mathbb{C}} \}$$

is written

$$J \leftrightarrow \mathbf{Sh}(\mathbb{C}, J).$$

A pair $(\mathbb{C}, J)$, consisting of a small category $\mathbb{C}$ equipped with a Grothendieck topology $J$, is called a site, and $\mathbf{Sh}(\mathbb{C}, J)$ is the category of sheaves on that site. For example, let $X$ be a topological space, take $\mathbb{C} = \mathbf{Open}(X)$, and take $J$ to be the Grothendieck topology mentioned above; then $\mathbf{Sh}(\mathbb{C}, J) = \mathbf{Sh}(X)$. Most books proceed as follows: define Grothendieck topology, define site, define the category of sheaves on a site, then define a Grothendieck topos to be a category equivalent to the category of sheaves on some site.

I do not know a short way to explain why the subtoposes of $\widehat{\mathbb{C}}$ correspond to the Grothendieck topologies on $\mathbb{C}$. The following two paragraphs may make it seem easier, or harder.

First, there is an explicit classification of the subtoposes of any topos $\mathcal{E}$. Indeed, it can be shown that the subtoposes of $\mathcal{E}$ correspond to the maps $j\colon \Omega \to \Omega$ satisfying certain equations. (Such a $j$ is called a Lawvere–Tierney topology on $\mathcal{E}$, although this is so distant from the original usage of the word ‘topology’ that some people object; Peter Johnstone, for instance, uses local operator instead.) By definition of subobject classifier, it is equivalent to say that a subtopos of $\mathcal{E}$ amounts to a subobject of $\Omega$ satisfying certain axioms.

Second, take $\mathcal{E}= \widehat{\mathbb{C}}$. We know (Example 6(4)) that $\Omega \in \widehat{\mathbb{C}}$ is given by $\Omega(c) = \{ \text{sieves on}\; c \}$. Hence a subtopos of $\widehat{\mathbb{C}}$ corresponds to a collection of sieves in $\mathbb{C}$, satisfying certain axioms. Calling these the ‘covering sieves’ gives the notion of Grothendieck topology.

Locales

Here I will explain the ‘small lie’ mentioned several times above, and make amends. I will also explain why topos theorists are fond of jokes about pointless topology.

The definition of sheaf on a topological space $X$ does not mention the points of $X$. It mentions only the open sets and inclusions between them, and uses the fact that it is possible to take arbitrary unions and finite intersections of open sets. Having observed this, you can see why the space $X$ cannot always be recovered from the topos $\mathbf{Sh}(X)$. For instance, if $X$ is indiscrete (has no open sets except $\emptyset$ and $X$) and nonempty, then $\mathbf{Sh}(X)$ is the same no matter how many points $X$ has.
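The indiscrete example can be unwound by hand. The following LaTeX sketch does so under the definition of sheaf given earlier, reading the empty family as a cover of $\emptyset$; that reading is an assumption made for this illustration.

```latex
% Assumes amsmath. X is a nonempty indiscrete space, so
\[
  \mathbf{Open}(X) = \{\, \emptyset \subseteq X \,\} .
\]
% The empty family of open sets has union the empty set and exactly one matching
% family of sections (the empty one), so the sheaf condition forces
\[
  F(\emptyset) \cong 1 \quad \text{(a one-element set).}
\]
% Any other family of opens either contains X, and then has union X, or consists
% only of copies of the empty set. Neither kind imposes a further condition,
% so F(X) can be any set whatsoever:
\[
  \mathbf{Sh}(X) \simeq \mathbf{Set}, \qquad F \mapsto F(X),
\]
% with no reference to how many points X has.
```

So every nonempty indiscrete space has the same topos of sheaves as the one-point space.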

The idea now is to split the process $X \mapsto \mathbf{Sh}(X)$ into two steps. First, we forget the points of $X$, leaving just the set of open sets, ordered by inclusion. Then, we form the category of ‘sheaves’ on that ordered set (defined as for topological spaces, almost verbatim).

Definition

A frame is a partially ordered set such that every subset has a join (= least upper bound = sup), every finite subset has a meet (= greatest lower bound = inf), and finite meets distribute over joins. A map of frames is a map preserving order, joins and finite meets.
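Written out, the distributivity clause is a single equation. The LaTeX sketch below states it and then specializes it to the open sets of a topological space; the symbols $a$, $b_i$, $U$, $V_i$ are illustrative names, not from the text.

```latex
% Assumes amsmath and amssymb. a and (b_i) are elements of a frame; I is any index set.
\[
  a \wedge \bigvee_{i \in I} b_i \;=\; \bigvee_{i \in I} (a \wedge b_i) .
\]
% In the poset of open subsets of a space, joins are unions and finite meets
% are intersections, so the law becomes the familiar set identity
\[
  U \cap \bigcup_{i \in I} V_i \;=\; \bigcup_{i \in I} (U \cap V_i) .
\]
% Finiteness matters on the meet side: an infinite intersection of open sets
% need not be open, for example
\[
  \bigcap_{n \geq 1} \Bigl( -\tfrac{1}{n}, \tfrac{1}{n} \Bigr) = \{ 0 \} \subseteq \mathbb{R} .
\]
```

This asymmetry between arbitrary joins and finite meets mirrors the axioms for open sets: arbitrary unions, finite intersections.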

A topological space $X$ has a frame $\mathbf{Open}(X)$ of open subsets, and a continuous map $f\colon X \to Y$ induces a map $f^{-1}\colon \mathbf{Open}(Y) \to \mathbf{Open}(X)$ of frames. This gives a functor

$$\mathbf{Open}\colon \mathbf{TopSp}\to \mathbf{Frame}^{\mathrm{op}}.$$

We now perform a linguistic manoeuvre. $\mathbf{Frame}^{\mathrm{op}}$ is the desired category of ‘pointless spaces’. But we cannot wholeheartedly say that a frame is a pointless space, because the maps of frames are the wrong way round. So we introduce a new word, locale, and define the category $\mathbf{Loc}$ of locales by $\mathbf{Loc}= \mathbf{Frame}^{\mathrm{op}}$. We can wholeheartedly say that a locale is a pointless space.

There is a functor $\mathbf{Sh}\colon \mathbf{Loc}\to \mathbf{Topos}$, defined just as for topological spaces except that unions become joins and intersections become meets. The functor $\mathbf{Sh}\colon \mathbf{TopSp}\to \mathbf{Topos}$ factorizes as

$$\mathbf{TopSp}\stackrel{\mathbf{Open}}{\to } \mathbf{Loc}\stackrel{\mathbf{Sh}}{\to } \mathbf{Topos}.$$

This is the two-step process mentioned above.

Whenever I have said ‘modulo a small lie’, you can interpret that as ‘use locales instead of topological spaces’. For example, Sh:LocTopos\mathbf{Sh}\colon \mathbf{Loc}\to \mathbf{Topos} really is full and faithful, in a suitably up-to-isomorphism sense: locale maps XYX \to Y correspond one-to-one with isomorphism classes of geometric morphisms Sh(X)Sh(Y)\mathbf{Sh}(X) \to \mathbf{Sh}(Y). This means that Loc\mathbf{Loc} is equivalent to a full subcategory of Topos\mathbf{Topos}. (Actually it is an equivalence of 2-categories, but I will gloss over that point.)

Every locale gives rise to a topos, but the converse is also true. Given a topos $\mathcal{E}$, the subobjects of $1$ form a poset $\mathrm{Sub}_{\mathcal{E}}(1)$. Assuming that $\mathcal{E}$ has enough colimits, $\mathrm{Sub}_{\mathcal{E}}(1)$ is a frame. This process defines a functor

$$\begin{array}{ccc} \mathbf{Topos}&\to &\mathbf{Loc}\\ \mathcal{E}&\mapsto &\mathrm{Sub}_{\mathcal{E}}(1). \end{array}$$

I am now quietly changing $\mathbf{Topos}$ to mean the toposes with small colimits; this includes all Grothendieck toposes.

You might think that $1$ could have no interesting subobjects, since that is the case in the most obvious topos, $\mathbf{Set}$. But there are toposes that are nearly as obvious in which $\mathrm{Sub}_{\mathcal{E}}(1)$ is not trivial. For instance, take $\mathcal{E}= \mathbf{Set}^{I}$ for any set $I$: then $\mathrm{Sub}_{\mathcal{E}}(1)$ is the power set of $I$.
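The power set example can be checked by hand for a small index set. Here is a minimal sketch, assuming that an object of $\mathbf{Set}^{I}$ is encoded as a dictionary from indices to sets and that the terminal object has the one-point set `{"*"}` at every index; the function names are illustrative only.

```python
from itertools import product

POINT = "*"  # the single element of the one-point set


def subobjects_of_terminal(index_set):
    """All subobjects of 1 in Set^I: for each index, a subset of the one-point set."""
    index = sorted(index_set)
    choices = [frozenset(), frozenset({POINT})]
    return [dict(zip(index, family))
            for family in product(choices, repeat=len(index))]


def to_subset(family):
    """The subset of I consisting of the indices where the family is inhabited."""
    return frozenset(i for i, component in family.items() if component)


def leq(s, t):
    """Componentwise inclusion: the order on subobjects of 1."""
    return all(s[i] <= t[i] for i in s)


def power_set(index_set):
    """All subsets of the index set, as frozensets."""
    items = sorted(index_set)
    return {frozenset(i for i, keep in zip(items, mask) if keep)
            for mask in product([False, True], repeat=len(items))}
```

For a three-element index set this produces eight subobjects, and `to_subset` matches them with the eight subsets of the index set in an order-preserving way.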

Now a wonderful thing is true. The functor just defined is left adjoint to the inclusion $\mathbf{Sh}\colon \mathbf{Loc}\hookrightarrow \mathbf{Topos}$. This means that $\mathbf{Loc}$ is (equivalent to) a reflective subcategory of $\mathbf{Topos}$. Hence the counit is an isomorphism:

$$X \cong \mathrm{Sub}_{\mathbf{Sh}(X)}(1)$$

for any locale $X$. This is how you recover a locale from its topos of sheaves.

So $\mathbf{Loc}$ sits inside $\mathbf{Topos}$ as a subcategory of the best kind: full and reflective, like abelian groups in groups. It is reasonable to say that a locale is a special sort of topos. More formally, a topos is localic if it is of the form $\mathbf{Sh}(X)$ for some locale $X$. Localic toposes are easy to work with; if you were having trouble proving something for arbitrary toposes, you might start by trying to prove it in this special case.

Since every locale is of the form $\mathrm{Sub}_{\mathcal{E}}(1)$ for some topos $\mathcal{E}$, locale theory can be regarded as the fragment of topos theory concerning subobjects of $1$. A subobject of $1$ is a map $1 \to \Omega$, which can reasonably be called a truth value. In that sense, locale theory is the study of truth values.

The notion of locale can also be seen as a decategorification of the notion of Grothendieck topos. A poset $P$ is a category enriched in the two-element totally ordered set $2$. There is a Yoneda embedding $P \to 2^{P^{\mathrm{op}}}$, which has a finite-meet-preserving left adjoint if and only if $P$ is a frame. Analogously, it is almost true that for a category $\mathcal{E}$, the Yoneda embedding $\mathcal{E}\to \mathbf{Set}^{\mathcal{E}^{\mathrm{op}}}$ has a finite-limit-preserving left adjoint if and only if $\mathcal{E}$ is a Grothendieck topos. (This result is due to Street (1981). ‘Almost’ refers to a set-theoretic size condition.) A map of frames is a function preserving joins and finite meets, and the inverse image part of a geometric morphism is a functor preserving colimits and finite limits. Thus, locales play roughly the same role among 2-enriched categories as Grothendieck toposes play among $\mathbf{Set}$-enriched categories.
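The poset half of this analogy can be tested on small examples. The sketch below assumes a finite lattice, so that all joins exist and 'frame' amounts to 'distributive lattice'. It identifies $2^{P^{\mathrm{op}}}$ with the down-closed subsets of $P$, takes the left adjoint of the Yoneda embedding to be the join, and checks whether that join preserves finite meets. The class `FinitePoset` and both example lattices are illustrative choices, not taken from the text.

```python
from itertools import combinations


class FinitePoset:
    """A finite poset given by its elements and an order relation leq(x, y)."""

    def __init__(self, elements, leq):
        self.elements = list(elements)
        self.leq = leq

    def down(self, x):
        """Yoneda embedding: send x to its principal down-set."""
        return frozenset(y for y in self.elements if self.leq(y, x))

    def down_sets(self):
        """All down-closed subsets, playing the role of 2^(P^op)."""
        found = []
        for size in range(len(self.elements) + 1):
            for subset in combinations(self.elements, size):
                s = frozenset(subset)
                if all(self.down(x) <= s for x in s):
                    found.append(s)
        return found

    def sup(self, subset):
        """Least upper bound of a subset (None if there is none)."""
        uppers = [u for u in self.elements if all(self.leq(x, u) for x in subset)]
        least = [u for u in uppers if all(self.leq(u, v) for v in uppers)]
        return least[0] if least else None

    def inf(self, subset):
        """Greatest lower bound of a subset (None if there is none)."""
        lowers = [w for w in self.elements if all(self.leq(w, x) for x in subset)]
        greatest = [w for w in lowers if all(self.leq(m, w) for m in lowers)]
        return greatest[0] if greatest else None


def sup_is_left_adjoint(P):
    """Adjunction: sup(D) <= x exactly when D is contained in down(x)."""
    return all(P.leq(P.sup(D), x) == (D <= P.down(x))
               for D in P.down_sets() for x in P.elements)


def sup_preserves_finite_meets(P):
    """Does sup send the top down-set to the top element and
    intersections of down-sets to binary meets?"""
    ds = P.down_sets()
    top_ok = P.sup(frozenset(P.elements)) == P.inf(frozenset())
    binary_ok = all(P.sup(D & E) == P.inf({P.sup(D), P.sup(E)})
                    for D in ds for E in ds)
    return top_ok and binary_ok


# Divisors of 12 under divisibility: a distributive lattice, hence a frame.
divisors_of_12 = FinitePoset([1, 2, 3, 4, 6, 12], lambda x, y: y % x == 0)

# The 'diamond': bottom "0", three incomparable atoms, top "1". Not distributive.
diamond = FinitePoset(["0", "a", "b", "c", "1"],
                      lambda x, y: x == y or x == "0" or y == "1")
```

In both lattices the join is left adjoint to the embedding, but only for the divisors of 12 does it preserve finite meets. In the diamond, the down-sets generated by two atoms and by the third atom intersect in the bottom element alone, while the meet of their joins is the third atom.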

How much has been lost by passing from topological spaces to locales? In most people’s view, not much. For example, we observed that all nonempty indiscrete spaces give rise to the same locale; but many mathematicians regard indiscrete spaces with $\geq 2$ points as ‘pathological’ and would be positively happy to see them go.

In fact, some things are gained. For example, a subgroup of a topological group need not be closed, and non-closed subgroups are often regarded as pathological (since the corresponding quotients are non-Hausdorff). But it is a theorem that every subgroup of a localic group is closed. See for instance Section C5.3 of Johnstone (2003).

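The functor $\mathbf{Open}\colon \mathbf{TopSp}\to \mathbf{Loc}$ has a right adjoint, which I will not describe. As mentioned above, every adjunction restricts canonically to an equivalence between full subcategories. In this case, this gives an equivalence between the full subcategory of $\mathbf{TopSp}$ consisting of the sober spaces and the full subcategory of $\mathbf{Loc}$ consisting of the spatial locales (those with enough points).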

Another way of interpreting the phrase ‘modulo a small lie’ is ‘true for sober spaces’. Sobriety amounts to a rather mild separation condition. For example, every Hausdorff space is sober. So in passing from a Hausdorff space to a locale, or to a topos, nothing whatsoever is lost.

There is a kind of attitudinal paradox here. Many algebraic topologists think only about Hausdorff spaces, and regard non-Hausdorff spaces as pathological. But these are often the same people who feel strongly that topological spaces are not really about open sets; they think in terms of points and paths and homotopies. So it is perhaps paradoxical that the Hausdorff condition guarantees that a space can be understood in terms of its open sets alone: the topos of sheaves depends on nothing else, and contains all the information about the original space.

Toposes and universal algebra

The point of this section is to explain what people mean when they talk about the classifying topos of a theory. Another way to look at it is this: I will explain how toposes can be viewed as cousins of operads and Lawvere theories.

In classical universal algebra, an algebraic theory (or strictly, a presentation of an algebraic theory) consists of a bunch of operation symbols of specified arities, together with a bunch of equations. To take the standard example, the (usual presentation of the) theory of groups consists of

  • an operation symbol $1$ of arity $0$

  • an operation symbol $(\:\:)^{-1}$ of arity $1$

  • an operation symbol $\cdot$ of arity $2$

together with the usual equations. You can speak of ‘models’ of an algebraic theory in any category $\mathcal{E}$ with finite products. In our example, they are the internal groups in $\mathcal{E}$.

But there are other ways of looking at such theories.

Consider the free finite product category $\mathcal{T}$ equipped with an internal group. (There are general reasons why such a thing must exist.) Its universal property is that for any finite product category $\mathcal{E}$, the finite-product-preserving functors $\mathcal{T}\to \mathcal{E}$ correspond to the internal groups in $\mathcal{E}$.

Concretely, $\mathcal{T}$ looks something like this. It must contain an object $X$, the underlying object of the internal group. Since $\mathcal{T}$ has finite products, it must also contain an object $X^{n}$ for each $n \in \mathbb{N}$. There is no reason for it to have any other objects, and since it is free, it does not. A map $X^{n} \to X^{m}$ is (by definition of product) an $m$-tuple of maps $X^{n} \to X$; and the maps $X^{n} \to X$ are (by freeness) whatever maps $G^{n} \to G$ must exist for any internal group $G$ in any finite product category. That is, they are the $n$-ary operations in the theory of groups: the words in $n$ letters.
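To see how a word in $n$ letters gives an $n$-ary operation on every group at once, here is a minimal sketch for groups in $\mathbf{Set}$. It rests on assumptions of my own choosing: a word is a list of `(letter_index, exponent)` pairs with exponent `+1` or `-1`, and a group is a record holding a unit, a multiplication and an inversion. The names `Group`, `evaluate` and `evaluate_tuple` are hypothetical.

```python
from collections import namedtuple

# A group in Set: a unit element, a binary multiplication, a unary inverse.
Group = namedtuple("Group", ["unit", "mul", "inv"])


def compose(p, q):
    """Composite of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def evaluate(word, group, args):
    """Interpret a word in n letters as an n-ary operation G^n -> G."""
    result = group.unit
    for letter, exponent in word:
        g = args[letter] if exponent == 1 else group.inv(args[letter])
        result = group.mul(result, g)
    return result


def evaluate_tuple(words, group, args):
    """A map X^n -> X^m is an m-tuple of words, evaluated componentwise."""
    return tuple(evaluate(w, group, args) for w in words)


# Two example groups: permutations of three points, and the integers under addition.
S3 = Group(unit=(0, 1, 2), mul=compose, inv=inverse)
Z = Group(unit=0, mul=lambda a, b: a + b, inv=lambda a: -a)

# The commutator word x0 x1 x0^{-1} x1^{-1}, a word in two letters.
commutator = [(0, 1), (1, 1), (0, -1), (1, -1)]
swap01, swap12 = (1, 0, 2), (0, 2, 1)
```

The same word `commutator` is interpreted in both groups: it is trivial on every pair of integers, since addition is commutative, but sends the two transpositions `swap01` and `swap12` to a 3-cycle.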

This category $\mathcal{T}$ is called the Lawvere theory of groups. The same goes for rings, lattices, etc. In all these cases, $\mathcal{T}$ is a finite product category with the further property that the objects are in bijection with the natural numbers, the product of objects corresponding to addition of numbers. This further property holds because the theories described so far have been single-sorted: a model is a single object equipped with some structure.

But there are also many-sorted theories, such as the two-sorted theory of pairs $(R, M)$ in which $R$ is a ring and $M$ an $R$-module. So we can widen the notion of algebraic theory to include all (small) finite product categories. Some people say that an algebraic theory is just a finite product category. Others say that algebraic theories correspond to finite product categories. Others still, more traditionally, say that algebraic theories correspond to only certain finite product categories.

Terminology aside, we can play the same game for other classes of limit. For example, it makes no sense to talk about internal categories in an arbitrary finite product category, because the definition of internal category needs pullbacks. (Composition in an internal category $\mathbb{C}$ is a map $\mathbb{C}_{1} \times _{\mathbb{C}_{0}} \mathbb{C}_{1} \to \mathbb{C}_{1}$.) But we can talk about internal categories in a finite limit category; and as before, there is a free finite limit category $\mathcal{T}$ equipped with an internal category. This means that for any finite limit category $\mathcal{E}$, the finite-limit-preserving functors $\mathcal{T}\to \mathcal{E}$ correspond to the internal categories in $\mathcal{E}$. A small finite limit category is called (or corresponds to) an essentially algebraic theory.

In a category with finite products you can talk about internal groups but not, in general, internal categories. In a category with finite limits you can talk about both. By extending the list of properties that the category is assumed to satisfy, you can accommodate more and more sophisticated kinds of theory. (The theory of internal categories is more ‘sophisticated’ than that of groups in the sense that composition is only defined for some pairs of maps, whereas classical universal algebra can only handle operations defined on all pairs.) The properties need not be of the form ‘limits of such-and-such a type exist’. For example, it is sometimes useful to assume epi-mono factorization, as we shall see.

There is a trade-off here. As you allow more sophisticated language, you widen the class of theories that can be expressed, but you narrow the class of categories in which it makes sense to take models. (You also make more work for yourself.) In the same way, if you trade in your motorbike for a double-decker bus, you increase the number of passengers you can carry, but you restrict where you can carry them: no low bridges or tight alleyways. (You also increase your fuel costs.) It is sensible, then, to use the smallest class of theories containing the ones you are interested in. For example, you could treat groups as an essentially algebraic theory, but that would mean you could only take models in categories with all finite limits, when in fact just products would do.

Before I get onto toposes, I want to point out a slightly different direction that you can take things in. Rather than just altering the properties that the categories are assumed to have, you can also alter the structure with which they are equipped.

Take monoidal categories, for instance. We can speak of internal monoids in any monoidal category. Hence, the theory of monoids can be regarded as the free monoidal category containing an internal monoid. (This is in fact the category of finite ordinals.) Similarly, it makes sense to speak of algebras for an operad $P$ in any monoidal category, and we can associate to $P$ the free monoidal category $\mathcal{T}$ containing a $P$-algebra. Thus, for any monoidal category $\mathcal{E}$, monoidal functors $\mathcal{T}\to \mathcal{E}$ correspond to $P$-algebras in $\mathcal{E}$.

We might define a monoidal theory to be a small monoidal category. This gets us into the territory of PROPs, where there are nontrivial theorems such as the classification of 2-dimensional topological quantum field theories: the symmetric monoidal theory of (or, ‘PROP for’) commutative Frobenius algebras is the category of smooth 1-manifolds and diffeomorphism classes of cobordisms.

All of this is to give an impression of how far-reaching these ideas are. It is a sketch of the context in which classifying toposes can be understood.

You will have guessed that the same kind of thing can be said for toposes as for categories with finite products, finite limits, etc. Since toposes have very rich structure (much more than just finite limits), they correspond to a very wide class of theories indeed.

An example of the kind of theory that can be interpreted in a topos is the theory of fields. (This is rather a feeble example, but I want to keep it simple.) A field is, of course, a commutative ring $R$ satisfying the axioms

$$0 \neq 1 \tag{7}$$

and

$$\forall x \in R, \quad x = 0 \;\text{or}\; \exists y\colon x y = 1. \tag{8}$$

By a mechanical process, this definition can be turned into a definition of ‘internal field in a topos’. As compensation for the imprecision of the rest of this section, I will give the definition in detail; but if you want to skip it, the point to retain is that it is a mechanical process.

Let $\mathcal{E}$ be a topos. We certainly know how to define ‘commutative ring in $\mathcal{E}$’: that makes sense in any category with finite products. Let $R$ be a commutative ring in $\mathcal{E}$. The nontriviality axiom, $0 \neq 1$, is expressed by saying that the equalizer of

$$1 \underoverset{\quad 1 \quad }{0}{\rightrightarrows } R$$

is the initial object $0$. For the other axiom, let us first define the subobject $U \rightarrowtail R$ consisting of the units (invertible elements). The ‘set’ $P = \{ (x, y) \:|\:x y = 1 \}$ is the pullback

$$\begin{array}{ccc} P & \longrightarrow & 1 \\ \downarrow & & \downarrow{\scriptstyle 1} \\ R \times R & \longrightarrow & R. \end{array}$$

Now we want to define the ‘set’ $U$ of units as the image of the composite map

$$f = \left ( P \rightarrowtail R \times R \stackrel{\mathrm{pr}_{1}}{\to } R \right ).$$

We can talk about images in a topos, since every map in a topos factorizes essentially uniquely as an epi followed by a mono. So, define $U \rightarrowtail R$ by the factorization

$$f = \left ( P \twoheadrightarrow U \rightarrowtail R \right ).$$

The second field axiom states that every element of $R$ lies in either the subobject $1 \stackrel{0}{\rightarrowtail } R$ or the subobject $U \rightarrowtail R$. In other words, it states that the map

$$1 + U \to R$$

is epi. Here we have used the fact that every topos has coproducts, written $+$.
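In the topos $\mathbf{Set}$ the recipe can be followed literally. The sketch below does so for the rings $\mathbb{Z}/n$, which is an illustrative choice of mine: it forms the equalizer for axiom (7), the pullback $P$, its image $U$, and then tests whether $1 + U \to R$ is surjective. The function name is hypothetical.

```python
def zn_is_internal_field(n):
    """Check axioms (7) and (8) for the commutative ring Z/n (n >= 1),
    following the categorical description step by step in Set."""
    R = range(n)
    zero, one = 0 % n, 1 % n

    # Axiom (7): the equalizer of 0, 1 : 1 -> R must be the empty set.
    equalizer = [point for point in ["*"] if zero == one]
    nontrivial = len(equalizer) == 0

    # P is the pullback of multiplication R x R -> R along 1 : 1 -> R.
    P = [(x, y) for x in R for y in R if (x * y) % n == one]

    # U is the image of the composite P -> R x R -> R (first projection).
    U = {x for (x, y) in P}

    # Axiom (8): the map 1 + U -> R, given by 0 and the inclusion of U,
    # must be epi, which in Set means surjective.
    covers = ({zero} | U) == set(R)

    return nontrivial and covers
```

For $n = 6$ the units are $1$ and $5$, so $1 + U \to R$ misses $2$, $3$ and $4$ and axiom (8) fails; for $n = 1$ the equalizer is the whole of $1$ and axiom (7) fails.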

If you have read Section 2, you will recognize that the informal talk of ‘sets’ (really, objects of $\mathcal{E}$) and the use of set-theoretic notation $\{ \ldots \:|\:\ldots \}$ are something to do with the internal language of a topos. This gives a hint of how the process can be mechanized.

(There are actually several possible theories of fields, depending on exactly how you write down the axioms. They all have the same models in $\mathbf{Set}$, namely fields, but they do not have the same models in other toposes. For example, a genuinely different theory is obtained by changing axiom (8) to ‘$\forall x \in R$, $(\not \!\exists y: x y = 1) \implies x = 0$’. But this does not affect the main point: given a list of formally-expressed axioms such as (7) and (8), there is an automatic process converting it into a definition that makes sense in an arbitrary topos.)

You now have the choice between a short story and a long story.

The short story is that what we did for finite product and finite limit categories can also be done for toposes. The theories corresponding to toposes are called the geometric theories, and the topos corresponding to a particular geometric theory is called its classifying topos.

The long story is longer because there are two different notions of map of toposes—and you need to decide what a map of toposes is in order to state the universal property of the topos resulting from a theory.

The more obvious but less used notion of map of toposes is a functor preserving all the structure in sight: finite limits, exponentials, and the subobject classifier. These are called logical morphisms. Now in a topos, you can interpret a really vast range of theories: any ‘higher-order theory’, in fact. (First order means that you can only quantify over elements of a set; in a second order theory you can also quantify over subsets of a set; and so on.) Models of any such theory get along well with logical morphisms, because logical morphisms preserve everything. So you can tell a similar story for toposes, logical morphisms and higher order theories as for finite product categories, finite-product-preserving functors and algebraic theories.

The more popular notion of map of toposes is that of geometric morphism. (Here it helps to have read Section 3, where the definition is motivated.) A geometric morphism between toposes is a functor with a finite-limit-preserving left adjoint. The corresponding theories are the geometric theories. I will not give the definition, but it is not too bad an approximation to say that they are the same as the first-order theories: every geometric theory is first-order, and almost every first-order theory that one encounters is geometric.

Given a geometric theory, a classifying topos for the theory is a cocomplete topos $\mathcal{T}$ with the property that for any cocomplete topos $\mathcal{E}$, models of the theory in $\mathcal{E}$ correspond naturally to geometric morphisms $\mathcal{E}\to \mathcal{T}$. Every geometric theory has a classifying topos.

There are two surprises here. One is the appearance of the word ‘cocomplete’, which I will not explain and will not bother inserting below. It is generally thought of as a mild condition (satisfied by any Grothendieck topos, for instance).

The bigger surprise is the reversal of direction. The previous cases lead us to expect models in $\mathcal{E}$ to correspond to maps $\mathcal{T}\to \mathcal{E}$. However, since a geometric morphism is a pair of adjoint functors, the choice of direction is a matter of convention. As the name suggests, the choice that society made was motivated by geometry. Perhaps if the motivation had been universal algebra, it would have been the other way round. (This is an aspect of the thought that geometry is dual to algebra.) A map of toposes would then have been a finite-limit-preserving functor with a right adjoint, which is more or less the same thing as a functor preserving finite limits and small colimits.

If a topos is thought of as a generalized space (as in Section 3) then the classifying topos of a theory can be thought of as its space of models. Indeed, a point of the classifying topos $\mathcal{T}$ is (by Definition 12) a geometric morphism $\mathbf{Set}\to \mathcal{T}$, which is exactly a model of the theory in $\mathbf{Set}$. Some familiar topological spaces can be construed as classifying toposes. For example, there is a ‘theory of Dedekind cuts’ whose classifying topos is $\mathbf{Sh}(\mathbb{R})$, that is, $\mathbb{R}$ regarded as a topos.

Given how much structure a topos contains, it is surprising how many classifying toposes can be described simply. I will now describe the classifying topos of any algebraic theory, by the venerable expository device of doing it just for groups.

We will need the notion of finite presentability. A group (in $\mathbf{Set}$) is finitely presentable if it admits a presentation by a finite set of generators subject to a finite set of relations. The category of finitely presentable groups and all homomorphisms between them will be written $\mathbf{Grp}_{\mathrm{fp}}$.

Aside

Finite presentability is a more categorical concept than it might seem. Writing $T\colon \mathbf{Set}\to \mathbf{Set}$ for the free group monad, a relation (equation) in a set $X$ of generators is an element of $T X \times T X$. So, a family $(r_{i})_{i \in I}$ of relations is a map $I \to T X \times T X$, or equivalently a diagram

$$I \rightrightarrows T X$$

in $\mathbf{Set}$, or equivalently a diagram

$$F I \rightrightarrows F X$$

in $\mathbf{Grp}$, where $F\colon \mathbf{Set}\to \mathbf{Grp}$ is the free group functor. The group presented by these generators and relations is the coequalizer of this diagram in $\mathbf{Grp}$. Hence a group is finitely presentable precisely when it is the coequalizer of some diagram $F I \rightrightarrows F X$ in which $I$ and $X$ are finite sets.

This formulation of finite presentability in $\mathbf{Grp}$ uses the free group functor $F$. But in fact, there is a general definition of finite presentability of an object of any category. I will not go into this.

As promised, the classifying topos for groups is easy to describe:

Theorem

The classifying topos for groups is $\mathbf{Set}^{\mathbf{Grp}_{\mathrm{fp}}}$.

In other words, for any topos $\mathcal{E}$, a group in $\mathcal{E}$ is the same thing as a geometric morphism $\mathcal{E}\to \mathbf{Set}^{\mathbf{Grp}_{\mathrm{fp}}}$.

The same goes for other algebraic theories. This yields something interesting even for very trivial theories. Take the theory of objects, whose models in a category $\mathcal{E}$ are simply objects of $\mathcal{E}$. A finitely presentable set is just a finite set. Hence for any topos $\mathcal{E}$, objects of $\mathcal{E}$ correspond to geometric morphisms $\mathcal{E}\to \mathbf{Set}^{\mathbf{FinSet}}$. The topos $\mathbf{Set}^{\mathbf{FinSet}}$ is therefore called the object classifier.

We have been asking, for a given theory, ‘what topos classifies it?’ But we can turn the question round and ask, for a given topos $\mathcal{T}$, ‘what does $\mathcal{T}$ classify?’ In other words, what are the geometric morphisms from an arbitrary topos $\mathcal{E}$ into $\mathcal{T}$? It is a fact that every topos $\mathcal{T}$ is the classifying topos of some geometric theory, although given how wide a class of theories that is, perhaps this does not say very much.

There are clean answers to this reversed question for many toposes $\mathcal{T}$. In particular, this is so when $\mathcal{T}$ is the topos $\mathbf{Sh}(\mathbb{C}, J)$ of sheaves on a site (Section 3). Here I will just tell you the answer for a smaller class of toposes.

Theorem

Let $\mathbb{C}$ be a category with finite limits. Then the presheaf topos $\widehat{\mathbb{C}}$ classifies finite-limit-preserving functors out of $\mathbb{C}$.

In other words, for any topos $\mathcal{E}$, a geometric morphism $\mathcal{E}\to \widehat{\mathbb{C}}$ is the same thing as a finite-limit-preserving functor $\mathbb{C} \to \mathcal{E}$.

(If you know about flat functors, you can drop the assumption that $\mathbb{C}$ has finite limits: for any small category $\mathbb{C}$, the presheaf topos $\widehat{\mathbb{C}}$ classifies flat functors out of $\mathbb{C}$. This is one version of Diaconescu's Theorem.)

So there is a back-and-forth translation between geometric theories and the toposes that classify them. In many cases, this translation is surprisingly straightforward.

References


2011 2012-08-23T19:05:54Z 2011-11-02T09:25:27Z tag:ncatlab.org,2011-11-02:Publications,2011 Urs Schreiber

2011, 2012


Publications of the nLab, Volume 1 (2011)

Contents

  • vol. 1, no. 1: Tom Leinster, An informal introduction to topos theory

    article type: expository article

    submission type: author submission of arXiv article

    submitted: Jan, 2011

    refereeing: by expert anonymous referee chosen by the nLab steering committee

    status: accepted for publication, 27th June, 2011

    referee reports: see here

Andrew Stacey 2011-11-02T17:33:24Z 2011-11-02T17:33:24Z tag:ncatlab.org,2011-11-02:Publications,Andrew+Stacey Andrew Stacey

About Me

I am a førsteamanuensis at the Department of Mathematics at the Norwegian University of Science and Technology, otherwise known as NTNU.

My Role on the n-Lab

One of my roles here is as the local sysadmin. Therefore if someone has any technical questions or problems on the n-lab, it’s a good idea to ask them somewhere that I’ll see them. The best place is at the n-Forum (which I also maintain) since if I’m online I’m probably keeping an eye on the forum, and also there are other people equally (or better) able to answer technical questions and they tend to keep an eye on the forum as well.

My nLab page is Andrew Stacey.

Other Information

My professional homepage is http://www.math.ntnu.no/~stacey, which has details of my research, seminars, teaching, and other vaguely work-related matters.

SVG uparrowtail 2011-11-02T14:08:37Z 2011-11-02T14:08:37Z tag:ncatlab.org,2011-11-02:Publications,SVG+uparrowtail Andrew Stacey
SVG uparrow 2011-11-02T14:08:25Z 2011-11-02T14:08:25Z tag:ncatlab.org,2011-11-02:Publications,SVG+uparrow Andrew Stacey
SVG twoheaduparrow 2011-11-02T14:08:13Z 2011-11-02T14:08:13Z tag:ncatlab.org,2011-11-02:Publications,SVG+twoheaduparrow Andrew Stacey
SVG twoheadswarrow 2011-11-02T14:08:00Z 2011-11-02T14:08:00Z tag:ncatlab.org,2011-11-02:Publications,SVG+twoheadswarrow Andrew Stacey
SVG twoheadsearrow 2011-11-02T14:07:47Z 2011-11-02T14:07:47Z tag:ncatlab.org,2011-11-02:Publications,SVG+twoheadsearrow Andrew Stacey