In material set theory, there is an intuitive conception of what a set is, which may be stated informally as follows: a set is a collection of sets. Actually, it is possible to have urelements in a material set theory (such as ZFA), although the most common axiom systems do not allow this; in any case we can say that a pure set is a collection of pure sets.
The primary motivation for structural set theory is that this conception of a set is not needed in ordinary mathematics; it is sufficient to characterise the category of sets (although a structural set theory can also be described in ways other than category-theoretic). However, material set theory is itself part of mathematics, and we may want to describe the material notion of a pure set in structural terms.
There is also a very practical point to this exercise: the translation between material and structural set theories. Any model of a material set theory is already a model of a corresponding structural set theory, but we go through the yoga below to construct a model of a material set theory out of a model of a structural set theory. In particular, we must do this to prove that two set theories (one material and one structural) are equiconsistent.
Taking a structural notion of set as granted, we wish to define a pure set to be a set of pure sets. On the face of it, this is a circular definition, but like many such definitions, it can be made precise in (at least) two ways: recursively and corecursively.
The ‘recursive’ meaning is that all pure sets must be constructed, starting from nothing, as sets of other previously constructed pure sets. This results in the well-founded pure sets. Thus, at first, the only well-founded set possible is the empty set $\empty = \{ \}$, the set of no pure sets. Once you have that, you can form $\star = \{\empty\}$, then $\{\star\}$ and $\{\empty,\star\}$, and so on. In this way we can obtain at least all hereditarily finite sets; if we use an axiom of infinity as well, we can jump to the countably infinite set of all hereditarily finite sets and continue from there.
The ‘corecursive’ meaning is that any pure set can be deconstructed into a set of other pure sets, and every possible such deconstruction defines a unique pure set. In this way we obtain not just well-founded pure sets but also ill-founded pure sets. (The inclusion of well-founded sets among the ill-founded sets is an example of the red herring principle.) Possible examples of non-well-founded sets include a set $\bullet$ such that $\bullet = \{\bullet\}$ (a suggestive model for the point), or sets $A$ and $B$ such that $A = \{B\}$ and $B = \{\empty, A\}$.
In addition, the meaning of this definition changes according to whether our set-theoretic foundation is material or structural.
If the foundation is ‘material’ or ‘membership-based,’ such as ZFC, then the elements of a set can, in fact, be other sets. Therefore, in this case we are defining (either recursively or corecursively) an adjective “pure” that can be applied to the noun “set:” a set is pure if at no point in its construction or deconstruction into elements do we encounter anything that is not a pure set.
In the most common material set theories, such as ZFC, all sets are pure, since the only ‘things’ the theory deals with (hence the only things that can be elements of sets) are sets. However, there are easy modifications of these theories that allow ‘atoms’ or urelements that are not sets, and in this case the pure sets will be those that ‘hereditarily’ contain no atoms. Many common material set theories (starting with von Neumann 1925, Zermelo 1930) also include an axiom of foundation asserting that all (pure) sets are well-founded; the dual axiom of anti-foundation (due to Aczel) allows and ‘tames’ the ill-founded sets.
If the foundation is ‘structural’ or ‘categorial’, such as ETCS or SEAR, then the elements of a set cannot be other sets. Thus, in this case we are defining a single noun “pure set,” with no a priori relation to the structural notion of “set” that occurs in the foundational theory. A pure set, according to this definition, is a set equipped with structure assigning to each of its elements another pure set (interpreted either recursively or corecursively).
From this point of view, the definition of pure set provides a construction of a model of material set theory within a model of structural set theory: a global relation of $\in$ can be defined between pure sets which will satisfy the axioms of some material set theory. (Of course, the sets in a model of material set theory always model a structural set theory, so this is the ‘difficult’ direction of the equivalence between the two types of set theory.) Whether we use the recursive or corecursive definition determines whether the resulting material set theory will satisfy the axiom of foundation or the axiom of anti-foundation. Note that using these two choices, the composite operation “material $\to$ structural $\to$ material” reduces to the standard proofs of the relative consistency of the axioms of foundation and anti-foundation, since the passage from material to structural requires neither one.
In material set theory without urelements, every set is a pure set. If there are urelements, so that not all sets are pure, then it is easy to define the class of pure sets as follows:
A set $x$ is pure if given any sequence $x_n \in x_{n-1} \in \dots \in x_1 \in x_0 = x$, all of the $x_i$ are sets.
Whether this produces the well-founded pure sets or the ill-founded ones depends on whether the axiom of foundation is satisfied in the ambient set theory.
In a structural set theory like ETCS or SEAR, we can model a pure set by a graph describing its hereditary membership relation.
One way to state the basic theoretical idea is that the class of well-founded sets is the initial algebra of the covariant power set functor, while the class of ill-founded sets is the terminal coalgebra of the same functor. Of course, neither of these algebras exists as a set, since this would violate Cantor's theorem, but we can still describe what their elements would be like. We can also define these algebras as discrete large categories, or as proper classes in a structural set-class theory such as algebraic set theory.
The discussion which follows is phrased informally, like most mathematics. However, it is purely structural and can be interpreted in any structural set theory. For instance, the definition below of a “graph” as a set with a binary relation should be formalized as a set $G$ together with an injection $R \hookrightarrow N\times N$.
To describe a pure set, we must give all the elements of that set, each of which is a pure set and thus must have its own elements, which are also pure sets, and so on. A convenient way to “picture” a pure set is with a graph.
For the purposes of this page, a graph will mean a set $G$ of nodes equipped with a binary relation $\to$ on the nodes. A node $i$ is called a child of a node $j$ if $i \to j$.
The idea of a graph-picture of pure sets is that the nodes represent pure sets, and we have $i\to j$ precisely when $i\in j$. Clearly the membership relations between any collection of pure sets should be representable by a graph. To what extent the converse holds, i.e. which graphs can be interpreted as membership-relations between pure sets, depends on whether the pure sets in question are well-founded or ill-founded. For purposes of drawing well-founded pure sets, only well-founded graphs should be used, whereas arbitrary graphs can be used to draw ill-founded sets. (In fact, one form of the axiom of anti-foundation is that any graph is the picture of some pure sets.)
A set $S$ of nodes in a graph is $\to$-inductive if whenever all children of some node $i$ are in $S$, then $i$ itself is in $S$. A graph is well-founded if the only $\to$-inductive set of nodes is the set of all nodes.
Assuming the principle of excluded middle, well-foundedness is equivalent to saying that the tree has no infinite paths.
If we want to use a graph to describe a specific pure set, then we should single out one particular node as representing that set. Moreover, it makes sense to demand that the graph contain no “superfluous” data other than what is necessary to describe the hereditary membership structure of the particular set in question.
A graph is pointed if it is equipped with a specified node $\top$ called the root. A pointed graph is accessible if for every node $x$, there exists a path $x = x_0 \to x_1 \to \dots \to x_n =\top$ to the root; this is equivalent to saying that $\top$ is a top element for the reflexive-transitive closure of $\to$. An accessible pointed graph is abbreviated APG.
An APG gives all the data necessary to characterize a pure set $X$, and the pure set in question is well-founded iff the APG is well-founded. We can think of the root as representing $X$ itself, the root's children as the elements of $X$, the next level as those elements' elements, and so on.
If we want to think of these elements as pure sets themselves, then we can “externalize” them into separate APGs. Specifically, for any node $z$ in a graph $G$, the subgraph consisting of all those nodes which admit some path to $z$ is an APG, which describes the pure set represented by $z$; we write it as $G/z$ and call it the full subgraph of $G$ rooted at $z$. If $G$ is an APG and $z$ is a child of the root, we say that $G/z$ is an immediate subgraph of $G$. The immediate subgraphs of $G$ represent the pure sets that are the “elements” of the pure set represented by $G$. Thus, the APG $G$ pictures the hierarchy of elements of $X$, where each immediate subgraph $G/z$ corresponds to its root $z$, and $\to$ is interpreted as membership.
Note that the nodes in most of the pictures below have labels, but these are for illustration only; they are not part of the structure defining the tree.
First, $\empty = \{ \}$ consists of nothing but a root:
The set $\star = \{\empty\}$ has the empty set attached to the root (which we draw at the top):
Notice that $\empty$ is an immediate subgraph, so we can write $\empty \in \star$.
Then the two most obvious ways of representing the ordinal $2$ as a pure set, namely $2_Z = \{\star\}$ and $2_N = \{\empty, \star\}$, can be drawn as follows:
Looking at immediate subtrees, $\star \in 2_Z$, $\empty \in 2_N$, and $\star \in 2_N$, but $\empty \notin 2_Z$.
$2_N$ can also be represented as $\{\star, \empty\}$:
Of course, as a graph this is isomorphic to $\{\empty, \star\}$, and hence should represent the same pure set.
These sets are all well-founded. A non-well-founded set can be represented either by an infinite picture, or by a picture with loops. Here is an infinite picture of $\bullet = \{\bullet\}$:
and here is a finite picture:
In both cases, the unique immediate subgraphs verify that $\bullet \in \bullet$.
APGs are sufficient to describe all pure sets, but multiple APGs, even non-isomorphic ones, can represent the “same” pure set. Firstly, we have so far not eliminated the possibility of duplicate branches. For instance, the pure set $2_Z$ can also be represented as $\{\star, \star\}$:
It is easy to eliminate APGs with such “obviously redundant” branches.
An APG $G$ is rigid if whenever $x$ and $y$ are two children of the same node $z$ such that $G/x \cong G/y$, then in fact $x=y$.
Assuming excluded middle, rigidity is equivalent to saying that any graph automorphism is the identity function. Note also that if two rigid APGs are isomorphic, they must be uniquely isomorphic. Thus, for rigid APGs, being isomorphic is a property, rather than structure.
Even among rigid APGs, however, there are are ambiguities. For example, the pure set $2_N$ can also be drawn by the APG
The issue here is that the “same” pure set can occur in more than one place in the membership hierarchy of a pure set, and we have to choose whether or not to identify them in the graph. There are three solutions to this problem:
These are of course related; the first two can be seen as specifying a particular isomorphism class within each of the equivalence classes defined by the third. In particular, they all result in equivalent notions of pure set, when they all apply. (However, in predicative set theories, it seems that only the first method is applicable.) We will consider them all in turn.
A natural way to ensure that the same node is never “shared” between two occurrences is to require that our APGs be trees.
An pointed graph is a tree if every node admits a unique path to the root. Equivalently, $G$ is a tree if its root $\top$ is a terminal object of the quiver generated by the relation $\to$.
A tree may or may not be well-founded. Note, though, that in a tree, the opposite relation $\leftarrow$ is automatically well-founded: each node has a unique natural number height, namely the length of its unique path to the root, so well-foundedness of $\mathbb{N}$ implies well-foundedness of $\leftarrow$.
It now makes sense to define a pure set to be a rigid accessible pointed tree. The idea is that two such trees represent the same pure set iff they are isomorphic. We define “membership” $G\in H$ between such trees to mean that $G$ is isomorphic to some immediate subtree of $H$ (that is, a tree $H/z$ where $z$ is a child of the root of $H$).
For example, the tree representation of $2_N$ is the one we gave first:
and the tree representation of $\bullet = \{\bullet\}$ is the infinite path:
Of course, a well-founded infinite set also has an infinite picture, but there are no infinite paths. Here is $\omega_N$, the set of von Neumann natural numbers:
So we have $\empty \in \omega_N$, $\star \in \omega_N$, $2_N \in \omega$, etc.
A tree models a hereditarily finite set if every node has finitely many children; a tree models a well-founded hereditarily finite set if and only if it is finite and $\to$ is decidable. The relationship between these two facts and the no-infinite-path formulation of well-foundedness is a form of Kőnig's Lemma?.
Given these definitions, one can prove the various axioms of material set theory. Specifically, if ETCS holds for the structural sets (of nodes, etc) used in this definition, then $\mathbf{BZC}^-$ (which is ZFC with only bounded separation and no axioms of replacement or foundation) holds for rigid trees, while $\mathbf{BZC}$ ($\mathbf{BZC}^-$ together with foundation) holds for well-founded rigid trees. Similar results hold for constructive weakenings of ETCS and $\mathbf{BZC}^-$, since these definitions involve only function-sets and not power-sets. Likewise, we can strengthen both sides with axioms of replacement, collection, or separation, or by adding Grothendieck universes.
The argument for the axiom of extensionality is perhaps the most interesting; it goes as follows. If each immediate subtree $G/x$ of $G$ is isomorphic to an immediate subtree $H/y$ of $H$ and conversely, then $y$ and $x$ are mutually uniquely determined by rigidity, so we have a bijection between the children of the roots of $G$ and of $H$. Since the isomorphisms $G/x \cong H/y$ are also unique, again by rigidity, and each $G/x$ is disjoint from $G/x'$ for $x\neq x'$, by the tree property, we can piece together these isomorphisms to define an isomorphism $G\cong H$. (In particular, the uniqueness of all these isomorphisms means that the axiom of choice is not needed.)
(Note, though, that this is only the “weak” version of extensionality. This is sufficient for well-founded sets, but we may need to strengthen rigidity somehow to obtain stronger notions of extensionality, which are generally more appropriate for ill-founded sets. More precisely, the well-founded rigid trees model the usual axioms of extensionality and foundation, while arbitrary rigid trees model not Aczel’s axiom of anti-foundation AFA, but its “Scott” variant, SAFA.)
The “tree” representation can be thought of as an unwrapped picture; we now consider a wrapped picture in which there is no duplication at all. The natural way to ensure this is to require our graphs to be extensional. For well-founded graphs, this is easy: a well-founded graph is extensional if for any nodes $x$ and $y$ such that for all $z$, we have $z\to x$ iff $z\to y$, then necessarily $x=y$. For non-well-founded graphs, we need the notion of “strong extensionality” defined as follows.
If $G$ and $H$ are graphs, a bisimulation from $G$ to $H$ is a binary relation $\sim$ from $G$ to $H$ such that for any $x\in G$ and $y\in H$ with $x\sim y$, then:
A graph $G$ is extensional if the largest bisimulation from $G$ to $G$ is the identity.
The idea is that a bisimulation identifies pairs of nodes which “represent the same pure set.” One can show that a well-founded graph is extensional in this sense iff it is extensional in the weaker sense given above. Note also that any extensional APG is necessarily rigid, since any isomorphism $G/x \cong G/y$ induces a bisimulation.
Thus, it also makes sense to define a pure set to be an extensional APG. Two such APGs represent the same set iff they are isomorphic, and we define membership in the same way: $G\in H$ if $G\cong H/x$ for some child $x$ of the root of $H$. Of course, one can again prove the axioms of material set theory from this definition. If we restrict to well-founded extensional APGs, we get the axiom of foundation, while if we allow all extensional APGs, we get Aczel’s axiom af anti-foundation (along with its strong version of the axiom of extensionality). Note that this definition is not predicative, since the notion of extensionality involves quantification over bisimulations, which are relations, i.e. subsets. Thus, it requires at least limited separation.
For example, the extensional representation of $2_N$ is this one:
while the extensional representation of $\bullet = \{\bullet \}$ is the loop:
Finally, here is the set $\omega_N$ of von Neumann natural numbers:
Again we have $\empty \in \omega_N$, $\star \in \omega_N$, $2_N \in \omega$, etc.
Now an extensional accessible graph models a hereditarily finite set iff the graph itself is finite and $\to$ is decidable.
Finally, we consider the third solution to the ambiguity problem, and its relation to the first two: defining a looser notion of equivalence between APGs. Recall the notion of bisimulation. We say that two APGs $G$ and $H$ are equivalent if there is a bisimulation $\sim$ from $G$ to $H$ such that $\top\sim\top$. Note that if $\sim$ is a bisimulation from $G$ to $H$ such that $x\sim y$, then the APGs $G/x$ and $H/y$ are equivalent; this justifies the interpretation of a bisimulation.
We can now define a pure set to be an APG, with equality represented by equivalence in the above sense. Likewise, we define “membership” $G\in H$ to mean that $G$ is equivalent to some immediate subgraph of $H$. One can again prove the various axioms of material set theory with these definitions (either using well-founded APGs or arbitrary ones). As with extensional APGs, in this case we obtain the strong notion of extensionality (which is equivalent to the weak one for well-founded sets). This definition is also not predicative.
The relationship of this approach to the previous ones is as follows.
Every APG is equivalent to an extensional one. Namely, let $\approx$ be the maximal bisimulation on $G$ (the union of all bisimulations); it is an equivalence relation, and its quotient $G/\approx$ inherits the structure of an extensional APG which is equivalent to $G$. This is the extensional quotient; see extensional relation.
Two extensional APGs are equivalent iff they are isomorphic. For if $\sim$ is a bisimulation from $G$ to $H$, then defining $x\approx y$ in $G$ if there exists a node $z$ in $H$ with $x\sim z$ and $y\sim z$ gives a bisimulation on $G$, which must be the identity, and similarly for $H$; hence $\sim$ must in fact be a bijection. Thus, the notion of “pure set” obtained from “extensional APGs and isomorphism” is the same as that obtained from “arbitrary APGs and equivalence.”
Every APG is equivalent to one that is a tree, its “unwrapping”. The nodes of the tree can be taken to be the finite paths $x_n \to \dots \to x_0 = \top$ ending at the root, with any “one-level extension” $x_{n+1} \to x_n \to \dots \to \top$ being a child of $x_n \to \dots\to \top$. The bisimulation relates each path $x_n \to \dots\to \top$ to its initial node $x_n$.
Moreover, every APG is equivalent to a rigid accessible pointed tree. The key here is to first take the extensional quotient, then perform the above “unwrapping” construction: the unwrapping of any extensional graph will always be rigid.
Two well-founded rigid accessible pointed trees are equivalent iff they are isomorphic. For if $\sim$ is a bisimulation from $G$ to $H$, define a new relation $\simeq$ from $G$ to $H$ such that $x\simeq y$ iff there are paths from $x$ and $y$ to the corresponding roots which are pointwise related by $\sim$, i.e. we have $x = x_n \to x_{n-1} \to \cdots \to x_1 \to x_0 = \top$ and $y = y_n \to y_{n-1} \to \cdots \to y_1 \to y_0 = \top$, with $x_i \sim y_i$. We can then prove by well-founded induction that whenever $x\simeq y$, then $\simeq$ is a bijection between the children of $x$ and of $y$. For assume that all the children of $x$ and $y$ have this property. Then if we have $x' \to x$ and $x''\to x$ with $x' \sim y$ and $x'' \sim y$, then by hypothesis $\sim$ is a bijection between $G/x'$ and $H/y$, and between $G/x''$ and $H/y$, hence we have $G/x' \cong G/x''$; thus by rigidity $x'=x''$. Therefore, $\simeq$ is a bijection between the children of $x$ and the children of $y$. The construction of $\simeq$ then ensures that for any child $x'\to x$, $\simeq$ relates each node in $X/x'$ to a node in at most one $Y/y'$; hence $\simeq$ is in fact a bijection from $X/x$ to $Y/y$. Thus, by induction, $\simeq$ is a bijection.
In order to extend this last result to non-well-founded trees, we would need to use a stronger notion of rigidity. (This makes sense since SAFA is incompatible with AFA. We could expect to be able to use “Scott-extensionality” in place of extensionality to obtain a theory equivalent to that of rigid trees.) For example, the following two ill-founded trees are equivalent but not isomorphic, yet both are rigid.
In conclusion, there are two extreme ways to represent a pure set by an isomorphism class of graphs, but any accessible pointed graph will represent a pure set one way or another. For well-founded sets, the two extreme ways are equivalent, but for ill-founded sets the choice between them can make a difference (and in fact, there are a number of other possible variants, corresponding to the spectrum of anti-foundation axioms discussed by Aczel).
Note that the version on the left is the unwrapping of the loop; we would like a stronger version of rigidity that would rule out the version on the right. As it is, the best that we have come up with is simply: isomorphic to the unwrapping of some extensional APG. Technically this works, but it is not very satifsying.