elementary embedding



In model theory, models of a first-order theory TT are certain kinds of functors from the syntactic site of TT to Set\mathbf{Set}. Elementary embeddings are natural transformations between these functors.


In model theory, an elementary embedding between structures (over a given signature σ\sigma) is an injection that preserves and reflects all of first-order logic over σ\sigma. That is, it is an injection f:MNf\colon M\to N such that for any first-order formula φ\varphi written in the language of σ\sigma and parameters a 1,,a nMa_1,\dots,a_n\in M (of appropriate types), we have

Mφ(a 1,,a n)Nφ(f(a 1),,f(a n)). M \vDash \varphi(a_1,\dots,a_n) \;\iff\; N \vDash \varphi(f(a_1),\dots, f(a_n)).

In particular, this implies that if either MM or NN is a model of some first-order theory, then so is the other.

Note that the condition that ff be injective is automatic as long as the logic in question includes equality, since reflecting of the formula x=yx=y implies that ff is injective. If ff is (interpreted as) the inclusion of a subset, we say that MM is an elementary substructure of NN. We also say here that NN is an elementary extension of MM.

More generally, when we consider structures in a category as in categorical logic, a morphism f:MNf\colon M\to N between structures in CC is an elementary embedding if for any formula φ\varphi, the following square is a pullback:

[φ] M [φ] N MA 1××MA n NA 1××NA n \array{[\varphi]_M & \overset{}{\to} & [\varphi]_N\\ \downarrow && \downarrow\\ M A_1 \times\dots \times M A_n & \underset{}{\to} & N A_1 \times \dots \times N A_n}

where MA iM A_i denotes the object of CC interpreting the type A iA_i, and [φ] M[\varphi]_M denotes the corresponding subobject in CC interpreting the truth value of the formula φ\varphi. Note that for an arbitrary morphism of structures, this square need not even commute; one sometimes says that ff is an elementary morphism if it does.

Urs Schreiber: let me try to say this more explicitly, to check if I am following:

The theory TT that we are modelling is exhibited by its syntactic category Syn(T)Syn(T) with finite limits. A model of TT in a category CC with limits – equivalently a TT-structure in CC – is a finite-limit preserving functor N:Syn(T)CN : Syn(T) \to C. A morphism f:MN:Syn(T)Cf : M \to N : Syn(T) \to C of models is a natural transformation between such functors. We say that such a natural transformation is an elementary embedding if its naturality squares on certain morphisms of Syn(T)Syn(T) are pullback squares.

Mike Shulman: Not quite. First of all, the definition officially happens at the more general level of structures rather than models, but we can of course consider those as models for the empty theory. And whether we need finite-limit categories and functors, or something else like regular ones, geometric ones, or Heyting ones, depends on what fragment of logic we consider our (possibly empty) theories as living in. Your rephrasing is correct if we mean finitary first-order theories and therefore Heyting categories and Heyting functors. Otherwise, the syntactic category Syn(T)Syn(T) won’t have the structure required to construct [φ][\varphi], and the structure wouldn’t be preserved by the functors into CC, so that we wouldn’t even have naturality squares to ask to commute (I alluded to this in the last sentence above).

I think I didn’t explain this very well, but I have to go now, I’ll try to come back to it later and rewrite it to make more sense.

In practice, it is also useful to separate out the weaker notion of embedding.


Working over a fixed signature Σ\Sigma, an injective homomorphism of structures f:MNf: M \to N is an embedding if for each relation RΣR \in \Sigma of arity nn, we have R M=(f n) *R NR_M = (f^n)^\ast R_N, i.e., R MR_M is obtained by pulling back the inclusion R NN nR_N \hookrightarrow N^n along f n:M nN nf^n: M^n \to N^n.

Notice this is stronger than just being an injective homomorphism (where we demand only that R M(f n) *R nR_M \hookrightarrow (f^n)^\ast R^n, or that R M(a 1,,a n)R N(f(a 1),,f(a n))R_M(a_1, \ldots, a_n) \vdash R_N(f(a_1), \ldots, f(a_n))), and it is weaker than being an elementary embedding since here we are not demanding a pullback condition over all the formulas of the language, just the atomic ones (including equality, which is handled by injectivity as we noted earlier).

A substructure is an embedding f:MNf: M \to N where ff is a set-theoretic inclusion. (In structural set theory, a substructure would just be an isomorphism class of embeddings.)

Tarski-Vaught test

A simple criterion for when one structure is an elementary substructure of another is given by the Tarski-Vaught test. Let LL be the language (the set of first-order formulas) over some given signature.


A substructure i:MNi: M \hookrightarrow N is an elementary substructure if, whenever φ\varphi is an (n+1)(n+1)-ary formula and b 1,,b nMb_1, \ldots, b_n \in M, there exists aNa \in N such that φ(a,b 1,,b n)\varphi(a, b_1, \ldots, b_n) is satisfied in NN only if there exists cMc \in M such that φ(c,b 1,,b n)\varphi(c, b_1, \ldots, b_n) is satisfied in MM.


By induction on the structure of formulas φ\varphi, using ¬,,\neg, \wedge, \exists as primitive logical operators. The required pullback condition is satisfied on atomic formulas, by definition of substructure.

If the pullback condition is satisfied for φ\varphi (of arity nn say), then it is trivially satisfied for ¬φ\neg \varphi because for all a¯=(a 1,,a n)M n\bar{a} = (a_1, \ldots, a_n) \in M^n we have

M¬φ(a¯)iff¬[Mφ(a¯)]iff¬[Nφ(i n(a¯))]iffN¬φ(i n(a¯)).M \models \neg \varphi(\bar{a}) \;\; iff \;\; \not [M \models \varphi(\bar{a})] \;\; iff \;\; \not [N \models \varphi(i^n(\bar{a}))] \;\; iff \;\; N \models \neg\varphi(i^n(\bar{a})).

If the pullback condition is satisfied for φ\varphi and ψ\psi, then it is equally trivially satisfied for φψ\varphi \wedge \psi:

M(φψ)(a¯) iff Mφ(a¯)andMψ(a¯) iff Nφ(i n(a¯))andNψ(i n(a¯)) iff N(φψ)(i n(a¯)).\array{ M \models (\varphi \wedge \psi)(\bar{a}) & iff & M \models \varphi(\bar{a}) \; and \; M \models \psi(\bar{a}) \\ & iff & N \models \varphi(i^n(\bar{a})) \; and \; N \models \psi(i^n(\bar{a})) \\ & iff & N \models (\varphi \wedge \psi)(i^n(\bar{a})). }

Using these two steps of the induction, we may say that given a substructure, the pullback condition is satisfied for all quantifier-free formulas in the language.

Finally, suppose the pullback condition is satisfied for (n+1)(n+1)-ary formulas φ(v,w¯)\varphi(v, \bar{w}). If b¯( vφ) M\bar{b} \in (\exists_v \varphi)_M, then there exists cMc \in M such that (c,b¯)φ M(c, \bar{b}) \in \varphi_M, whence there exists a=i(c)a = i(c) such that (i(c),i n(b¯))φ N(i(c), i^n(\bar{b})) \in \varphi_N, so i n(b¯)( vφ) Ni^n(\bar{b}) \in (\exists_v \varphi)_N, just using the fact that ii is a homomorphism. Conversely: if i n(b¯)( vφ) Ni^n(\bar{b}) \in (\exists_v \varphi)_N, i.e., if there exists aNa \in N such that (a,i n(b¯))φ N(a, i^n(\bar{b})) \in \varphi_N, then by hypothesis there exists cMc \in M such that (c,b¯)φ M(c, \bar{b}) \in \varphi_M, i.e., b¯( vφ) M\bar{b} \in (\exists_v \varphi)_M. This completes the inductive proof.


In the theory of real closed fields with signature (0,1,+,,)(0, 1, +, \cdot, \leq), the field of real algebraic numbers is an elementary substructure of the field of real numbers. This follows from the Tarski-Vaught test and the Tarski-Seidenberg theorem which establishes quantifier elimination over the language generated by the signature (every formula has the same extension as some quantifier-free formula).

Another application is described at Löwenheim-Skolem theorem.



Let TT be a first-order theory, and M,N:Syn(T)SetM, N : \mathsf{Syn}(T) \to \mathbf{Set} two models of TT.

Let (Syn(T)) 0(\mathsf{Syn}(T))_0 denote the objects (definable sets of TT) of Syn(T)\mathsf{Syn}(T).

A (Syn(T)) 0(\mathsf{Syn}(T))_0-indexed collection of maps f=df(M(X)f XN(X)) Xf \overset{df}{=} \left(M(X) \overset{f_X}{\to} N(X) \right)_X is a natural transformation if and only if it preserves and reflects first-order logic.


Suppose ff is a natural transformation. Then certainly whenever XX is a definable set, xM(X)f(x)N(X).x \in M(X) \implies f(x) \in N(X).

If f(x)N(X)f(x) \in N(X) but xx is not in M(X)M(X), then xM(¬X)x \in M(\neg X), which implies f(x)N(¬X)f(x) \in N(\neg X). Since the functor NN preserves disjoint unions, N(¬X)N(X)N(\neg X) \cap N(X) is empty, which is impossible. So whenever f(x)N(X)f(x) \in N(X), xM(X)x \in M(X).

Now suppose ff preserves and reflects first-order logic. If XγYX \overset{\gamma}{\to} Y is a map in Syn(T)\mathsf{Syn}(T) and yM(Y)y \in M(Y), then

Mγ(y)M(X)Nγ(f(y))N(X) M \models \gamma(y) \in M(X) \iff N \models \gamma(f(y)) \in N(X)

and γ(f(y))=f(γ(y))\gamma(f(y)) = f(\gamma(y)) since

MΓ(y,γ(y))NΓ(f(y),fγ(y)) M \models \Gamma(y,\gamma(y)) \iff N \models \Gamma(f(y), f \gamma(y))

where Γ\Gamma is the graph of the definable function γ\gamma.


In the above, we can actually weaken the assumption that we start with a collection of (Syn(T)) 0(\mathsf{Syn}(T))_0-indexed maps to just starting with a collection of maps f C:M(C)N(C)f_C : M(C) \to N(C), where CC runs along the maximal objects (the types) in Syn(T)\mathsf{Syn}(T).

Then: the (Syn(T)) 0(\mathsf{Syn}(T))_0-indexed collection of pullbacks along the canonical embeddings N(X)N(C)N(X) \hookrightarrow N(C) is a natural transformation MNM \to N if and only if the maps f Cf_C preserved and reflected first-order logic.

Elementary embeddings between models of set theory

In material set theory

Elementary embeddings play an important role in the study of large cardinals in (material) set theory.

For instance, the existence of a measurable cardinal is equivalent to the existence of a non-surjective elementary embedding j:VMj\colon V\to M, where VV is the universe of sets and MM is some transitive model of ZF. If κ\kappa is a measurable cardinal with a countably-complete ultrafilter 𝒰\mathcal{U}, we can form the ultrapower V 𝒰V^{\mathcal{U}} and then take its transitive collapse? to produce MM. (Countable completeness of 𝒰\mathcal{U} is necessary for V 𝒰V^{\mathcal{U}} to be well-founded and thus have a transitive collapse.)

Conversely, if j:VMj\colon V\to M is a nontrivial elementary embedding, it must have a critical point, i.e. a least ordinal κ\kappa such that j(κ)κj(\kappa)\neq \kappa. It follows that j(κ)j(\kappa) is some ordinal >κ\gt \kappa, so in particular κj(κ)\kappa\in j(\kappa) (using the von Neumann definition of ordinals). Define 𝒰P(κ)\mathcal{U}\subset P(\kappa) by A𝒰A\in \mathcal{U} iff κj(A)\kappa\in j(A); then 𝒰\mathcal{U} is a κ\kappa-complete ultrafilter on κ\kappa.

Stronger large cardinal axioms can be characterized, or defined, as the critical points of elementary embeddings satisfying additional closure axioms on the transitive class MM.

In structural set theory

Any elementary embedding of models of ZF induces a conservative logical functor between their categories of sets. In fact, it is much more than that; a conservative logical functor preserves and reflects only first-order logic with bounded quantifiers, while an e.e. preserves and reflects all first-order logic.

The structural meaning of elementary embeddings seems not to be well-explored.


The “ultimate” closure property, hence the “strongest” large cardinal axiom, would be having a nontrivial elementary embedding j:VVj\colon V\to V (i.e. MM is all of VV). Sometimes the critical point of such an embedding, if one exists, is called a Reinhardt cardinal. However, having such an e.e. turns out to be inconsistent…sort of.

The technicality is that because any e.e. VVV\to V is a proper class, the proposition “there does not exist an e.e. VVV\to V” cannot be stated in ZF (one cannot quantify over proper classes). What we can prove is the following meta-theorem (one instance per formula φ(x,y)\varphi(x,y) that might define an e.e.).


For any formula φ(x,y,z)\varphi(x,y,z) and any set aa, it is not true that defining j a(x)=yφ(x,y,a)j_a(x)=y \iff \varphi(x,y,a) makes j aj_a into an elementary embedding VVV\to V.


Suppose that φ\varphi and aa exist. Fix such a φ\varphi. Fix λ\lambda as the smallest ordinal such that there exists an aV λa\in V_\lambda such that φ(,,a)\varphi(-,-,a) defines an e.e. VVV\to V. Now the statement “λ\lambda is the smallest ordinal such that there exists an aV λa\in V_\lambda such that φ(,,a)\varphi(-,-,a) defines an e.e. VVV\to V.” is definable in the language of ZF (definability of the property “is an e.e. VVV\to V” is tricky, but true). Therefore, if bb is any set such that j bj_b is an e.e., it preserves the truth of this, so it is also true that j b(λ)j_b(\lambda) is the smallest ordinal such that there exists an aV j b(λ)a\in V_{j_b(\lambda)} such that φ(,,a)\varphi(-,-,a) defines an e.e. VVV\to V. This clearly implies that j b(λ)=λj_b(\lambda)=\lambda.

Now define κ\kappa to be the smallest ordinal which is a critical point of an e.e. VVV\to V of the form j aj_a for some aV λa\in V_\lambda. Let bV λb\in V_\lambda be such that j bj_b is an e.e. VVV\to V and κ\kappa is the critical point of j bj_b. The definition of κ\kappa is again a definable property, so it follows that j b(κ)j_b(\kappa) is the smallest ordinal which is a critical point of an e.e. VVV\to V of the form j aj_a for some aV j b(λ)=V λa\in V_{j_b(\lambda)} = V_\lambda. Therefore, κ=j b(κ)\kappa= j_b(\kappa), a contradiction to κ\kappa being the critical point of j bj_b.

Now, if we work instead in a theory such as NBG or MK which can contain non-definable proper classes, in theory there might still be an e.e. VVV\to V which is not definable. One can also access such an idea by adding a new symbol “jj” to ZF and asserting that it is an e.e. However, it was shown by Kunen in 1971, using a technical combinatorial argument, that the existence of such an e.e. is inconsistent with the axiom of choice. It is unknown whether it is consistent with ZF.

Revised on May 28, 2017 05:25:08 by Urs Schreiber (