
You may have noticed that, when introducing the axioms of $\mathsf{ZFC}$, we never really answered the question “What is a set?” Instead, we developed a formal theory of axioms for a binary relation that somehow describe “how sets work”, that is, how we can obtain sets from given ones using well-known operations like power set and union.

We have then seen how we can develop many standard mathematical objects (like $\mathbb{N}$ and $\mathbb{R}$) and techniques (like induction and definition by recursion) inside this formal system. In fact, most of mathematics can be developed formally inside this system. Almost all proofs you find in any standard math book can be formalized in $\mathsf{ZFC}$. Doing this is very tedious for us humans, but there is little doubt it can be done, and in fact recent work on proof assistants (like Coq or Lean) has formalized many parts of mathematics (albeit not directly in $\mathsf{ZFC}$).

This expressiveness gives $\mathsf{ZFC}$ its foundational importance, but it is also the cause of much confusion for someone studying set theory for the first time.

From a pedagogical point of view, in what follows it is helpful to assume a “Platonist” perspective of mathematics, and set theory in particular, namely that sets and the relations between them exist independently (and outside) of the $\mathsf{ZFC}$ axioms. The set of real numbers exists, and our development of $\mathbb{R}$ inside $\mathsf{ZFC}$ is just a formal way to describe it. From this perspective, the axioms of $\mathsf{ZF}$ ($\mathsf{AC}$ is a little different) are just obvious truths about sets, just like the Peano axioms are obvious truths about natural numbers.

Among other things, this perspective allows us to treat $\mathsf{ZF}$ just like any other mathematical theory, like group theory or the theory of algebraically closed fields. In particular, we can think about models of set theory the way we would think about models of group theory, in the sense of model theory.

A model would simply be a set $M$ together with a binary relation $E$ on $M$ such that

$$(M,E) \models \mathsf{ZF},$$

that is, all axioms hold when interpreted in $(M,E)$. Note that we use “set” in this context not in the formal sense, but in the “meta”-sense (the Platonist world of sets).

Working in the meta-theory (“that which is mathematically true”), we know by Gödel’s completeness theorem that

if $\mathsf{ZF}$ is consistent, then it has a model.

This model should be seen as a set-theoretic universe: its elements can be seen as sets, and the interpretation $E$ of the $\in$-symbol tells us how these sets are connected via the element relation.

Note that $E$ does not have to be the actual element relation on a set (of sets), but just some binary relation so that the axioms are satisfied.

In the meta-world, there are, of course, sets other than $M$, but that does not matter here, since all we are interested in is giving some universe in which our axioms hold. (Timothy Chow has suggested that set theory should rather be called “universe theory”. He is right in the sense that what axiomatic set theory does is to define such universes of sets, rather than what a set is.)

In the meta-theory, we can then follow the usual techniques to show provability or non-provability results.

If we want to prove that $\mathsf{CH}$ is consistent with $\mathsf{ZF}$ (assuming $\mathsf{ZF}$ is consistent), we need to find a model of $\mathsf{ZF}$ in which $\mathsf{CH}$ holds.

One difficulty in working with models of set theory is that they can look very different depending on whether you look at a model “from the inside” or “from the outside”.

To illustrate this, assume $\mathsf{ZF}$ is consistent. Then, by the Löwenheim-Skolem theorem, there exists a countable model of $\mathsf{ZF}$. Yet it is a theorem of $\mathsf{ZF}$ that there exists an uncountable set. This is often referred to as Skolem’s paradox, although it is not really an antinomy.

If we break this down a bit, we see that the apparent paradox is really just a matter of perspective (inside or outside). Assume $(M,E)$ is a countable model of $\mathsf{ZF}$. Then there exists $x \in M$ such that, according to $(M,E)$, there is no injection from $x$ to the natural numbers. Since $M$ is countable, $x$ can have at most countably many elements (that is, $E$-predecessors in $M$). So why is this not a contradiction? We should really read the statement above as

there is no injection in $M$ from $x$ to $M$’s version of the natural numbers.

In other words, even though $x$ is countable from the outside, $x$ appears uncountable inside $M$ since a mapping witnessing its countability does not exist in $M$.

This is a first warning sign that models of $\mathsf{ZF}$ can behave in very unexpected ways. For another example, recall that the axiom of Foundation asserts that the $\in$-relation is well-founded. But again, this holds only “from the inside”.

Proof

Introduce new constant symbols $c_n$ $(n \in \mathbb{N})$ and add the formulas $\varphi_n \equiv c_{n+1} \in c_n$ to the axioms of $\mathsf{ZF}$. It is not hard to show, using the compactness theorem, that $\mathsf{ZF} + \{\varphi_n \colon n \in \mathbb{N}\}$ has a model $(M^*, E^*)$, for which the set $\{c_n \colon n \in \mathbb{N}\}$ is ill-founded.

Since, as mentioned above, the model $(M^*,E^*)$ satisfies Foundation, the set $\{c_n \colon n \in \mathbb{N}\}$ is actually not in the model (and neither can be any other set with an infinite descending $\in$-chain).

Mostowski collapse

If we restrict ourselves to models on which the $E$-relation is actually well-founded (i.e. from the outside), then interestingly these models look, in a way, “natural”: up to isomorphism, $E$ can be assumed to be the $\in$-relation on a set. Such models are also called standard.

Given a set-theoretic structure $(X,E)$ (not necessarily a model of $\mathsf{ZF}$), for each $x \in X$ let

$$\operatorname{ext}_E(x) = \{ y \in X \colon y \, E \, x \}.$$

If $E$ behaves “set-like”, then it will respect the axiom of Extensionality, i.e. two sets are identical if and only if they have the same elements. Therefore we say that $E$ is extensional if

$$x,z \in X, \; x \neq z \quad \text{ implies } \quad \operatorname{ext}_E(x) \neq \operatorname{ext}_E(z).$$

Furthermore, as stated above, we want to exclude infinite descending $E$-chains. Recall we say that $E$ is well-founded if

every non-empty set $Y \subseteq X$ has an $E$-minimal element.
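For finite structures, the two definitions above can be checked mechanically. The following is a minimal sketch, not part of the formal development: it assumes $E$ is given as a Python set of pairs `(y, x)` meaning $y \, E \, x$, and the names `ext`, `is_extensional` and `is_well_founded` are chosen here purely for illustration.

```python
def ext(E, x):
    """ext_E(x): all y with y E x, where E is a set of pairs (y, x)."""
    return frozenset(y for (y, z) in E if z == x)


def is_extensional(X, E):
    """Distinct points of X must have distinct E-extensions."""
    extensions = [ext(E, x) for x in X]
    return len(set(extensions)) == len(extensions)


def is_well_founded(X, E):
    """Finite X only: repeatedly strip the E-minimal elements.

    If some non-empty remainder has no E-minimal element, that remainder
    witnesses that E is not well-founded.
    """
    remaining = set(X)
    while remaining:
        minimal = {x for x in remaining if not (ext(E, x) & remaining)}
        if not minimal:
            return False
        remaining -= minimal
    return True


# Three example structures (all hypothetical).
chain = ({0, 1, 2}, {(0, 1), (0, 2), (1, 2)})   # looks like the ordinal 3
cycle = ({0, 1}, {(0, 1), (1, 0)})              # 0 E 1 E 0
twins = ({0, 1, 2}, {(0, 1), (0, 2)})           # 1 and 2 have the same extension

print(is_extensional(*chain), is_well_founded(*chain))
print(is_extensional(*cycle), is_well_founded(*cycle))
print(is_extensional(*twins), is_well_founded(*twins))
```

The cycle is extensional but not well-founded, while the structure with two "twins" is well-founded but not extensional, so the two properties are independent of each other.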

Proof

We construct $\pi$ and $S = \operatorname{im}(\pi)$ by recursion on $E$, which is possible since it is well-founded.

For each $x \in X$, let

$$\pi(x) = \{\pi(y) \colon y \, E \, x \},$$

and set $S = \operatorname{ran}(\pi)$.

The injectivity of $\pi$ follows from the extensionality of $E$ by induction along $E$: Suppose we have shown

$$\forall z \; \bigl(z \, E \, x \to \forall y \in X \, (\pi(z) = \pi(y) \to z = y)\bigr),$$

and we have to show that it holds for $x$. Assume $\pi(x) = \pi(y)$ for some $y \in X$. Then

$$\begin{align*} c \, E \, x &\Rightarrow \pi(c) \in \pi(x) = \pi(y) & \\ &\Rightarrow \pi(c) = \pi(z) & \text{for some } z \, E \, y \\ &\Rightarrow c = z & (\text{by ind. hyp., since } c \, E \, x) \\ &\Rightarrow c \, E \, y. & \end{align*}$$

Similarly, we get $c \, E \, y \Rightarrow c \, E \, x$, hence $x = y$ as desired due to the extensionality of $E$. Finally we have

$$\begin{align*} \pi(x) \in \pi(y) &\Rightarrow \pi(x) = \pi(c) & \text{for some } c \, E \, y \\ &\Rightarrow x = c & (\text{since } \pi \text{ is injective}) \\ &\Rightarrow x \, E \, y. & \end{align*}$$

Thus $\pi$ is an isomorphism.
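The recursive definition of $\pi$ can be carried out literally on a finite structure. The sketch below assumes, as an illustration only, that $E$ is a Python set of pairs `(y, x)` meaning $y \, E \, x$ and that sets are modelled by nested `frozenset`s; the function name `mostowski_collapse` is hypothetical.

```python
def mostowski_collapse(X, E):
    """Return pi as a dict with pi[x] = {pi[y] : y E x}.

    Assumes X is finite. Raises ValueError if E is not well-founded on X,
    since then the recursion cannot get started somewhere.
    """
    preds = {x: {y for (y, z) in E if z == x} for x in X}
    pi = {}
    pending = set(X)
    while pending:
        # x is ready once pi is defined on all of its E-predecessors.
        ready = {x for x in pending if preds[x].issubset(pi)}
        if not ready:
            raise ValueError("E is not well-founded on X")
        for x in ready:
            pi[x] = frozenset(pi[y] for y in preds[x])
        pending -= ready
    return pi


# A hypothetical extensional, well-founded structure on three points.
X = {"a", "b", "c"}
E = {("a", "b"), ("a", "c"), ("b", "c")}
pi = mostowski_collapse(X, E)

# pi is injective, and y E x holds exactly when pi[y] is an element of pi[x].
injective = len(set(pi.values())) == len(X)
isomorphism = all(((y, x) in E) == (pi[y] in pi[x]) for x in X for y in X)
print(injective, isomorphism)
```

Here `pi["a"]` is the empty set, `pi["b"]` is $\{\emptyset\}$ and `pi["c"]` is $\{\emptyset, \{\emptyset\}\}$, so the image is a transitive set on which $E$ has become the genuine $\in$-relation.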

To see the uniqueness of $\pi$ and $S$, assume $\rho$, $T$ are such that the statement of the theorem is satisfied. Then $\pi \circ \rho^{-1}$ is an isomorphism between $(T, \in)$ and $(S,\in)$. Now apply the following lemma.

Proof

By induction on the well-founded relation $\in$. Assume that $\theta(z)=z$ for all $z \in x$ and let $y = \theta(x)$.

We have $x \subseteq y$ because if $z \in x$, then $z = \theta(z) \in \theta(x) = y$.

We also have $y \subseteq x$: Let $t \in y$. Since $y \in Y$ and $Y$ is transitive, $t \in Y$, so there is $z \in X$ with $\theta(z) = t$. Since $\theta(z) \in y$ and $y = \theta(x)$, we have $z \in x$, and thus $t = \theta(z) = z \in x$.

Hence $x = y$, and this also implies $\theta(x) = x$.

Proof

Every linear order is extensional. Hence we can apply the Mostowski Collapse. It is easy to see that the resulting set is an ordinal.
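For a finite linear order the collapse can be computed directly, and the result is a finite von Neumann ordinal. This is only a small sketch under the assumption that the order is handed to us as a Python list in increasing order; `von_neumann` and `collapse_linear_order` are names made up for this illustration.

```python
def von_neumann(n):
    """The finite ordinal n = {0, 1, ..., n-1} as a nested frozenset."""
    ordinal = frozenset()
    for _ in range(n):
        ordinal = ordinal | {ordinal}   # successor: alpha + 1 = alpha U {alpha}
    return ordinal


def collapse_linear_order(elements):
    """Collapse a finite strict linear order listed in increasing order.

    pi(x) = {pi(y) : y < x}, exactly as in the Mostowski collapse.
    """
    pi = {}
    for i, x in enumerate(elements):
        pi[x] = frozenset(pi[y] for y in elements[:i])
    return pi


days = ["mon", "tue", "wed"]            # hypothetical three-element order
pi = collapse_linear_order(days)
image = frozenset(pi.values())
print(image == von_neumann(3))
```

Each element is sent to the ordinal that counts its predecessors, so the image of an $n$-element order is the ordinal $n$.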

Absoluteness and transitive models

Even if we consider well-founded standard models, interpreting set-theoretic statements in them can still lead to very different results, even for very simple formulas.

In the example above, the set $M$ is not transitive, which allowed it to “hide” its elements. It turns out that if we require our model to be transitive, the truth of simple formulas cannot vary between the “inside” and the “outside” perspective. We call such properties absolute.

Given a formula $\varphi$ and some class $M$, we can relativize $\varphi$ with respect to $M$ essentially by restricting all quantifiers in $\varphi$ to range over $M$, i.e. $\exists x$ becomes $\exists x \in M$ and $\forall x$ becomes $\forall x \in M$. Note that classes are defined by formulas, so the resulting formula, which we denote by $\varphi^M$, is still a formula of set theory.
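Relativization is a purely syntactic transformation, which a short recursive function can make explicit. The sketch below assumes a toy representation of formulas as nested tuples, and it treats the class $M$ as a plain name, so that `("in", x, "M")` stands in for the formula defining $M$; none of these conventions come from the text.

```python
def relativize(phi, M="M"):
    """Return phi^M: every quantifier is restricted to the class named M.

    Formulas are nested tuples:
      ("in", x, y), ("eq", x, y), ("not", p), ("and", p, q),
      ("exists", x, p), ("forall", x, p)
    """
    op = phi[0]
    if op in ("in", "eq"):
        return phi                                   # atomic: unchanged
    if op == "not":
        return ("not", relativize(phi[1], M))
    if op == "and":
        return ("and", relativize(phi[1], M), relativize(phi[2], M))
    if op == "exists":
        # exists x phi   becomes   exists x (x in M and phi^M)
        x, body = phi[1], phi[2]
        return ("exists", x, ("and", ("in", x, M), relativize(body, M)))
    if op == "forall":
        # forall x phi   becomes   forall x not(x in M and not phi^M)
        x, body = phi[1], phi[2]
        guarded = ("not", ("and", ("in", x, M), ("not", relativize(body, M))))
        return ("forall", x, guarded)
    raise ValueError(f"unknown connective: {op!r}")


# "y is non-empty": exists x (x in y)
nonempty = ("exists", "x", ("in", "x", "y"))
print(relativize(nonempty))
```

Since implication is not among the toy connectives, the universal case writes $x \in M \to \varphi^M$ as $\neg(x \in M \wedge \neg\varphi^M)$.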

Proof (Sketch)

Clearly $x=y$ and $x\in y$ are absolute for any $M$. It is also not hard to see that if $\varphi$ and $\psi$ are absolute for $M$, then so are $\neg \varphi$ and $\varphi \wedge \psi$. Hence all quantifier-free formulas are absolute.

Finally, if $\varphi$ is absolute for $M$, so is $\psi \equiv \exists x \in y \: \varphi$: If $\psi^M(y,\bar{z})$ holds for $y,\bar{z} \in M$, then we have $[\exists x \, (x \in y \: \wedge \: \varphi(x,y,\bar{z}))]^M$, i.e., $\exists x \in M \, (x \in y \: \wedge \: \varphi^M(x,y,\bar{z}))$. Since $\varphi^M(x,y,\bar{z})$ if and only if $\varphi(x,y,\bar{z})$, it follows that $\exists x\in y \, \varphi(x,y,\bar{z})$, i.e. $\psi$.

Conversely, if for $y,\bar{z} \in M$ we have $\exists x\in y \, \varphi(x,y,\bar{z})$, then since $M$ is transitive, $x$ belongs to $M$, and since $\varphi(x,y,\bar{z})$ if and only if $\varphi^M(x,y,\bar{z})$, we have $\exists x \in M \, (x \in y \: \wedge \: \varphi^M(x,y,\bar{z}) )$ and so $\psi^M(y,\bar{z})$.
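The role of transitivity in the converse direction can be seen on a tiny example with hereditarily finite sets. The sketch below models sets as nested `frozenset`s and compares the bounded statement "$y$ is non-empty" with its relativization; the sets `M_bad` and `M_good` and all function names are hypothetical choices for this illustration.

```python
EMPTY = frozenset()
ONE = frozenset({EMPTY})     # {∅}
A = frozenset({ONE})         # {{∅}}


def is_transitive(M):
    """Every element of an element of M is itself an element of M."""
    return all(z in M for x in M for z in x)


def nonempty(y):
    """'exists x in y (x = x)', read in the ambient universe."""
    return len(y) > 0


def nonempty_rel(y, M):
    """The relativized statement: 'exists x in M (x in y)'."""
    return any(x in y for x in M)


M_bad = frozenset({EMPTY, A})          # not transitive: ONE is missing
M_good = frozenset({EMPTY, ONE, A})    # transitive

print(is_transitive(M_bad), nonempty(A), nonempty_rel(A, M_bad))
print(is_transitive(M_good), nonempty(A), nonempty_rel(A, M_good))
```

In `M_bad` the only element of `A` is hidden, so `A` looks empty from the inside although it is non-empty from the outside. In the transitive `M_good` the witness is available and the two readings agree.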

We leave the proof as an exercise.

We can extend the previous absoluteness result to a slightly larger family of formulas.

The absoluteness property of “simple” formulas can be put to use to identify many “almost” models of $\mathsf{ZF}$.

Proof (Sketch)

We verify a few axioms and leave the rest as an exercise.

Extensionality: The relativized version of this axiom is

$$\forall a,b \in V_\alpha \, \bigl( \forall x\in V_\alpha \, (x \in a \leftrightarrow x \in b) \to a=b \bigr).$$

This simply states that the $\in$-relation (as a binary relation) is extensional on $V_\alpha$ (as defined above). It is easy to see that for every transitive set, the $\in$-relation is extensional.

Power Set: We have seen that $\subseteq$ is absolute for transitive classes. Therefore, the relativized version of the Power Set axiom becomes

$$\forall a \in V_\alpha\, \exists y \in V_\alpha \, \forall z \in V_\alpha \, (z \in y \leftrightarrow z \subseteq a).$$

This means we only need to have those subsets in the power set that are in $V_\alpha$. In other words, we need to verify

$$\forall a \in V_\alpha\, ( \mathcal{P}(a) \cap V_\alpha \in V_\alpha).$$

If $a \in V_\alpha$, then $a \in V_\beta$ for some $\beta < \alpha$, since $\alpha$ is a limit ordinal. Since $V_\beta$ is transitive, $a \subseteq V_\beta$ and thus $\mathcal{P}(a)\subseteq \mathcal{P}(V_\beta) = V_{\beta+1}$, which in turn yields $\mathcal{P}(a) \in V_{\beta+2}$.

If $b \in \mathcal{P}(a) \cap V_\alpha$, then $b \in V_{\beta+2}$. Therefore, $\mathcal{P}(a) \cap V_\alpha \subseteq V_{\beta+2}$. It follows that $\mathcal{P}(a) \cap V_\alpha \in V_{\beta+3} \subseteq V_\alpha$, as desired.
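The inclusions used in this argument can be observed on the first few finite levels of the cumulative hierarchy. The sketch below only illustrates the pattern $a \in V_\beta \Rightarrow \mathcal{P}(a) \subseteq V_{\beta+1}$ and $\mathcal{P}(a) \in V_{\beta+2}$ for $\beta = 2$; it says nothing about limit stages, and the helper names `powerset` and `V` are assumptions of this example.

```python
from itertools import combinations


def powerset(s):
    """All subsets of the finite set s, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def V(n):
    """Finite levels of the hierarchy: V_0 is empty, V_{n+1} = P(V_n)."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level


beta = 2
sizes = [len(V(n)) for n in range(5)]
transitive = all(x <= V(n) for n in range(5) for x in V(n))
power_sets_ok = all(
    powerset(a) <= V(beta + 1) and powerset(a) in V(beta + 2) for a in V(beta)
)
print(sizes, transitive, power_sets_ok)
```

The levels grow as $0, 1, 2, 4, 16$ (the next one already has $65536$ elements), which is why the example stops at $V_4$.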

Infinity: We have seen that $x = \omega$ is absolute for transitive classes. Hence $V_\alpha$ will satisfy the Axiom of Infinity if $\omega \in V_\alpha$. But this is true since we assume $\alpha > \omega$.

What is the problem with Replacement? The axiom says that if $\varphi$ defines a function, then the image of any set under this function is a set. Relativized to some $M$, this means we need to find a set in $M$ that contains the image. Here we run into a cardinality problem. The cardinality of $V_\alpha$ can be much bigger than the cardinality of $\alpha$. For example, $\mathcal{P}(\omega) \in V_{\omega+\omega}$ but $\omega+\omega$ is countable. We could have a function that maps $\mathcal{P}(\omega)$ to sets of rank cofinal in $\omega+\omega$, which implies that the image cannot be in $V_{\omega+\omega}$.

For $V_\alpha$ to be a model, we would need $\alpha$ to be “unreachable” by such mappings. This gives rise to the notion of inaccessible cardinals, which we will introduce in the next section.