A better way to explain forcing?

I have proposed such an axiomatization. It is published in Comptes Rendus: Mathématique, which has returned to the Académie des Sciences in 2020 and is now completely open access. Here is a link:

https://doi.org/10.5802/crmath.97

The axiomatization I have proposed is as follows:

Let $(M, \mathbb P, R, \left\{\Vdash\phi : \phi\in L(\in)\right\}, C)$ be a quintuple such that:

  • $M$ is a transitive model of $ZFC$.

  • $\mathbb P$ is a partial ordering with maximum.

  • $R$ is a definable in $M$ and absolute ternary relation (the $\mathbb P$-membership relation, usually denoted by $M\models a\in_p b$).

  • $\Vdash\phi$ is, if $\phi$ is a formula with $n$ free variables, a definable $n+1$-ary predicate in $M$ called the forcing predicate corresponding to $\phi$.

  • $C$ is a predicate (the genericity predicate).

As usual, we use $G$ to denote a filter satisfying the genericity predicate $C$.

Assume that the following axioms hold:

(1) The downward closedness of forcing: Given a formula $\phi$, for all $\overline{a}$, $p$ and $q$, if $M\models (p\Vdash\phi)[\overline{a}]$ and $q\leq p$, then $M\models (q\Vdash\phi)[\overline{a}]$.

(2) The downward closedness of $\mathbb P$-membership: For all $p$, $q$, $a$ and $b$, if $M\models a\in_p b$ and $q\leq p$, then $M\models a\in_q b$.

(3) The well-foundedness axiom: The binary relation $\exists p; M\models a\in_p b$ is well-founded and well-founded in $M$. In particular, it is left-small in $M$, that is, $\left\{a : \exists p; M\models a\in_p b\right\}$ is a set in $M$.

(4) The generic existence axiom: For each $p\in \mathbb P$, there is a generic filter $G$ containing $p$ as an element.

Let $F_G$ denote the transitive collapse of the well-founded relation $\exists p\in G; M\models a\in_p b$.

(5) The canonical naming for individuals axiom: $\forall a\in M;\exists b\in M; \forall G; F_G(b)=a$.

(6) The canonical naming for $G$ axiom: $\exists c\in M;\forall G; F_G(c)= G$.

Let $M[G]$ denote the direct image of $M$ under $F_G$. The next two axioms are the fundamental duality that you have mentioned:

(7) $M[G]\models \phi[F_G(\overline{a})]$ iff $\exists p\in G; M\models (p\Vdash\phi)[\overline{a}]$, for all $\phi$, $\overline{a}$, $G$.

(8) $M\models (p\Vdash\phi)[\overline{a}]$ iff $\forall G\ni p; M[G]\models \phi[F_G(\overline{a})]$, for all $\phi$, $\overline{a}$, $p$.

Finally, the universality of $\mathbb P$-membership axiom.

(9) Given an individual $a$, if $a$ is a downward closed relation between individuals and conditions, then there is a $\mathbb P$-imitation $c$ of $a$, that is, $M\models b\in_p c$ iff $(b,p)\in a$, for all $b$ and $p$.

It follows that $(M, \mathbb P, R, \left\{\Vdash\phi : \phi\in L(\in)\right\}, C, G)$ represent a standard forcing-generic extension: The usual definitions of the forcing predicates can be recovered, the usual definition of genericity can also be recovered ($G$ intersects every dense set in $M$), $M[G]$ is a model of $ZFC$ determined by $M$ and $G$ and it is the least such model. (Axiom $(9)$ is used only in the proof that $M[G]$ is a model).


Great Question! Finally someone asks the simplest questions, which almost invariably are the real critical ones (if I cannot explain a great idea to an intelligent person in minutes, it simply means I do not understand it).

In this case, the idea is one of the greatest in modern history.

Let me start with a historical background: in the 90s I talked with Stan Tennenbaum about Forcing, hoping to (finally!) understand it (did not go too far) . Here is what he told me (not verbatim): during those times,late 50s and very early 60s, several folks were trying their hand to prove independence.

What did they know? They certainly knew that they had to add a set G to the minimal model, and then close up with respect to Godel constructibility operations. So far nothing mysterious: it is a bit like adding a complex number to Q and form an algebraic field.

First blocker: if I add a set G which certainly exists to construct the function you described above, how do I know that M[G] is still a model of ZF?

In algebraic number theory I do not have this issue, I simply take the new number , and throw it into the pot, but here I do. Sets carry along information, and some of this information can be devastating (simple example: suppose that G is gonna tell that the first ordinal outside of M is in fact reachable, that would be very bad news.

All this was known to the smart folks at the time. What they did not know is: very well, I am in a mine field, how then I select my G so it does not create trouble and do what is supposed to do? That is the fundamental question.

They wanted to find G, describe it, and then add it.

Enter Cohen. In a majestic feat of mathematical innovation, Cohen, rather than going into the mine field outside of M searching for the ideal G, enters M. He looks at the world outside, so to speak, from inside (I like to think of him looking at the starry sky, call it V, from his little M).

Rather than finding the mysterious G which floats freely in the hyperspace outside M, he says: ok, suppose I wanted to build G, brick by brick, inside M. After all, I know what is supposed to do for me, right? Problem is, I cannot, because if I could it would be constructible in M, and therefore part of M. Back to square one.

BUT: although G is not constructible in M, all its finite portions are, assuming such a G is available in the outer world. It does not exist in M, but the bricks which make it (in your example all the finite approximation of the function), all of them, are there. Moreover, these finite fragments can be partially ordered, just like little pieces of information: one is sometimes bigger than the other, etc

Of course this order is not total. So, he says, let us describe that partial order, call it P. P is INSIDE M, all of it. Cohen has the bricks, and he knows which brick fit others, to form some pieces of walls here and there, but not the full house, not G. Why? because the glue which attach these pieces all together in a coherent way is not there. M does not know about the glue. Cohen is almost done: he steps out of the model, and bingo! there is plenty of glue.

If I add an ultrafilter, it will assemble consistently all the pieces of information, and I have my model. I do not need to explicitly describe it, it is enough to know that the glue is real (outside). Now we go back to the last insight of Cohen. How does he know that glueing all pieces along the ultrafilter will not "mess things up"? Because, and the funny thing is M knows it, all information coming with G is already reached at some point of the glueing process, so it is available in M.

Finale

What I just said about the set of fragments of information, is entirely codable in M. M knows everything, except the glue. It even knows the "forcing relation", in other words it knows that IF M[G] exists, then truth in M[G] corresponds to some piece of information from within forcing it.

LAST NOTE One of my favorite books in Science Fiction was written by the set theorist converted to writer, Dr. Rudy Rucker. The book is called White Light, and is a big celebration of Cantorian Set Theory written by an insider. It just misses one pearl, the most glorious one: Forcing. Who knows, someone here, perhaps you, will write the sequel to White Light and show the splendor of Cohen's idea not only to "ordinary mathematicians" but to everybody...

ADDENDUM: SHELAH's LOGICAL DREAM (see commentary of Tim Chow)

Tim, you have no idea how many thoughts your fantastic post has generated in my mind in the last 20 hours. Shelah's dream can be made reality, but it ain't easy, though now at least I have some clue as to how to begin.

It is the "virus control method": suppose you take M and throw in some G which is living in the truncated V cone where M lives. Add G. The very moment you add it, you are forced to add all sets which are G-constructibles in alpha steps, where alpha is any ordinal in M. Now, let us say that the most lethal viral attack perpetrated by G is that one of these new sets is exactly alpha_0, the first ordinal not in M, in other words G or its definable sets code a well order of type alpha_0.

If one carries out the analysis I have just sketched, the conjecture would be that a G which does not cause any damage is a set which is as close as possible to be definable in M already, in some sense to be made precise,but that goes along Cohen's intuition, namely that although G is not M-constructible, all its fragments are.

If this plan can be implemented, it would show that forcing is indeed unique, unless.... unless some other crazy idea come into play


This is an expansion of David Roberts's comment. It may not be the sort of answer you thought you were looking for, but I think it is appropriate, among other reasons because it directly addresses your question

if there is a way to "axiomatize" these facts by saying what properties forcing must have.

In fact, modern mathematics has developed a powerful and general language for "axiomatizing properties that objects must have": the use of universal properties in category theory. In particular, universal properties give a precise and flexible way to say what it means to "freely" or "generically" add something to a structure.

For example, suppose we have a ring $R$ and we want to "generically" add a new element. The language of universal properties says that this should be a ring $R[x]$ equipped with a homomorphism $c:R\to R[x]$ and an element $x\in R[x]$ with the following universal property: for any ring $S$ equipped with a homomorphism $f:R\to S$ and an element $s\in S$, there exists a unique homomorphism $h:R[x]\to S$ such that $h\circ c = f$ and $h(x) = s$.

Note that this says nothing about how $R[x]$ might be constructed, or even whether it exists: it's only about how it behaves. But this behavior is sufficient to characterize $R[x]$ up to unique isomorphism, if it exists. And indeed it does exist, but to show this we have to give a construction: in this case we can of course use the ring of formal polynomials $a_n x^n + \cdots + a_1 x + a_0$.

From this perspective, if we want to add a function $F : \aleph_2\times \aleph_0 \to 2$ to a model $M$ of ZFC to obtain a new model $M[F]$, the correct thing to do would be to find a notion of "homomorphism of models" such that $M[F]$ can be characterized by a similar universal property: there would be a homomorphism $c:M\to M[F]$ and an $F : \aleph_2\times \aleph_0 \to 2$ in $M[F]$, such that for any model $N$ equipped with a homomorphism $f:M\to N$ and a $G : \aleph_2\times \aleph_0 \to 2$ in $N$, there is a unique homomorphism $h:M[F]\to N$ such that $h\circ c = f$ and $h(F) = G$.

The problem is that the usual phrasing of ZFC, in terms of a collection of things called "sets" with a membership relation $\in$ satisfying a list of axioms in the language of one-sorted first-order logic, is not conducive to defining such a notion of homomorphism. However, there is an equivalent formulation of ZFC, first given by Lawvere in 1964, that works much better for this purpose. (Amusingly, 1964 is exactly halfway between 1908, when Zermelo first proposed his list of axioms for set theory, and the current year 2020.) In Lawvere's formulation, there is a collection of things called "sets" (although they behave differently than the "sets" in the usual presentation of ZFC) and also a separate collection of things called "functions", which together form a category (i.e. functions have sets as domain and codomain, and can be composed), and satisfy a list of axioms written in the language of category theory. (A recent short introduction to Lawvere's theory is this article by Tom Leinster.)

Lawvere's theory is usually called "ETCS+R" (the "Elementary Theory of the Category of Sets with Replacement"), but I want to emphasize that it is really an entirely equivalent formulation of ZFC. That is, there is a bijection between models of ZFC, up to isomorphism, and models of ETCS+R, up to equivalence of categories. In one direction this is exceedingly simple: given a model of ZFC, the sets and functions therein as usually defined form a model of ETCS+R. Constructing an inverse bijection is more complicated, but the basic idea is the Mostowski collapse lemma: well-founded extensional relations can be defined in ETCS+R, and the relations of this sort in any model of ETCS+R form a model of ZFC.

Since a model of ETCS+R is a structured category, there is a straightforward notion of morphism between models: a functor that preserves all the specified structure. However, this notion of morphism has two defects.

The first is that the resulting category of models of ETCS+R is ill-behaved. In particular, the sort of "free constructions" we are interested in do not exist in it! However, this is a problem of a sort that is familiar in modern structural mathematics: when a category is ill-behaved, often it is because we have imposed too many "niceness" restrictions on its objects, and we can recover a better-behaved category by including more "ill-behaved" objects. For instance, the category of manifolds does not have all limits and colimits, but it sits inside various categories of more general "smooth spaces" that do. The same thing happens here: by dropping two of the axioms of ETCS+R we obtain the notion of an elementary topos, and the category of elementary toposes, with functors that preserve all their structure (called "logical functors"), is much better-behaved. In particular, we can "freely adjoin a new object/morphism" to an elementary topos.

(I am eliding here the issue of the replacement/collection axiom, which is trickier to treat correctly for general elementary toposes. But since my main point is that this direction is a blind alley for the purposes of forcing anyway, it doesn't matter.)

The second problem, however, is that these free constructions of elementary toposes do not have very explicit descriptions. This is important because our goal is not merely to freely adjoin an $F:\aleph_2\times \aleph_0 \to 2$, but to show that the existence of such an $F$ is consistent, and for this purpose we need to know that when we freely adjoin such an $F$ the result is nontrivial. Thus, in addition to characterizing $M[F]$ by a universal property, we need some concrete construction of it that we can inspect to deduce its nontriviality.

This problem is solved by imposing a different niceness condition on the objects of our category and changing the notion of morphism. A Grothendieck topos is an elementary topos that, as a category, is complete and cocomplete and has a small generating set. But, as shown by Giraud's famous theorem, it can equivalently be defined as a cocomplete category with finite limits and a small generating set where the finite limits and small colimits colimits interact nicely. This suggests a different notion of morphism between Grothendieck toposes: a functor preserving finite limits and small colimits. Let's call such a functor a Giraud homomorphism (it's the same as a "geometric morphism", but pointing in the opposite direction).

The category of Grothendieck toposes and Giraud homomorphisms is well-behaved, and in particular we can freely adjoin all sorts of structures to a Grothendieck topos -- specifically, any structure definable in terms of finite limits and arbitrary colimits (called "a model of a geometric theory"). (To be precise, this is a 2-category rather than a category, and the universal properties are up to isomorphism, but this is a detail, and unsurprising given the modern understanding of abstract mathematics.) Moreover, the topos $M[G]$ obtained by freely adjoining a model $G$ of some geometric theory to a Grothendieck topos $M$ -- called the classifying topos of the theory of $G$ -- has an explicit description in terms of $M$-valued "sheaves" on the syntax of the theory of $G$. This description allows us to check, in any particular case, that it is nontrivial. But for other purposes, it suffices to know the universal property of $M[G]$. In this sense, the universal property of a classifying topos is an answer to your question:

when I say that forcing "must have" these properties, I mean that by using these axioms, we can go ahead and prove that $M[G]$ satisfies ZFC, and only worry later about how to construct something that satisfies the axioms.

Only one thing is missing: not every Grothendieck topos is a model of ETCS+R, hence $M[G]$ may not itself directly yield a model of ZFC. We solve this in three steps. First, since ZFC satisfies classical logic rather than intuitionistic logic (the natural logic of categories), we force $M[G]$ to become Boolean. Second, by restricting to "propositional" geometric theories we ensure that the result also satisfies the axiom of choice. Finally, we pass to the "internal logic" of the topos, which is to say that we allow "truth values" lying in its subobject classifier rather than in the global poset of truth values $2$. We thereby get an "internal" model of ETCS+R, and hence also an "internal" model of ZFC.

So where does the complicated machinery in the usual presentation of forcing come from? Mostly, it comes from "beta-reducing" this abstract picture, writing out explicitly the meaning of "well-founded extensional relation internal to Boolean sheaves on the syntax of a propositional geometric theory". The syntax of a propositional geometric theory yields, as its Lindenbaum algebra, a poset. The Boolean sheaves on that poset are, roughly, those that satisfy the usual "denseness" condition in forcing. The "internal logic" valued in the subobject classifier corresponds to the forcing relation over the poset. And the construction of well-founded extensional relations translates to the recursive construction of "names".

(Side note: this yields the "Boolean-valued models" presentation of forcing. The other version, where we take $M$ to be countable inside some larger model of ZFC and $G$ to be an actual generic filter living in that larger model, is, at least to first approximation, an unnecessary complication. By comparison (and in jesting reference to Asaf's answer), if we want to adjoin a new transcendental to the field $\mathbb{Q}$, we can simply construct the field of rational functions $\mathbb{Q}(x)$. From the perspective of modern structural mathematics, all we care about are the intrinsic properties of $\mathbb{Q}(x)$; it's irrelevant whether it happens to be embeddable in some given larger field like $\mathbb{R}$ by setting $x=\pi$.)

The final point is that it's not necessary to do this beta-reduction. As usual in mathematics, we get a clearer conceptual picture, and have less work to do, when working at an appropriate level of abstraction. We prove the equivalence of ZFC and ETCS+R once, abstractly. Similarly, we show that we have an "internal" model of ETCS+R in any Grothendieck topos. These proofs are easier to write and understand in category-theoretic language, using the intrinsic characterization of Grothendieck toposes rather than anything to do with sites or sheaves. With that done, the work of forcing for a specific geometric theory is reduced to understanding the relevant properties of its category of Boolean sheaves, which are simple algebraic structures.