model theory, Skolem paradox, Ramsey theorem, Loewenheim, categorical, Ramsey, Skolem, G�del, completeness theorem, categoricity, Goedel, theorem, completeness, Godel
Back to title page.
Some widespread Platonist superstitions were derived from other important results of mathematical logic (omitted in the main text of this book): Goedel's completeness theorem for predicate calculus, Loewenheim-Skolem theorem, the categoricity theorem of second order Peano axioms. In this short Appendix I will discuss these results and their methodological consequences (or lack of them).
All these results have been obtained by means of the so-called model theory. This is a very specific approach to investigation of formal theories as mathematical objects. Model theory is using the full power of set theory. Its results and proofs can be formalized in ZFC. Model theory is investigation of formal theories in the metatheory ZFC.
The main structures of model theory are interpretations. Let L be the language of some (first order) formal theory containing constant letters c1, ..., ck, function letters f1, ..., fm, and predicate letters p1, ..., pn. An interpretation J of the language L consists of the following objects:
a) a non-empty set DJ - the domain of interpretation (it will serve as the range of variables),
b) a mapping intJ that assigns:
- with each constant letter ci - a member intJ(ci) of the domain DJ,
- with each function letter fi - a function intJ(fi) from DJ x ... x DJ into DJ (of course, intJ(fi) has the same number of arguments as fi),
- with each predicate letter pi - a predicate intJ(pi) on DJ, i.e. a subset of DJ x ... x DJ (of course, intJ(pi) has the same number of arguments as pi).
As an example, let us consider the so-called standard interpretation S of Peano arithmetic PA:
a) The domain is DS = {0, 1, 2, ...} (the set w in terms of ZF).
b) The mapping intS assigns: with the constant 0 - the number 0 (the empty set o), with the constant 1 - the number 1 (the set {o}), with the function letter "+" - the function x+y (addition of natural numbers), with the function letter "*" - the function x*y (multiplication of natural numbers), with the predicate letter "=" - the predicate x=y (equality of natural numbers).
Having an interpretation J of the language L, we can define the notion of true formula (more precisely - the notion of formulas that are true under the interpretation J).
As the first step, terms of the language L are interpreted as members of DJ or functions over DJ. Terms are defined as constant letters, or variable letters, or their combinations by means of function letters. The term ci is interpreted as the member intJ(ci) of DJ. The variable xi is interpreted as the function Xi(xi) = xi. And, if t = fi(t1, ..., tq), then intJ(t) is defined as the function obtained by substituting of functions intJ(t1), ..., intJ(tq) into the function intJ(fi). For example, the standard interpretation of the term (x+y+1)*(x+y+1) is the function (x+y+1)2.
As the next step, the notion of true atomic formulas is defined. Of course, if a formula contains variables (as, for example, the formula x=1), then its "truth value" must be defined for each combination of values of these variables. Thus, to obtain the truth value of the formula pi(t1, ..., tq) for some fixed values of the variables contained in t1, .., tq, we must first "compute" the values of these terms, and then substitute these values into the predicate intJ(pi).
Note. The equality letter "=" is always interpreted in standard way - as the equality of members of DJ.
And finally, we can "define" the notion of true compound formulas of the language L under the interpretation J (of course, for a fixed combination of values of their free variables):
a) Truth-values of the formulas ~B, B&C, BvC and B->C can be computed from the truth values of B and C.
b) The formula AxB(x) is true, iff B(c) is true for all members c of the domain DJ.
c) The formula ExB(x) is true, iff there is a member c of the domain DJ such that B(c) is true.
Note that for an infinite domain DJ this notion of truth is extremely non-constructive: to establish, for example, truth-value of the formula AxB(x), we must check truth of B(c) for infinitely many values of c. The "degree of constructivity" of the formulas like AxEyC(x,y), AxEyAzD(x,y,z) etc. is even less... (compare my "critique" of the notion of true arithmetical formula in Section 3.1).
Let us say that a formula of the language L is true under the interpretation J, iff this formula is true for all combinations of values of its free variables.
Some formulas are true for all interpretations, for example:
(B->C) & (C->D) -> (B->D),
Ax(C->D(x)) -> (C->AxD(x)),
where C does not contain x. Such formulas are called logically valid (because they are true independently of the interpretation of their "meaning"). Note that the notion of logically valid formula is doubly non-constructive in the sense that the universal quantifier "for all interpretations" is added to the (already) non-constructive definition of (simply) true formula.
You could check easily that: a) all the logical axioms (see Appendix 3) are logically valid, b) the logical inference rules (see Appendix 3) allow to prove (from logically valid formulas) only logically valid formulas. Hence, in this way only logically valid formulas can be proved. Still, is our list of logical axioms complete in the sense that all logically valid formulas can be proved? The answer is "yes" - as Kurt Goedel established in 1929 (i.e. just a year BEFORE...):
K.Goedel. Die Vollstaendigkeit der Axiome des logischen Funktionenkalkuels. "Monatshefte fuer Mathematik und Physik", 1930, Vol.37, pp.349-360.
Goedel's Completeness Theorem. A formula (in any first order language) is logically valid, iff it can be proved by using the logical axioms and inference rules.
Unexpectedly, this theorem is an easy corollary of the so-called Model Existence theorem.
If T is a formal theory, and J is an interpretation of its language, then (traditionally) J is called a model of T, iff J makes true all axioms (and hence, all theorems) of T. Such notion of model seems somewhat strange: in "normal" branches of science theories serve as basis for building models of natural phenomena or technical devices, yet in mathematical logic we have turned this approach upside down.
Model Existence Theorem. If a (first order) formal theory is consistent (in the sense that it does not contain contradictions), then it has a finite or countable model (i.e. a model in which the domain is finite or countable).
This theorem solved a serious mental problem of anti-formalists. They thought that mere consistency of a theory (in the syntactic sense of the word - as the lack of contradictions) is not sufficient to regard theory as "meaningful". Model Existence theorem says that (syntactic) consistency of a theory is sufficient to regard it as "meaningful": if a theory does not contain contradictions, then it describes at least some kind of "mathematical reality".
See Mendelson [1997] for an elegant proof of Model Existence theorem (due to L.Henkin and G.Hasenjaeger, see also http://theory.lcs.mit.edu/~dmjones/hbp/bsyml/Authors/henkinleon.html). Or do the Exercise A.1.1 below.
Proof of the Completeness theorem. Of course, the only non-trivial part of the work is proving that each logically valid formula in the (first order) language L can be proved by using the logical axioms and inference rules (of the Appendix 3).
Let us assume that some formula F in the language L is logically valid, yet it cannot be proved by using our axioms and rules. Let us consider the theory T in the language L which has (besides the logical axioms) only one specific axiom - the formula ~F. Since F cannot be derived from logical axioms, T is a consistent theory. Hence, by Model Existence theorem, T has a model, i.e. an interpretation J that makes all its axioms true. Under this interpretation the formula ~F (as an axiom of T) is true. On the other hand, since F is logically valid, it is true under all interpretations, i.e. it is true also under the interpretation J. Hence, formulas F and ~F both are true under J. This is impossible, hence, F must be provable from logical axioms. Q.E.D.
Such a simple proof seems almost impossible! We are proving that the logical axioms and rules of inference are strong enough, but where come these axioms in? They come in - in the proof of Model Existence theorem: this theorem says that if some formal theory T does not have models, then the logical axioms and rules of inference are strong enough to derive a contradiction from the axioms of T.
Corollary. In any first order language the set of all logically valid formulas is effectively enumerable. I.e. given a language L, we can write a computer program that (working ad infinitum) prints out all logically valid formulas of L.
This makes Goedel's completeness theorem very significant: it shows that the "doubly non-constructive" notion of logically valid formula is at least 50% constructive!
Still, unfortunately, this notion is not 100% constructive. In 1936, Alonzo Church proved that at least some first order languages do not have algorithm determining, is a given formula logically valid or not (i.e. an algorithm solving the famous Entscheidungsproblem - decision problem):
A.Church. A note on the Entscheidungsproblem. "Journal of Symb. Logic", 1936, vol.1, pp.40-41
After this, L.Kalmar proved that none of serious first order languages can have such an algorithm (languages of PA and ZF included):
L.Kalmar. Die Zurueckfuehrung des Entscheidungsproblems auf den Fall von Formeln mit einer einzigen, binaeren Funktionsvariablen. "Compositio Math.", 1936, Vol.4, pp.137-144
For details, see Mendelson [1997].
Initially, Model Existence theorem was proved in a weaker form in 1915 (by Leopold Loewenheim) and 1919 (by Thoralf Skolem): if a first order theory has a model, then it has a finite or countable model (the famous Loewenheim-Skolem theorem). Proof (1929): if T has a model, then T is consistent, i.e. T has a finite or countable model.
L.Loewenheim. Ueber Moeglichkeiten im Relativkalkuel. "Mathematische Annalen", 1915, Vol.76, pp.447-470 (see also http://thoralf2.uwaterloo.ca/htdocs/scav/lowen/lowen.html)
T.Skolem. Logisch-kombinatorische Untersuchungen ueber Erfuellbarkeit oder Beweisbarkeit mathematischer Saetze nebst einem Theoreme ueber dichte Mengen. "Skrifter utgit av Videnskapsselskapet in Kristiania, I, Mat.-Nat. Kl.", 1919, N4, pp.1-36 (see also http://thoralf2.uwaterloo.ca/htdocs/scav/skolem/skolem.html)
Model Existence theorem is steadily provoking the so-called Skolem's paradox. Indeed, in ZF we can prove existence of uncountable sets. Still, according to Model Existence theorem, if ZF is consistent, then there is a countable model of ZF. I.e. ZF proves existence of uncountable sets, yet it has a countable model! Is this possible? Does it mean that ZF is inconsistent? Platonists could say even more: any consistent axiomatic set theory has countable models, hence, no axiom system can represent the "intended" set theory (i.e. "the" Platonist "world of sets") adequately.
For a formalist, Skolem's paradox is not a paradox at all. I would rather call it Skolem's effect - like as photo-effect, it is simply a striking phenomenon. Indeed, let J be a countable model of ZF. In ZF we can prove that the set r of all real numbers is uncountable, i.e.
~Ef (f is 1-1 function from r into w), -----------(1)
where w is the set of all natural numbers. What is the meaning of this theorem in the countable model J? Interpretations of r and w are subsets of the domain DJ, i.e. they both are countable sets, i.e.
Ef (f is 1-1 function from rJ into wJ). ----------(2)
Interpretation of (1) in J is
~Ef((f in DJ) & (f is 1-1 function from rJ into wJ)).
Hence, the mapping f of (2) does exist, yet it exists outside the model J!. Do you think that f of (2) "must" be located in the model? Why? If you are living (as an "internal observer") within the model J, the set rJ seems uncountable to you (because you cannot find a 1-1 function from rJ into wJ in your world J). Still, for me (an an "external observer") your uncountable rJ is countable - in my world I have a 1-1 function from rJ into wJ!
Hence, indeed, Skolem's paradox represents simply a striking phenomenon. It is worth of knowing, yet there is no danger in it.
Another Platonist superstition is connected with the so-called categoricity theorem of second order Peano axioms. By second order Peano axioms I mean the initial variant of axioms of arithmetic proposed by R.Dedekind. Modern version of this system is represented in the axioms P1, P2 and P3 of Section 3.1. The notion of models for these axioms can be discussed comfortably within ZF as a metatheory. Namely, any such model (according to our general definition above in this Appendix) is a triple (v, q, s), where v is a set (its members represent "natural numbers" of the model), q is a member of v (it represents the number 0), and s(x) is a function from v into v (it represents the function x+1). The triple (v, q, s) is a model of Peano axioms P1, P2, P3, iff
P1: ~(s(x)=q) for all x in v.
P2: If ~(x=y), then ~(s(x)=s(y)) for all x, y in v.
P3: If u is a subset of v, then
(q in u) & Ay(y in u -> s(y) in u) -> u=v.
Of course, the set w of all "set-theoretical" natural numbers (see Section 2.3), together with the empty set (representing the number 0) and the function xU{x} (representing s(x)) is a model of Peano axioms. This model is called traditionally the standard model of arithmetic. Let us say, that some other model (v, q, s) is isomorphic with the standard model, iff there is a 1-1 function f from w onto v ("onto v" means that range(f)=v) such that:
a) f(o) = q.
b) f(nU{n}) = s(f(n)) for all n in w.
The following theorem can be proved in ZF:
Categoricity Theorem. Any model of second order Peano axioms is isomorphic with the standard model.
This theorem was first proved by R.Dedekind (the author of Peano axioms, see Section 3.1) in his remarkable book:
R. Dedekind. Was sind und was sollen die Zahlen. Braunschweig, 1888 (see also http://thoralf2.uwaterloo.ca/htdocs/scav/dedek/dedek.html)
Proof. Assuming that the axioms P1, P2, P3 are true in the model (v, q, s), let us define by recursion the following function from w into v:
f(o) = q, f({o}) = s(q), f({o,{o}}) = s(s(q)), ..., f(nU{n}) = s(f(n)), ...
Let us prove that f is the required isomorphism.
a) f(o) = q by definition.
b) f(nU{n}) = s(f(n)) for all n in w - also by definition.
c) Let us show that range(f)=v. Of course, q is in range(f), and if some x is in range(f) (i.e. x=f(n) for some n), then s(x)=s(f(n))=f(nU{n}), i.e. s(x) also is in range(f). Hence, by P3 (this axiom is true in the model (v, q, s)) we obtain that range(f)=v.
d) Let us show that f is 1-1 function, i.e. let us prove that, if f(m)=f(n), then m=n. We must consider three cases:
d1) m=n=o. Q.E.D.
d2) m=o, n>o. Then f(m)=q, but f(n)=s(f(n-1)), i.e. f(n) is not q by the axiom P1. Q.E.D.
d3) m>o, n>o. Then f(m)=s(f(m-1))=f(n)=s(f(n-1)), and by the axiom P2 we obtain that f(m-1)=f(n-1). Let us repeat this argument enough times, and we will have the case d1) or d2 at the end. Q.E.D.
Q.E.D.
Thus, it seems that the second order Peano axioms contain an "unambiguous definition" of the structure of their models. For this reason, sometimes, Categoricity theorem is considered as an additional evidence in favor of the Platonist opinion that natural numbers exist as a specific "world" where each definite assertion "must be" either true or false. Still, note that Categoricity theorem is a theorem of ZF. How could a theorem of ZF have "super-natural" consequences?
Exercise A.1.1. (for smart students). Prove Model Existence theorem by using the following smart ideas (see Mendelson [1997]). Let T be a consistent theory. We must build a model of T, what kind of "bricks" should we use for this "building"? Idea #1: let us use language constant letters! So, let us add to the language of T an infinite set of new constant letters b1, b2, b3, ... (and modify the logical axioms accordingly). Prove that this extended theory T0 is consistent. The model we are building must contain all "objects" whose existence can be proved in T0. Idea #2: for each formula F of T0 having exactly one free variable (for example, x) let us add to the theory T0 an axiom ExF(x)->F(bi), where the constant bi is unique for each F. If T0 proves ExF(x), then this bi will represent in our model the "object" x having the property F. Prove that this extended theory T1 is consistent. Idea #3: prove the (non-constructive) Lindenbaum's lemma: any consistent theory has a consistent complete extension (the axiom set of the extension may not be effectively solvable). After this, extend T1 to a consistent complete theory T2. Idea #4: let us take as the domain of the interpretation M the set of all those terms of T0 that do not contain variables. And let us interpret a function letter f as the "syntactic constructor function" f', i.e. let define the value f'(t1, ..., tn) simply as the character string "f(t1, ..., tn)". Finally, let us interpret a predicate letter p as the relation p' such that p'(t1, ..., tn) is true in M, iff T2 proves p'(t1, ..., tn). To complete the proof, prove that an arbitrary formula G is true in M, iff T2 proves G. Hence, all theorems of the initial theory T are true in M.
(Adolf Lindenbaum - logico, collaboratore e amico di Alfred Tarski - according to Famous mathematicians.)
The attitude of many working mathematicians to Goedel's incompleteness theorem is generally indifferent. Some methodological basis for such a position is given in the following quote from Parikh [1971]:
"... Thus exponentiation is not only a means for denoting "large numbers" but also the means for introducing "nonmathematical" questions into number theory. Why do we say "nonmathematical"? Because consider Goedel's formula A which says "I am not provable". Now this formula does express properties of N, since it can be written with quantifiers and connectives. However to see that A is true but not provable, we do not use properties of N, but properties of the intuitive notion "provable". Thus to say that A is a statement about numbers is like arguing that human behaviour is a problem of physics since human beings are physical entities. Even if such an assertion is true, it is very theoretical and not very useful."
Since human beings are physical entities, I do not believe that there are ghosts, I think that astrology is nonsense, etc. Hence, for me, this assertion is very practical and very useful.
So is Goedel's incompleteness theorem. For me, this theorem predicts that a fundamental mathematical theory cannot be perfect: while developing any such theory we will inevitably come either into contradictions, or into unsolvable problems. Is such a prediction practical? Some 30 years after Goedel's proof, in 1963 P.Cohen proved that if ZFC is a consistent theory, then this theory is not able to solve Cantor's continuum problem. If you prefer calling Goedel's theorem "very theoretical", then Cohen's result must be acknowledged as its "empirical confirmation". Since 1963 many classical problems of set theory were proved to be unsolvable, so we can speak about "massive empirical confirmation" of Goedel's theoretical prediction. Can we exclude that one day also some classical problems of number theory (for example, the twin prime conjecture) will be proved unsolvable?
Perhaps, the first steps in this direction were made before 1977 by Laurence Kirby, Jeff Paris and Leo Harrington. They proved that an extension of the so-called Finite Ramsey's theorem (a statement of discrete combinatorics that can be proved in set theory) cannot be proved in first order arithmetic (PA). Thus, for the first time, it was established that a relatively interesting assertion about natural numbers is unprovable when we are using the elementary (i.e. first order) notion of natural numbers. Still, using the extended "post-Cantor" (i.e. second order) notion of natural numbers we can prove this assertion.
Infinite Ramsey's Theorem
Frank P. Ramsey published this theorem in 1930:
F.P.Ramsey. On a problem of formal logic. "Proc. London Math. Soc.", 1930, vol.30, pp.264-285
Infinite Ramsey's theorem. Suppose, M is an infinite set, and e, r are positive integers. Imagine that each e-member subset of M is marked by one of r colors. Then there is an infinite subset of M such that all its e-member subsets are marked by the same color.
Proof (in ZFC, i.e. by using the axiom of choice, see Graham [1981]).
1. For e=1 the proof is obvious. Indeed, if M is an infinite set, and each member of it is marked by one of r colors, then at least one of the colors is assigned to an infinite subset of members. Q.E.D.
2. For e=2 the proof is straightforward. Here, M is an infinite set, and each pair of its members {a, b} is marked by one of r colors. Let us select a member b0 of M, and consider all pairs {b0, b} where b is in M-{b0}. There is a color c0 such that the set
M1 = { b in M | {b0, b} is marked by c0}
is infinite. As the next step, let us a select member b1 of M1, and consider all pairs {b1, b} where b is in M1-{b1}. There is a color c1 such that the set
M2 = { b in M1-{b1} | {b1, b} is marked by c1}
is infinite. Etc.
As the result, we obtain three infinite sequences:
- the sequence of members of M: b0, b1, b2, ...,
- the sequence of colors: c0, c1, c2, �,
- the sequence of subsets: M = M0 > M1 > M2 > ...,
where each bi is in Mi - Mi+1, and all pairs {bi, b} with b in Mi+1 are marked by the color ci.
One of the colors (let us denote it by c) occurs an infinite number of times in this sequence: c = ci for i = i0, i1, i2, .... The set of corresponding members:
H = { bi | i = i0, i1, i2, ...}
is an infinite subset of M, and all pairs {a, b} with a, b in H are marked by the same color c. Indeed, if a = bi and b = bj, where i = ik, j = im, and k<m, then a is in Mi - Mi+1, and b is in Mi+1, hence, {a, b} is marked by the color ci, i.e. by c. Q.E.D.
3) For e>=3 the proof is by induction, i.e. by using the following
Lemma. Suppose, M is an infinite set, and e, r are positive integers. Imagine that each e-member subset of M is marked by one of r colors. If the Infinite Ramsey's theorem is true for e-1, then for each infinite subset M' of M and each member b' of M' there is an infinite subset H' of M' - {b'} such that all e-member sets consisting of b' and e-1 members of H' are marked by the same color.
Proof of the Lemma. Let us derive from the "r-coloring" of e-member subsets of M the following "r-coloring" of (e-1)-member subsets of M' - {b'}:
Mark {x1, ..., xe-1} by the color c, iff {b', x1, ..., xe-1} is marked by the color c.
From the Infinite Ramsey's theorem for e-1 we obtain an infinite subset H' of M' - {b'} such that all its (e-1)-member subsets are marked by the same color. Now add b' to each of these subsets. Q.E.D.
Having this Lemma we can derive Ramsey's theorem for e from Ramsey's theorem for e-1 by repeating the argument we used for e=2. Indeed, our Lemma allows obtaining an infinite subset Mi+1 of Mi - {bi} such that all e-member subsets of Mi+1U{bi} containing bi are marked by the same color (denoted by ci). Q.E.D.
Our formulation and proof of the Infinite Ramsey's theorem belong to the set theory ZFC. The language of the first order arithmetic (PA) does not allow discussing arbitrary infinite sets, i.e. in PA we cannot even formulate this version of Ramsey's theorem.
Open problem? Can the Infinite (or, at least, the Countable) Ramsey's theorem be proved in ZF, i.e. without the (countable?) axiom of choice (or DC)? Maybe, Ramsey's theorem is itself a kind of "axiom of choice"?
Finite Ramsey's Theorem
The following finite version of Ramsey's theorem can be both formulated and proved in PA. Having an infinite set M we were searching for an infinite "single color" subset H. Now, dealing with finite sets M we are interested in the following question: how large must be the set M to have "single color" subsets with at least k members?
Let us denote by |M| the cardinality (i.e. number of members) of M.
Finite Ramsey's theorem. There is a computable function R(e, r, k) such that for all positive integers e, r, k and each finite set M the following holds: if |M|>=R(e, r, k), and each e-member subset of M is marked by one of r colors, then there is a subset H of M such that |H|=k, and all e-member subsets of H are marked by the same color.
Proof (in PA). 1) For r=1 the proof is obvious: we can take R(e, 1, k) = k.
2) Now let us consider the case r=2, when e-member subsets of M are marked by two colors. Surprisingly, the following generalization of the theorem is easier to prove: there is a computable function R'(e, r, k1, k2) such that if |M|>=R'(e, 2, k1, k2), then there is either a subset H1 of M such that |H1|=k1, and all e-member subsets of H1 are marked by the first color, or a subset H2 such that |H2|=k2, and all e-member subsets of H2 are marked by the second color.
The proof is by induction from e-1, (e, k1-1, k2), and (e, k1, k2 -1) to (e, k1, k2).
Induction base. For e=1 we can take R'(1, 2, k1, k2) = k1+k2. Indeed, in this case the-members of M themselves are marked by using two colors.
For the minimum k1, i.e. k1=e we can take R'(e, 2, e, k2) = k2 (where k2>=e). Indeed, if |M|>=k2, and there is an e-member subset x marked by the first color, then we can take H1 = x. If there are no such subsets, then all e-member subsets of M are marked by the second color, and we can take H2 = M.
For the minimum k2, i.e. k2=e we can take R'(e, 2, k1, e) = k1. The proof is identical.
Induction step. Let k1, k2 >= e. Let us show that we can take
R'(e, 2, k1, k2) = 1+R'(e-1, 2, R'(e, 2, k1-1, k2), R'(e, 2, k1, k2-1)).
Indeed, suppose |M| >= R'(e, 2, k1, k2), select a member b of M, and consider all e-member subsets of M that contain b: {b, x1, ..., xe-1}. Each of these subsets is marked either by the first, or by the second color. Let us define the following "2-coloring" of (e-1)-member subsets of M -{b} (i = 1, 2):
{x1, ..., xe-1} is marked by i-th color, iff {b, x1, ..., xe-1} is marked by i-th color.
Since |M - {b}| >= R'(e-1, 2, T1, T2), where T1 = R'(e, 2, k1-1, k2), and T2 = R'(e, 2, k1, k2-1), then by our modified theorem for e-1 we obtain a subset M' of M - {b} such that:
a) Either |M'| = T1 and all (e-1)-member subsets of M' are marked by the first color.
b) Or |M'| = T2 and all (e-1)-member subsets of M' are marked by the second color.
In the case a), for all subsets {x1, ..., xe-1} of M' the e-member set {b, x1, ..., xe-1} is marked by the first color. Since |M'| = T1 = R'(e, 2, k1-1, k2), by our modified theorem for (e, k1-1, k2) we obtain a subset H' of M' such that:
a1) Either |H'| = k1-1 and all e-member subsets of H' are marked by the first color. Then for the case (e, k1, k2) we can take H = H'U{b}.
a2) Or |H'| = k2 and all e-member subsets of H' are marked by the second color. Then for the case (e, k1, k2) we can take H = H'.
The proof for the case b) is similar.
To complete the case r=2 of the Finite Ramsey's theorem we can take
R(e, 2, k) = R'(e, 2, k, k).
Q.E.D. for r=2.
3) The case r>2 we will prove by induction, namely by showing that we can take
R(e, r, k) = R(e, 2, R(e, r-1, k)).
Indeed, assume that |M| >= R(e, r, k) and all e-member subsets of M are marked by using r colors. To reduce the situation to the case r=2 let us "merge" the second and all the following colors (i.e. except the first one). Then, by the case r=2 we obtain a subset M' of M such that |M'| = R(e, r-1, k) and:
a) Either all e-member subsets of M' are marked by the first color.
b) Or all e-member subsets of M' are marked by the second (i.e. "merged") color.
In the case a), since R(e, r-1, k) >=k, we obtain immediately a subset H of M' such that |H| = k and all e-member subsets of H are marked by the first color. Q.E.D.
In the case b) we have an "(r-1)-coloring" of all e-member subsets of M'. Since |M'| = R(e, r-1, k), we have the case r-1 of the Finite Ramsey's theorem that is supposed to be true, i.e. there is a subset H of M' such that |H| = k and all e-member subsets of H are marked by the same color. Q.E.D.
Q.E.D. for the entire Finite Ramsey's theorem.
It may seem that the Finite Ramsey's theorem is discussing arbitrary finite sets, not natural numbers. Still, since this theorem does not involve properties of members of finite sets, we can simply replace these-members by natural numbers. Each finite set of natural numbers {n1, ..., nk} we can represent by two numbers (b, c) (we could use, for example, Goedel's beta-function, see Section 3.3):
beta(b, c, 0) = k, beta(b, c, 1) = n1, ..., beta(b, c, k) = nk.
In this way the Finite Ramsey's theorem can be formulated in the language of PA, and our very elementary proof of it - converted into a formal proof in PA.
Extended Finite Ramsey's Theorem
Now we have two extreme versions of Ramsey's theorem:
a) The infinite version that can be neither formulated, nor proved in PA, yet it can be both formulated and proved in ZFC.
b) The finite version that can be both formulated and proved in PA.
In 1977 an intermediate version of Ramsey's theorem was discovered (= invented) that can be formulated in PA, and can be proved in ZFC, yet not in PA (if PA is consistent).
L.Kirby, J.Paris. Initial segments of models of Peano's axioms. "Proceedings of the Bierutowice Conference 1976", Springer, Berlin, 1979
J.Paris. Independence results for Peano arithmetic. "J. Symbolic Logic", 1978, vol.43, N4, pp.725-731
J.Paris, L.Harrington. A mathematical incompleteness in Peano arithmetic. In Barwise [1977].
The authors are telling their story in the third of these papers:
"The first examples of strictly mathematical statements about natural numbers which are true but not provable in PA (Peano arithmetic) were due to the first author (see Paris [to appear], and grew out of the work in Paris and Kirby [to appear]. The second author's contribution was to show that Paris's proof could be carried through with the particularly simple extension of the Finite Ramsey Theorem..."
If the finite sets discussed in the Finite Ramsey's theorem would come from some fixed countable "universe" (for example, if we decided to consider only sets of natural numbers), then we could not restrict ourselves to counting of members of these sets. And we could consider also some other properties of them.
For example, let us call a property g of finite sets of some "universe" U a dense property, iff:
a) If a finite set H1 possess the property g, and H1 is a subset of a finite set H2, then H2 also possess the property g.
b) For each infinite set H2 there is a finite subset H1 that possess the property g (this is the "density" of g).
A simple example of a dense property of sets of natural numbers is the so-called property of being "relatively large": a set H of natural numbers is called relatively large, iff min(H) <= |H|.
Indeed:
a) If min(H1)<=|H1|, and H1 is a subset of H2, then min(H2)<=min(H1)<= |H1|<=|H2|, i.e. min(H2)<=|H2|.
b) If H2 is an infinite set of natural numbers, take as H1 the set of min(H2) least members of H2. Then min(H1)=min(H2)=|H1|, i.e. H1 is relatively large.
This is an example of a computable dense property: having all members of some finite set, we can effectively decide, possess this set the property or not.
Extended Finite Ramsey's theorem. Let us consider only sets of natural numbers. For each computable dense property g there is a computable function Rg(e, r, k) such that for all positive integers e, r, k and each finite set M the following holds: if |M|>=Rg(e, r, k), and each e-member subset of M is marked by one of r colors, then there is a subset H of M such that |H|>=k, H possess the property g, and all e-member subsets of H are marked by the same color.
This theorem differs from the Finite Ramsey's theorem only "a little" - additionally, the set H possess the property g.
Proof - part1 (in PA). Since properties of members of the set M do not affect the problem to be solved, we can replace M, for example, by the set of natural numbers {0, 1, ..., |M|-1}, i.e. in terms of Section 2.3 we can say that M is a natural number.
For M=e there is only one e-member subset {0, 1, ..., e-1}. The number of possible "r-colorings" of this subset is r.
Suppose, we have a pair (M, C), where C is some r-coloring of e-member subsets of M. Of course, for each triple (e, r, M) there is only a finite number of different r-colorings of e-member subsets of M. Hence, if we proceed from M to M+1, where
M+1 = {0, 1, ..., M-1, M} = M U {M},
then from the r-coloring C we can obtain only a finite number of r-colorings (of e-member subsets) of M+1 that are extensions of C. (Some coloring C' of M+1 is called an extension of C, iff each e-member subset of M is marked in C' by the same color as it is marked in C.)
|-------(e, P0')------ ...
|
|-------(e, P0'')------ ...
|
|------- ...
|
|-------(e, P0(i))------ ... --------(M, P)-------|-------(M+1, P(j)) -----------
|
|------- ...
|
|-------(e, P0(r))------ ...
If we proceed in this way from e to e+1, after this - to e+2, e+3 etc., then we obtain an infinite tree of pairs (M, C) having at each of its nodes only a finite number of branches. Indeed, let us start from a fictive empty node O. At the next level (M=e) we have r branches to r different r-colorings of the only e-member subset of e. Etc., from each node (M, C), where C is an r-coloring of e-member subsets of M, a finite number of branches is starting to all the possible nodes (M+1. C') such that C' is an r-coloring of e-member subsets of M+1 that extends C.
Of course, (for fixed e and r) this tree contains all the possible pairs (M, C), where C is an r-coloring of e-member subsets of M, and it defines some natural ordering of them.
Let us say that (M, C) is a "good" node, if there is a subset H of M such that |H| >=k, H possess the property g, and all e-member subsets of H are marked (in the coloring C) by the same color.
Exercise A.2.1. Verify that if (M, C) is "good", and there is a branch from (M, C) to (M+1, P'), then (M+1, P') also is "good". I.e. if some node is "good", then then the entire subtree of it is "good".
Since each level of the tree contains only a finite number of nodes, the Extended Finite Ramsay's theorem would be proved, if we could prove that there is only a finite number of "bad" nodes. Indeed, then we could produce the following algorithm computing the function Rg(e, r, k). Let us scan all levels of the tree one by one, determining for each node, is it "good" or "bad". If there is only a finite number of "bad" nodes, then at some level all nodes will be "good". Let us take the level number M for the value of Rg(e, r, k).
Exercise A.2.2. Describe an algorithm determining, is a given tree node "good" or "bad". How much time is required to solve this task?
Of course, we do not need set theory to define the above algorithm. Indeed, let us repeat its definition once more:
Input: numbers e, r, k. Build the corresponding (M, C)-tree. Scan all levels of this tree one by one, determining for each node, is it "good" or "bad". If at some level all nodes are "good", take the level number M and output it as the value Rg(e, r, k).
Hence, no problem to write a computer program that takes numbers e, r, k as input, and either calculates the number Rg(e, r, k) as output, or ... does not halt (if there are no tree levels with "good" nodes only).
Surprisingly, we need set theory to prove that this program halts for all triples (e, r, k).
Proof - part2 (in ZFC). Let us assume the opposite - that there is an infinite number of "bad" nodes. If there is a branch from (M, C) to (M+1, C'), and the node (M+1, C') is "bad", then (M, C) also is "bad". Hence, the substructure of "bad" nodes in our tree is itself a tree - a finitely branching infinite tree.
Exercise A.2.3. Prove the following version of the so-called Koenig's lemma: if a finitely branching tree has infinite set of nodes, then this tree contains an infinite branch.
Hence, our tree contains an infinite branch B consisting of "bad" nodes only. This branch defines a single r-coloring C'' of all e-member sets of natural numbers. Indeed, if {x1, ..., xe} is a set of natural numbers, then take M = max{x1, ..., xe}, consider the node (M, CM) of the branch B, and mark the set {x1, ..., xe} (in the coloring C'') by the color it is marked in the coloring CM. This definition of C'' is "stable" in the sense that on the branch B the coloring CM is the first one that assigns a color to the set {x1, ..., xe}, and all the following colorings CM+1, CM+2, ... cannot change this color, since they all are extensions of CM. I.e. C'' is an extension of CM for all M.
Let us apply the Infinite Ramsey's theorem to the set N of all natural numbers and the r-coloring C''. I.e. there is an infinite subset H'' of N such that all e-member subsets of H'' are marked by the same color. Since g is a dense property, there is a finite subset H of H'' that possess the property g. Let us add to H enough members of H'' to ensure that |H| >= k. This extended set also possess the property g. If we take M = max(H)+1, then H appears to be a subset of M such that: |H| >= k, H possess the property g, and all e-member subsets of H are marked by the same color in the coloring CM. Hence, (M, PM) is a "good" node - on the branch B consisting of "bad" nodes only!
I.e. our tree always contains only a finite number of "bad" nodes. And hence, our algorithm computing the function Rg(e, r, k) halts for all triples (e, r, k). Q.E.D.
Extended Finite Ramsey's theorem cannot be proved in PA
For any computable dense property g (such as min(H) <= |H|) the Extended Finite Ramsey's theorem can be formulated in PA. We have proved this theorem in ZFC, yet it cannot be proved in PA (if PA is consistent) even for any single property g.
Proof. See the above paper by Paris and Harrington.
Thus, since 1977 we know an example of a "strictly mathematical statement about natural numbers" that cannot be proved in first order arithmetic. And since 1977, some similar results were established: see Natural Independence Phenomenon in Eric Weisstein's World of Mathematics, and Goodstein's Amazing Sequence.
All this means that Greeks having only their first order notion of natural numbers could not prove the Extended Finite Ramsey's theorem and some other "strictly mathematical statements about natural numbers". These proofs became possible only in 1870s when Georg Cantor invented set theory. By introducing the notion of arbitrary infinite sets Cantor added new features also to the 2400 years old notion of natural numbers. Q.E.D.
Now let us return to the beginning of this Appendix where the problem of introducing "nonmathematical" questions into number theory was discussed. If you believe that formulas of first order arithmetic (PA) used in Goedel's proofs are not normal mathematical statements about natural numbers, then what would you say about the following theorem from the same famous paper by Paris and Harrington?
Traditionally, the so-called SIGMA1-formulas are defined as formulas of PA having the form Ex1...ExnF(...), where F belongs to the class of the so-called "primitive recursive" formulas, and all quantifiers before F are existential. Still, as we have proved in Section 4, any such formula has a Diophantine representation
Ex1..ExnEy1...Eyk P = 0
where P=0 is a Diophantine equation. Since our proof can be formalized in PA, we can define SIGMA1-formulas simply as Diophantine representations, i.e. as Diophantine equations preceeded by existential quantifiers.
The following statement can be formulated in the language of PA:
"For all SIGMA1-formulas F(x) having exactly one free variable x, if for each n PA proves F(n), then AxF(x)"
This statement is called the uniform SIGMA1 reflection principle for PA. This principle says that if PA proves all cases of some SIGMA1-formula F(x), then F(x) is true for all x, i.e. it asserts the soundness of PA in the area of SIGMA1-formulas. Of course, this principle can be proved in ZF by using the standard model of PA (see Appendix 1). Still, it cannot be proved in PA, moreover, it cannot be proved even in PA + Con(PA) (if this extended theory is consistent, see the chapter about incompleteness theorems in Barwise [1977]). Hence, the uniform SIGMA1 reflection principle for PA is a stronger hypothesis than the hypothesis "PA is consistent".
Is the uniform SIGMA1 reflection principle for PA (as a formula of first order arithmetic) an example of introducing "nonmathematical" questions into number theory? Of course, it is. Is the Extended Finite Ramsey's theorem an example of a "strictly mathematical statement about natural numbers"? Of course, it is. Still, one can prove the following
Theorem. It can be proved in PA, that the Extended Finite Ramsey's theorem (for any fixed property g) is equivalent to the uniform SIGMA1 reflection principle for PA!
Proof. See the above paper by Paris and Harrington.
A deadlock? Not for me. I find more interesting the conclusion that Extended Finite Ramsey's theorem cannot be proved not only in PA, but it cannot be proved also in PA+Con(PA).
The function R(e, r, k) from the Finite Ramsey's theorem is known as a very fast growing function (see Graham [1981]). Still, Rg(e, r, k) exceeds in this area any possible expectations. Namely, for any fixed dense property g the "diagonal" function Rg(k, k, k+1) is growing faster than any function f(k) such that
PA proves: AxEyF(x, y),
where formula F(x, y) represents f in PA (see Section 3.3). I.e. if you can prove in PA, that your algorithm for computing f(k) halts for all k, then f(k) as a function of k is growing slower than Rg(k, k, k+1).
Proof. See the above paper by Paris and Harrington.
model theory, Skolem paradox, Ramsey theorem, Loewenheim, categorical, Ramsey, Skolem, G�del, completeness theorem, categoricity, Goedel, theorem, completeness, Godel
Back to title page.