Clause Forms of Propositional Formulas
Which form is more "natural" - DNF, or CNF? Of course, CNF is more natural. Indeed, a DNF D1vD2v ... vDm asserts that one (or more) of the formulas Di is true. This is a very complicated assertion - sometimes D1 is true, sometimes D2 is true, etc. But what if we have a CNF instead - C1&C2& ... &Cn? It asserts that all the formulas Ci are true, i.e. we can replace the long formula C1&C2& ... &Cn by a set of shorter formulas C1, C2, ..., Cn. For human reading and for computer processing, a set of shorter formulas is much more convenient than a single long formula.
Let us return to our example formula ((A->B)->C)->(C->B) of Section 5.2, for which we obtained a DNF
(~A&~C)v(B&~C)v(~C)v(B).
and a CNF:
(~AvBv~C)&(Bv~C).
Without a transformation, the above DNF is hard to read, understand and analyze. The CNF is more convenient - it says simply that ~AvBv~C is true and Bv~C is true.
As a further step towards making the formulas easier to understand, we could apply the following equivalences:
[L1-L11, MP]: |- ~AvB <->A->B,
[L1-L11, MP]: |- ~Av~BvC <-> A&B->C,
[L1-L11, MP]: |- ~AvBvC <-> A->BvC,
[L1-L11, MP]: |- ~Av~BvCvD <-> A&B->CvD,
etc.
Exercise 5.4.1. Verify these equivalences by proving that, generally,
[L1-L11, MP]: ~A1v~A2v ... v~AmvB1vB2v ... vBn <-> (A1&A2& ... &Am -> B1vB2v ... vBn).
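As a quick truth-table sanity check of this equivalence (not a substitute for the proof in [L1-L11, MP] asked for in the exercise), both sides can be compared under all truth assignments by brute force. A minimal Python sketch (the function name is ours):

from itertools import product

def equivalence_holds(m, n):
    # Compare ~A1v...v~AmvB1v...vBn with A1&...&Am -> B1v...vBn under
    # every truth assignment. Empty disjunctions count as false, empty
    # conjunctions as true.
    for values in product([False, True], repeat=m + n):
        a, b = values[:m], values[m:]
        lhs = any(not x for x in a) or any(b)
        rhs = (not all(a)) or any(b)
        if lhs != rhs:
            return False
    return True

assert all(equivalence_holds(m, n) for m in range(4) for n in range(4))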
Thus, we can replace our set of two formulas ~AvBv~C and Bv~C by
A&C->B,
C->B.
The conjunction of these two formulas is equivalent to the initial formula ((A->B)->C)->(C->B).
Note. Of course, in this particular case (not in general!), we could drop the first clause (because it is a consequence of the second one).
Formulas having the form
A1&A2& ... &Am -> B1vB2v ... vBn,
or, alternatively,
~A1v~A2v ... v~AmvB1vB2v ... vBn,
where A1, A2, ... , Am, B1, B2, ... , Bn are atoms, are called clauses. Clauses are very well suited for computer processing. Indeed, in the computer memory, we can represent the formula A1&A2& ... &Am -> B1vB2v ... vBn simply as a pair of sets of atoms - negative {A1, A2, ... , Am} and positive {B1, B2, ... , Bn}.
What if one (or both) of these sets are empty?
If, in the formula ~A1v~A2v ... v~AmvB1vB2v ... vBn, we have m = 0 and n > 0, then, of course, this formula asserts simply that B1vB2v ... vBn, i.e. "converting" it into the implication -> B1vB2v ... vBn (with empty premise) leads us to the following definition: the clause -> B1vB2v ... vBn means the same as B1vB2v ... vBn.
If, in the formula ~A1v~A2v ... v~AmvB1vB2v ... vBn, we have m > 0 and n = 0, then, of course, this formula asserts simply that ~A1v~A2v ... v~Am, i.e. "converting" it into the implication A1&A2& ... &Am -> (with empty conclusion) leads us to the following definition: the clause A1&A2& ... &Am -> means the same as ~(A1&A2& ... &Am).
If, in the formula ~A1v~A2v ... v~AmvB1vB2v ... vBn, we have both m = 0 and n = 0, then we have an empty disjunction, i.e. an empty clause means "false", i.e. is equivalent to A&~A.
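As an illustration of this representation, here is a minimal Python sketch (the variable names are ours): a clause is a pair (negative, positive) of sets of atoms, and the empty-set cases above fit the same scheme.

# A clause A1 & ... & Am -> B1 v ... v Bn as a pair of frozensets of atoms.
a_and_c_implies_b = (frozenset({'A', 'C'}), frozenset({'B'}))  # A & C -> B
fact_b = (frozenset(), frozenset({'B'}))                       # -> B, i.e. B
not_a = (frozenset({'A'}), frozenset())                        # A ->, i.e. ~A
empty_clause = (frozenset(), frozenset())                      # "false"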
Note. Clauses are similar to sequents - pairs of sets of formulas (S1, S2), used in the proof of Theorem 4.4.5 (completeness of the constructive propositional logic) in Section 4.4. In a sequent (S1, S2), the sets S1, S2 could contain arbitrary formulas, but, in a clause, S1, S2 are sets of atoms.
Sets (i.e. conjunctions) of clauses are called clause forms (in some texts - clausal forms). By Theorem 5.2.1, every propositional formula can be reduced to a (possibly empty) CNF. Since every CNF can be converted into a clause form, we have established the following
Theorem 5.4.1. In the classical logic, every propositional formula can be reduced to a clause form. More precisely, assume that the formula F has been built of formulas B1, B2, ..., Bn by using propositional connectives only. Then there is a (possibly empty) clause form over B1, B2, ..., Bn such that, in [L1-L11, MP], the formula F is equivalent to the conjunction of the clauses contained in the form.
Exercise 5.4.2. Obtain clause forms of the formulas mentioned in the Exercise 5.2.2.
Clause forms are very well suited for computer processing. In the computer memory, every clause form can be represented as "a set of pairs of sets of atoms" - i.e. it means less character string processing and less expression parsing!
Clause Forms of Predicate Formulas
Of course (unfortunately), nothing comparable to clause forms can be obtained for predicate formulas, if we insist that the clause form must be equivalent to the initial formula. Still, reducing predicate formulas to "clause forms" becomes possible, if we drop this requirement and replace it by the requirement that the "clause form" must be satisfiable, iff the initial formula is satisfiable. And - if we allow extending the language by adding new constant letters and new function letters.
Then, by Skolem's Theorem (Theorem 5.3.1), for each closed formula F, we can obtain a Skolem normal form Ax1Ax2...Axk G, where k>=0, the formula G does not contain quantifiers, and this form is satisfiable, iff so is F. By Theorem 5.4.1, let us convert G into a clause form G' (with atomic sub-formulas of G playing the role of atoms B1, B2, ..., Bn). Since G' is equivalent to G, the formula Ax1Ax2...Axk G' is satisfiable, iff so is F.
The set of clauses G' is called the clause form of the formula F. For predicate formulas, clauses consist of atomic formulas, i.e. formulas having the form p(t1, ..., tm), where p is a predicate letter, and t1, ..., tm are terms (possibly, containing variables).
Thus, we have proved the following
Theorem 5.4.2. Let L be a first order language. There is an algorithm allowing us to construct, for each closed formula F of this language, a finite set S of clauses (in a language L' obtained from L by adding a finite set of new constant letters and new function letters - depending on F) such that F is satisfiable, iff the (closed) formula Ax1Ax2...Axn &S is satisfiable, where &S is the conjunction of the clauses contained in S, and x1, x2, ..., xn are all the variables appearing in the clauses.
Note. In most texts, the closed formula Ax1Ax2...Axn &S (i.e. where all the variables appearing in &S are universally quantified) is called the universal closure of &S. Thus, we could define clause forms of predicate formulas as universal closures of sets of clauses.
As an example, let us consider the formula asserting that there are infinitely many prime numbers:
"x is a prime number": x>1 & ~EyEz (y>1 & z>1 & x=y*z),
AuEx (x>u & x is a prime number),
AuEx (x>u & x>1 & ~EyEz (y>1 & z>1 & x=y*z)) --------------- (*)
Convert it into a prenex normal form:
AuEx (x>u & x>1 & AyAz ~(y>1 & z>1 & x=y*z)),
AuExAyAz (x>u & x>1 & ~(y>1 & z>1 & x=y*z)).
Replace AuEx by Au - introduce a Skolem function g:
AuAyAz (g(u)>u & g(u)>1 & ~(y>1 & z>1 & g(u)=y*z)).
In this Skolem normal form, convert the quantifier free part into a conjunctive normal form:
AuAyAz (g(u)>u & g(u)>1 & (~(y>1) v ~(z>1) v ~(g(u)=y*z))).
This formula is satisfiable, iff so is the initial formula (*).
Thus, we have obtained a set of 3 clauses:
g(u)>u,
g(u)>1,
~(y>1) v ~(z>1) v ~(g(u)=y*z).
or, alternatively,
-> g(u)>u,
-> g(u)>1,
y>1, z>1, g(u)=y*z ->.
This set of 3 formulas is the clause form of the formula (*).
Exercise 5.4.3. Obtain clause forms of the formulas mentioned in the Exercise 5.1.4 (assume that B, C, D, F are predicate letters).
Horn Clauses
Alfred Horn (1918-2001) - biography (in Bulgarian) published by Dimiter Skordev.
http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?Horn+clause
The name "Horn Clause" comes from the logician Alfred Horn, who first pointed out the significance of such clauses in 1951, in the article "On sentences which are true of direct unions of algebras", Journal of Symbolic Logic, 16, 14-21.
http://www.cs.ucsd.edu/users/goguen/courses/230/s6.html
As a footnote, Alfred Horn, for whom Horn clauses are named, had nothing to do with logic programming; he was a professor of logic at UCLA who in 1951 wrote paper using the sentences that now bear his name for reasons having little to do with computer science. As a second footnote, it seems to me rather misleading to call Prolog a "logic programming" language, since it departs rather far from logic; I would rather have had it called a "relational programming" language, because it is the use and manipulation of relations that is most characteristic of its programming style.
http://www.cs.fit.edu/~ryan/study/bibliography.html
Horn, Alfred. ``On sentences which are true of direct unions of algebras.'' Journal of Symbolic Logic, volume 16, number 1, March 1951, pages 14-21.
This paper has very little to do with Horn clauses.
To be continued.
History
J.A.Robinson. Theorem-proving on the computer. "Jour. Assoc. Comput. Mach.", vol.10, N2, 1963, pp.163-174
J.A.Robinson. A machine-oriented logic based on the resolution principle, "Jour. Assoc. Comput. Mach.", vol.12, N1, January 1965, pp.23-41 (Russian translation available: "Kib. sbornik (novaya seriya)", 7, 1970, pp.194-218)
John Alan Robinson: "Born in Yorkshire in 1930, Robinson came to the United States in 1952 with a classics degree from Cambridge University. He studied philosophy at the University of Oregon before moving to Princeton where he received his PhD in philosophy in 1956. Temporarily 'disillusioned with philosophy,' he went to work as an operations research analyst for Du Pont, where he learnt programming and taught himself mathematics. Robinson moved to Rice University in 1961, spending his summers as a visiting researcher at the Argonne National Laboratory's Applied Mathematics Division. Its then Director, William F. Miller, pointed Robinson in the direction of theorem proving...
Miller showed Robinson a 1960 paper by Martin Davis and Hilary Putnam (coincidentally, the latter had been Robinson's PhD supervisor) proposing a predicate-calculus proof procedure that seemed potentially superior to Gilmore's, but which they had not yet turned into a practical computer program. Miller suggested that Robinson use his programming skills to implement Davis and Putnam's procedure on the Argonne IBM 704. Robinson quickly found that their procedure remained very inefficient. However, while implementing a different procedure also suggested in 1960 by Dag Prawitz, Robinson came to see how the two sets of ideas could be combined into a new, far more efficient, automated proof procedure for first-order predicate logic: "resolution"..." (According to Donald MacKenzie, The Automation of Proof: A Historical and Sociological Exploration, "IEEE Annals of the History of Computing", vol.17, N3, 1995, pp. 7-29, see also http://dream.dai.ed.ac.uk/papers/donald/donald.html).
"In retrospect, unification and resolution seem rather obvious ideas, which arise inevitably when one asks what must be syntactically true of a set of clauses which possesses the semantic property of having no Herbrand models." (J.A.Robinson, "Unification and Resolution in Retrospect", 1997, see at http://www.univ-orleans.fr/SCIENCES/LIFO/Manifestations/Jfplc_Unif_97/jfplc/invite-francais.html).
Note. Almost at the same time when J.A.Robinson invented the resolution method, Sergei Maslov invented his inverse method, which has a similar range of applications:
S.Yu.Maslov. An inverse method of establishing deducibilities in the classical predicate calculus, "Soviet Mathematics, Doklady", 1964, N5, pp.1420-1423.
About the early history of the problem see
M.Davis. The Early History of Automated Deduction. In: Handbook of Automated Reasoning, ed. by A.Robinson and A.Voronkov, Elsevier Science, 2001, vol. I, pp. 3-15 (online postscript)
Method
Assume that, in a set of clauses, two clauses are contained such that an atom C appears as a positive member in the first clause, and as a negative member in the second one:
~A1v~A2v ... v~AmvB1vB2v ... vBnvC, ------------------ (1)
~Cv~D1v~D2v ... v~DpvE1vE2v ... vEq, ------------------ (2)
or, simply,
FvC, ------- (1a)
~CvG. --------- (2a)
If C is false, then (1a) yields F, and, if C is true, then (2a) yields G. Thus, from (1a) and (2a) we have derived FvG. I.e. deriving FvG from FvC and ~CvG is "logically correct", and it is called Robinson's resolution rule (J.A.Robinson proposed it in the above 1965 paper):
FvC, ~CvG
----------------
FvG
If F is empty, then from C, ~CvG (i.e. C, C->G), the resolution rule derives G, i.e. it includes Modus Ponens as a special case.
If G is empty, then from ~FvC, ~C (i.e. F->C, ~C), the resolution rule derives ~F, i.e. it includes Modus Tollens as a special case.
Exercise 5.5.1. Derive this inference rule in the constructive logic, i.e. prove that [L1-L10, MP]: FvC, ~CvG |- FvG. Verify that it cannot be proved in the minimal logic [L1-L9, MP]. (Hint: in the positive part, use Theorem 2.5.1(b) [L1, L2, L8, L10, MP]: FvC, ~C |- F. In the negative part, note that in Section 2.5 we established that, in the minimal logic, the rule FvC, ~C |- F allows proving of L10.)
Thus, from the clauses (1) and (2), Robinson's resolution rule allows deriving of the following clause:
~A1v~A2v ... v~Amv~D1v~D2v ... v~Dp v B1vB2v ... vBnvE1vE2v ... vEq. ------------ (3)
At first glance, this approach leads to nothing, because (3) is "much longer" than (1) and (2). Still, this is not 100% true: additionally, we can reduce the repeating atoms in ~A1v~A2v ... v~Amv~D1v~D2v ... v~Dp and in B1vB2v ... vBnvE1vE2v ... vEq; the set of atoms used in a clause form is fixed; and the atom C does not appear in (3) at all! Thus, of course, repeated application of the resolution rule cannot lead to an infinite growth of formulas.
The smart idea behind Robinson's resolution rule is: it is a universal tool for deriving contradictions from inconsistent sets of clauses! More precisely, it is universal, if used together with the following trivial inference rules:
FvCvDvG
----------------------------------- (Permutation),
FvDvCvG
FvCvCvG
---------------------------------- (Reduction).
FvCvG
The permutation rule allows arbitrary reordering of atoms in a clause (for example, moving C to right, and moving ~C to left). The reduction rule allows reduction of repeating identical atoms.
Exercise 5.5.2. Derive these inference rules in the minimal logic [L1-L9, MP].
Theorem 5.5.1 (J.A.Robinson). In the propositional classical logic [L1-L11, MP], a finite set of propositional clauses is inconsistent, iff Robinson's resolution rule (together with permutation and reduction rules) allows deriving a contradiction from it.
Note. In some other texts, this fact is called "the refutation-completeness of the resolution rule" for the propositional logic.
Proof. 1. As you have proved in the Exercises 5.5.1 and 5.5.2, all the formulas, derived from a set of formulas K1, K2, ... , Ks by using the permutation, resolution and reduction rules are consequences of K1, K2, ... , Ks. Hence, if these rules allow deriving a contradiction from this set of formulas, then it (the set) is inconsistent.
2. Now, let us assume that a set of propositional clauses K1, K2, ... , Ks is inconsistent, i.e. a contradiction A&~A can be derived from it:
[L1-L11, MP]: K1, K2, ... , Ks |- A&~A.
Then, under the classical truth tables, the conjunction K1&K2& ... &Ks takes only false values (verify!). Let us mark one of the atoms (the atom C) in it. Let us denote:
- By CvFi - the clauses containing C without negation,
- By ~CvGj - the clauses containing C with negation,
- By Hk - the clauses that do not contain C.
All the formulas Fi, Gj, Hk are disjunctions of atoms (with or without negations) that do not contain the atom C.
Thus K1&K2& ... &Ks is equivalent to
&(CvFi ) & &(~CvGj) & &Hk. ---------------------- (4)
Let us apply the "strange" one of the distribution rules (Theorem 2.3.1): [L1-L8, MP] |- (A&B)vC<->(AvC)&(BvC). Hence, K1&K2& ... &Ks is equivalent to
(C v &Fi ) & (~C v &Gj) & &Hk.
If C is false, then this formula is equivalent to &Fi & &Hk, i.e. &Fi & &Hk takes only false values. If C is true, then it is equivalent to &Gj & &Hk, i.e. &Gj & &Hk takes only false values. Thus the disjunction
(&Fi & &Hk) v (&Gj & &Hk) ---------------------- (5)
also takes only false values. Now, let us apply the "normal" one of the distribution rules (Theorem 2.3.1): [L1-L8, MP] |- (AvB)&C<->(A&C)v(B&C), obtaining that (5) is equivalent to
(&Fi v &Gj) & &Hk. -------------------- (6)
I.e. this formula also takes only false values. And - important note! - it does not contain the atom C.
Finally, by applying, again, the "strange" one of the distribution rules (Theorem 2.3.1): [L1-L8, MP] |- (A&B)vC<->(AvC)&(BvC), we can conclude that (6) is equivalent to &&(Fi v Gj) & &Hk, i.e. to the set of clauses Fi v Gj and Hk (where i, j, k run over their initial ranges).
What does this achievement mean? If the set of propositional clauses K1, K2, ... , Ks is inconsistent, then there is a set of clauses Fi v Gj and Hk (where i, j, k run over their initial ranges), which is inconsistent as well, but which contains one atom less than K1, K2, ... , Ks.
Now, imagine, that, in the clause form (4), we have applied the resolution rule for the atom C in all the possible ways (before applying, apply the permutation rule to reorder atoms moving C to right, and ~C - to left):
FivC, ~CvGj
------------------
FivGj
After this, apply the permutation and reduction rules to reduce identical atoms. In this way we have obtained exactly the above-mentioned inconsistent set of clauses Fi v Gj and Hk (where i, j, k run over their initial ranges).
Thus, if some set of propositional formulas K1, K2, ... , Ks is inconsistent, then the resolution rule (together with the permutation and reduction rules) allows deriving from it another inconsistent set of propositional formulas, which contains one atom less.
By iterating this process, we will finally arrive at an inconsistent set of propositional formulas built of a single atom B. In a clause form, there can be only one such set - the set B, ~B. This set represents a contradiction.
Q.E.D.
As an example, let us use Robinson's resolution rule to prove that
BvC, C->B, B->D |- B&D.
Let us add ~(B&D) to premises BvC, C->B, B->D. We must prove that this set of 4 formulas is inconsistent. First, let us obtain clause forms:
BvC in clause form is BvC,
C->B in clause form is ~CvB,
B->D in clause form is ~BvD,
~(B&D) is equivalent to ~Bv~D.
Now, let us apply resolution to derive a contradiction from this set of 4 clauses: BvC, ~CvB, ~BvD, ~Bv~D:
From BvC, ~CvB we derive B, and have now 5 clauses: BvC, ~CvB, ~BvD, ~Bv~D, B.
From ~BvD, ~Bv~D we derive ~B, and have now 6 clauses: BvC, ~CvB, ~BvD, ~Bv~D, B, ~B.
Thus, we have derived a contradiction: B, ~B. Q.E.D.
Exercise 5.5.3. Use the resolution rule to prove the following:
a) A->B, ~A->B |- B.
b) (A->B)->A |- A (Peirce's Law).
c) B->(C->D), B->C |- B->D (Axiom L2).
d) B->D, C->D |- BvC->D. (Axiom L8).
e) AvBvC, B->AvC, A->C |- C.
From a Programmer's Point of View
Of course, when implementing the resolution rule in a computer program, we do not need decorations like the permutation and reduction rules. In a program, we will represent each clause ~A1v~A2v ... v~AmvB1vB2v ... vBn as a pair of sets: negative atoms, N = {A1, A2, ... , Am}, and positive atoms, P = {B1, B2, ... , Bn}. Of course, the sets N, P do not intersect (if they do, then the clause contains ~CvCv..., i.e. it is a tautology, and can be dropped as "non-informative").
Resolution rule (non-refined version). If there are two clauses N1, P1 and N2, P2 such that P1 and N2 (or N1 and P2) contain a common atom C, then we can derive the clause N1UN2-{C}, P1UP2-{C}.
Of course, the set union operation includes reduction of identical members automatically.
The condition "P1 and N2 (or N1 and P2) contain a common atom C" can be expressed as "C in (Ni^Pj)U(Pi^Nj)", where ^ means set intersection.
If, in the resulting clause, the sets N1UN2-{C}, P1UP2-{C} intersect, then we should ignore such result. Fortunately, this can be detected in advance. Indeed,
(N1UN2)^(P1UP2) = (N1^P1)U(N1^P2)U(N2^P1)U(N2^P2) = (N1^P2)U(N2^P1),
because N1^P1 and N2^P2 are empty sets. The set (N1^P2)U(N2^P1) is exactly the set of all atoms C allowing application of the resolution rule to clauses N1, P1 and N2, P2. Hence, the sets N1UN2-{C}, P1UP2-{C} will not intersect, iff the set (N1^P2)U(N2^P1) contains exactly one atom C, i.e., iff there is exactly one atom allowing application of the resolution rule.
Resolution rule (refined version). If there are two clauses N1, P1 and N2, P2 such that the set (N1^P2)U(N2^P1) contains exactly one atom C, then we can derive the clause N1UN2-{C}, P1UP2-{C}.
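Here is a minimal Python sketch of the refined rule (the representation and the function name are ours): a clause is a pair (N, P) of frozensets of atoms.

def resolve(c1, c2):
    # Refined resolution rule: applicable only when exactly one atom C
    # occurs positively in one clause and negatively in the other.
    (n1, p1), (n2, p2) = c1, c2
    candidates = (n1 & p2) | (n2 & p1)   # atoms allowing resolution
    if len(candidates) != 1:
        return None                      # rule not applicable
    c = next(iter(candidates))
    return (frozenset((n1 | n2) - {c}), frozenset((p1 | p2) - {c}))

For example, from BvC and ~CvB, i.e. resolve((frozenset(), frozenset({'B', 'C'})), (frozenset({'C'}), frozenset({'B'}))), we obtain (frozenset(), frozenset({'B'})), i.e. the clause B.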
Now, let us try to design a program implementing the last step of "proving by resolution" - suppose, we have already the initial list of clauses, and we wish to apply the resolution rule trying to derive a contradiction.
The main data storage will be a growing list of clauses (the main list):
(N1, P1), (N2, P2), ..., (Nk, Pk), ...
It will start as the initial list, and each application of the resolution rule will append a new clause to it.
To guarantee success, we must apply the resolution rule in all the possible ways, i.e. we must scan all pairs of clauses (Ni, Pi)(Nj, Pj), where i = 1, 2, ...; j = i+1, i+2, ... To achieve this, let us use the following pair enumeration process:
(N1, P1)(N2, P2) - first, scan all pairs (i, j) with j=2, i<j.
(N1, P1)(N3, P3), (N2, P2)(N3, P3)- after this, scan all pairs (i, j) with j=3, i<j.
(N1, P1)(N4, P4), (N2, P2)(N4, P4), (N3, P3)(N4, P4) - after this, scan all pairs (i, j) with j=4, i<j.
Etc.
The process will stop, when we arrive at the level j, and the main list contains less than j (in fact, j-1) clauses. For a set of n atoms, there are only 3^n different clauses. For example, for two atoms A, B, there are 9 different clauses: ~Av~B, ~AvB, Av~B, AvB, ~A, A, ~B, B, and the empty clause (contradiction). I.e., if we prohibit duplicate clauses in the main list, then our process will always stop.
Thus, the following pseudo-code will do (no string processing, no expression parsing necessary!):
function propositional resolution (initial list of clauses)
begin
  if initial list contains contradiction then
    return TRUE { contradiction found }
  main list = eliminate duplicates (initial list)
  for j = 2 by 1
  begin
    if count (main list) < j then
      return FALSE { no contradiction derived }
    else
      for i = 1 to j-1 by 1
      begin
        { consider i-th and j-th clauses in the main list: (Ni, Pi), (Nj, Pj) }
        if (Ni^Pj)U(Pi^Nj) contains exactly one element C then { ^ means set intersection }
        begin
          { apply resolution }
          if (NiUNj-{C}, PiUPj-{C}) not in main list then
          begin
            add it to main list
            if main list contains contradiction then
              return TRUE { contradiction derived }
          end
        end
      end
  end
end
Exercise 5.5.4. Develop a computer program implementing the above pseudocode.
Note. See my version of the program in C++: header file, implementation, download the entire Borland C++ project (200K zip).
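For comparison, a minimal runnable Python sketch of the same algorithm (it reuses the resolve function from the sketch above; all names are ours, not taken from the C++ version):

def propositional_resolution(initial):
    # Return True iff a contradiction (the empty clause) is derivable.
    # Clauses are pairs (N, P) of frozensets; tautologies are dropped.
    empty = (frozenset(), frozenset())
    main = [c for c in dict.fromkeys(initial) if not (c[0] & c[1])]
    if empty in main:
        return True                      # initial list contains contradiction
    j = 1
    while j < len(main):                 # scan all pairs (i, j) with i < j
        for i in range(j):
            r = resolve(main[i], main[j])
            if r is None or r in main:
                continue
            if r == empty:
                return True              # contradiction derived
            main.append(r)
        j += 1
    return False                         # no contradiction derived

On the example above - BvC, ~CvB, ~BvD, ~Bv~D - the function returns True:

clauses = [(frozenset(), frozenset({'B', 'C'})), (frozenset({'C'}), frozenset({'B'})),
           (frozenset({'B'}), frozenset({'D'})), (frozenset({'B', 'D'}), frozenset())]
print(propositional_resolution(clauses))  # True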
Jacques Herbrand (1908-1931) "... After leaving Goettingen, Herbrand decided on a holiday in the Alps before his intended return to France. However he was never to complete his plans for he died in a mountaineering accident in the Alps only a few days after his holiday began. His death at the age of 23 was one of the tragic losses to mathematics." (according to MacTutor History of Mathematics archive).
Herbrand proved his famous theorem in 1929:
J.Herbrand. Recherches sur la théorie de la démonstration. Ph.D. Thesis, University of Paris, 1930 (approved in April 1929). (Buy the original at Simon Finch Rare Books Online).
Unlike the proof presented below, the original proof of Herbrand's Theorem does not depend on Goedel's Completeness Theorem (or Model Existence Theorem). Herbrand completed his Ph.D. thesis in 1929. In the same year, 1929, Goedel completed his doctoral dissertation about completeness (see Section 4.3). In fact, Herbrand's method allows proving of Goedel's Completeness Theorem, but he (Herbrand) "did not notice it". Why? See
Samuel R. Buss. On Herbrand's Theorem. "Lecture Notes in Computer Science", Vol.960, 1995, Springer-Verlag, pp.195-209 (see online at http://math.ucsd.edu/~sbuss/ResearchWeb/herbrandtheorem/).
The flavour of this famous theorem can be best presented in its simplest version. In this version, F(x) is a quantifier free formula containing only one variable x. Herbrand's Theorem says:
The formula ExF(x) is logically valid, iff there is a finite set of constant (closed) terms t1, ..., tn such that the disjunction F(t1)v...vF(tn) is logically valid.
Or, equivalently (via Goedel's Completeness Theorem),
The formula ExF(x) is provable in the classical logic, iff there is a finite set of constant (closed) terms t1, ..., tn such that the disjunction F(t1)v...vF(tn) is provable in the classical logic.
As we will see in the proof, Herbrand's Theorem is "caused" by the simple "fact" that in any proof of ExF(x) only a finite set of terms could be used.
Now, more precisely.
Let L be a first order language, containing at least one constant letter, and let F be a quantifier free formula.
Idea #1 (Author?). The formula p(c1) & q(c2, f(x)) is quantifier free (c1, c2 are constant letters, f - a function letter, p, q - predicate letters). In a sense, any "closed" interpretation domain for this formula must contain objects denoted by the terms c1, c2, f(c1), f(c2), f(f(c1)), f(f(c2)),...
So, let us define the so-called Herbrand's universe of the formula F (let us denote it by HUF) as the minimum set of all constant (closed) terms such that:
a) If c is a constant letter occurring in F, then c is in HUF.
b) If F does not contain constant letters, then one of the constant letters of the language L is in HUF.
c) If terms t1, ..., tk are in HUF, and f is a k-ary function letter occurring in F, then the term f(t1, ..., tk) is in HUF.
Exercise 5.6.1. Verify that HUF is a non-empty finite or countable set (provide an algorithm generating the members of HUF).
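Such an algorithm could look as follows - a minimal Python sketch (our names; the constant letters and function letters of F are passed in explicitly), generating the members of HUF up to a given nesting depth:

from itertools import product

def herbrand_universe(constants, functions, depth):
    # Generate all constant (closed) terms built from the given constant
    # letters and function letters (a dict name -> arity), up to the given
    # nesting depth. Iterating depth = 0, 1, 2, ... enumerates all of HUF.
    terms = set(constants)
    for _ in range(depth):
        new = set(terms)
        for f, arity in functions.items():
            for args in product(sorted(terms), repeat=arity):
                new.add('%s(%s)' % (f, ', '.join(args)))
        terms = new
    return terms

For the formula p(c1) & q(c2, f(x)) of Idea #1, herbrand_universe(['c1', 'c2'], {'f': 1}, 2) yields c1, c2, f(c1), f(c2), f(f(c1)), f(f(c2)).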
Theorem 5.6.1 (Herbrand's Theorem - the simplest case). Let L be a first order language, containing at least one constant letter, and let F(x) be a quantifier free formula containing only one free variable x. Then the formula ExF(x) is logically valid (i.e. provable in the classical predicate logic), iff there is a finite set of terms t1, ..., tn from HUF such that the disjunction F(t1)v...vF(tn) is logically valid (i.e. provable in the classical predicate logic).
Proof. Let us assume the contrary - that none of the disjunctions F(t1)v...vF(tn) is logically valid (ti-s are terms from HUF). Idea #2 - then the following theory T is consistent:
T = { ~F(t) | t is a term from HUF}.
Indeed, if T were inconsistent, then there would be a T-proof of some formula B&~B. In this proof, only a finite set of the axioms ~F(t) would be used, i.e. for some terms t1, ..., tn from HUF:
[L1-L15, MP, Gen]: ~F(t1), ..., ~F(tn) |- B&~B.
Hence, by Deduction Theorem 2 (it is applicable here, because F(x) contains only one free variable, and ti-s are constant terms, i.e. every ~F(ti) is a closed formula):
[L1-L15, MP, Gen]: |- ~F(t1)&... &~F(tn) -> B&~B,
[L1-L15, MP, Gen]: |- ~(F(t1)v... vF(tn)) -> B&~B,
and thus,
[L1-L15, MP, Gen]: |- F(t1)v... vF(tn).
I.e., F(t1)v... vF(tn) is logically valid. This contradicts our assumption, that none of the disjunctions F(t1)v...vF(tn) is logically valid. Hence, T is a consistent theory.
Idea #3 - if T is consistent, then, by the Model Existence Theorem, there is a model J of T. In this model, all the axioms of T are true, i.e. so are all the formulas ~F(t) with t from the set HUF.
Idea #4 - let us restrict the domain of the model J to those elements of it, which are interpretations of terms from the set HUF, and let us restrict the entire interpretation correspondingly. Let us denote this new interpretation by J1. Then,
a) All the formulas ~F(t) (with t from the set HUF) are true in J1. Indeed, ~F(t) contains only constant terms from HUF (idea #1 working!), and all of them have the same interpretations in J1 that they had in J. Thus, if ~F(t) was true in J, it remains true in J1.
b) Hence, the formula Ax~F(x) is true in J1 (because the domain of J1 consists only of those elements, which are interpretations of terms from the set HUF).
c) Hence, the formula ExF(x) is false in J1.
This contradicts the logical validity of ExF(x).
Q.E.D.
Exercise 5.6.2. Repeat the above proof, proving a more general form of Herbrand's Theorem:
Theorem 5.6.2 (Herbrand's Theorem - a more general case). Let L be a first order language, containing at least one constant letter, and let F(x1, ..., xm) be a quantifier free formula containing only m free variables x1, ..., xm. The formula Ex1...Exm F(x1, ..., xm) is logically valid, iff there is a finite set of m-tuples tt1, ..., ttn of terms from HUF such that the disjunction F(tt1)v...vF(ttn) is logically valid.
As you verified it in the Exercise 4.1.3, any formula G is logically valid, iff ~G is unsatisfiable. Thus, Ex1...Exm F(x1, ..., xm) is logically valid, iff Ax1...Axm ~F(x1, ..., xm) is unsatisfiable. On the other hand, F(tt1)v...vF(ttn) is logically valid, iff ~F(tt1)&...&~F(ttn) is unsatisfiable. Now, let us replace F by ~F, and we have proved
Theorem 5.6.3 (Herbrand's Theorem - a more useful alternative form). Let L be a first order language, containing at least one constant letter, and let F(x1, ..., xm) be a quantifier free formula containing only m free variables x1, ..., xm. The formula Ax1...Axm F(x1, ..., xm) is unsatisfiable (i.e. inconsistent in the classical logic), iff there is a finite set of m-tuples tt1, ..., ttn of terms from HUF such that the conjunction F(tt1)&...&F(ttn) is unsatisfiable (i.e. inconsistent in the classical logic).
Note. As you verified it in the Exercise 4.3.6, a set of formulas is inconsistent in the classical logic, iff it is unsatisfiable.
Why is this form "more useful"? Let us try applying this form of Herbrand's Theorem to sets of formulas in clause form.
1) The "meaning"of any set of closed formulas F1, ... , Fk is represented by their conjunction F1& ... &Fk.
2) A clause is any disjunction of atomic formulas or their negations. For example, ~p(c1) v p(c2) v q(x, f(y)), or p(x) v ~q(y, f(z)). The "meaning" of a set of clauses is represented by their universally quantified conjunction. For example, AxAyAz([~p(c1) v p(c2) v q(x, f(y))] & [p(x) v ~q(y, f(z))]).
3) As we know from the previous Section 5.4, the set F1, ..., Fk can be reduced to a clause form, i.e. there is a set of clauses S such that F1, ..., Fk is unsatisfiable, iff S is unsatisfiable.
Now, let us apply the above form of Herbrand's Theorem (Theorem 5.6.3). If S contains m variables (of course, all of them are universally quantified), then S is unsatisfiable, iff there is a finite set of m-tuples tt1, ..., ttn of terms from HUS such that the conjunction S(tt1)&...&S(ttn) is unsatisfiable.
If we take a clause from S, and substitute some terms from HUS for all its variables, then we obtain a (so-called) ground clause of S. For example, if
S = { ~p(c1) v p(c2) v q(x, f(y)); p(x) v ~q(y, f(z)) },
then the substitutions { f(c1) / x; c2 / y } (for the first clause) and { c1 / x; c2 / y; f(c2) / z } (for the second one) yield the following two ground clauses:
~p(c1) v p(c2) v q(f(c1), f(c2)),
p(c1) v ~q(c2, f(f(c2))).
Of course, the conjunction S(tt1)&...&S(ttn) is a set of ground clauses. Thus, if S is unsatisfiable, then there is an unsatisfiable finite set of ground clauses of S. And conversely?
If there is an unsatisfiable finite set C = {C1, ..., Cn} of ground clauses of S, then each Ci is generated by some substitution, which can be represented as an m-tuple tti of terms from HUS. If {C1, ..., Cn} is unsatisfiable, then {S(tt1), ..., S(ttn)} - as a super-set of the former - is unsatisfiable, too ("even more unsatisfiable").
Now, if S were satisfiable, then (because all the variables of S are meant to be universally quantified) so would be the formula S(tt1)& ...&S(ttn). Contradiction.
Thus, we have proved another form of Herbrand's Theorem.
Theorem 5.6.4 (Herbrand's Theorem - the most useful form. Author - Herbert B.Enderton?). Let L be a first order language, containing at least one constant letter, and let F1, ..., Fk be a set of closed formulas in L. Then this set is unsatisfiable, iff its clause form allows an unsatisfiable finite set of ground clauses.
Why is this form "the most useful"? Because (let us ignore performance problems),
a) The clause form of F1, ..., Fk is a finite set S, generated by a simple (but a very slow) algorithm (see Sections 5.1-5.4).
b) Herbrand's universe HUS is a finite or infinite set of constant terms, generated by a simple algorithm (see Exercise 5.6.1).
c) Thus, all the possible finite sets of ground clauses of S can be generated by a simple combination of the above two algorithms.
d) Unsatisfiability of each finite set of ground clauses can be detected by a simple (but a very slow) algorithm (see Lemma 5.6.5 below).
Thus, we have here a simple (but a very slow) algorithm for checking provability in the classical predicate logic.
Lemma 5.6.5. A finite set of ground clauses is unsatisfiable, iff the conjunction of these clauses is unsatisfiable under the classical truth tables.
Proof. In the above example of ground clauses:
~p(c1) v p(c2) v q(f(c1), f(c2)),
p(c1) v ~q(c2, f(f(c2))),
we have 4 different atoms: p(c1), p(c2), q(f(c1), f(c2)), q(c2, f(f(c2))). Let us denote these atoms by Q1, Q2, Q3, Q4. Thus we obtain the following propositional formula
(~Q1 v Q2 v Q3) & (Q1 v ~Q4).
1. If this formula cannot be satisfied under the classical truth tables, then we cannot assign truth values to predicates p, q in a way making all the corresponding clauses true. I.e. then the corresponding set of ground clauses also cannot be satisfied. Q.E.D.
2. If this formula can be satisfied under the classical truth tables, then we can find a truth value assignment making it true, for example:
Q1=false (this makes the first disjunction true),
Q4=false (this makes the second disjunction true).
Now, we can define the following interpretation J making the corresponding ground clauses true:
DJ = { c1, c2, f(c1), f(c2), f(f(c2)) } (the set of all terms appearing in the clauses, i.e. a subset of the Herbrand universe);
p(c1)=false, q(c2, f(f(c2)))=false (these assignments make both ground clauses true).
All the other truth values are irrelevant, so, we can define them, for example, as follows:
p(c2)=true, p(f(c1))=true, p(f(c2))=true, p(f(f(c2)))=true;
q(x, y)=true, if x is not c2, or y is not f(f(c2)).
Q.E.D.
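Lemma 5.6.5 suggests a brute-force satisfiability check for finite sets of ground clauses: treat each distinct ground atom as a propositional variable and try all truth assignments. A minimal Python sketch (our names; clauses are pairs of frozensets of ground atoms written as strings):

from itertools import product

def ground_clauses_satisfiable(clauses):
    # A clause (n, p) is true iff some negative atom is false or some
    # positive atom is true.
    atoms = sorted(set().union(*(n | p for n, p in clauses)))
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(any(not v[a] for a in n) or any(v[a] for a in p)
               for n, p in clauses):
            return True
    return False

For the example above, the two ground clauses are satisfiable:

example = [(frozenset({'p(c1)'}), frozenset({'p(c2)', 'q(f(c1), f(c2))'})),
           (frozenset({'q(c2, f(f(c2)))'}), frozenset({'p(c1)'}))]
print(ground_clauses_satisfiable(example))  # True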
...
To be continued.
...
Further reading:
Eric Evangelista. A Model-Theoretic Approach to Herbrand's Theorem (see at http://www.sfu.ca/~eevangel/cs/docs/herbrand.pdf).
Michael Genesereth. Herbrand Method (see at http://logic.stanford.edu/classes/cs157/2003/lectures/lecture07.pdf).
If we are interested only in deriving contradictions from inconsistent sets of formulas, then we can note that a set of closed predicate formulas is inconsistent (i.e. allows deriving a contradiction in the classical logic), iff the conjunction of these formulas is unsatisfiable (Exercise 4.3.6). Thus, instead of the initial set, we can analyze the set of clause forms of these formulas. Indeed, if we derive a contradiction from the set of clause forms, then this set is unsatisfiable, i.e., by Theorem 5.4.2, so is the initial set, and hence, the initial set is inconsistent. And conversely, if the initial set of formulas is consistent, then it is satisfiable, i.e. so is the set of clause forms, i.e. we will not be able to derive a contradiction from it.
The next step forward - in clause forms, we can drop all the universal quantifiers. Indeed, if we derive a contradiction from a set of universally quantified clause forms, then we can derive it from the corresponding non-quantified set (we can apply the Gen inference rule F(x) |- AxF(x) to obtain the quantified forms from the non-quantified ones). And conversely, if we derive a contradiction from a set of non-quantified clause forms, then we can derive it from the corresponding universally quantified set (apply the Axiom L12: AxF(x) -> F(t) to obtain non-quantified forms from the quantified ones).
After dropping quantifiers, sets of clause forms become simply sets of clauses (conjunction of conjunctions is equivalent to a "joint" conjunction).
Thus, we can concentrate on sets of clauses that do not contain quantifiers, like the one obtained in Section 5.4:
-> g(u)>u,
-> g(u)>1,
y>1, z>1, g(u)=y*z ->.
Note that clauses consist of atomic formulas only.
Another step forward - in clause forms, we can rename variables in such a way that two clauses do not contain common variables. For example, we can replace the above set of clauses by the following one:
-> g(u1)>u1,
-> g(u2)>1,
y>1, z>1, g(u3)=y*z ->.
Indeed,
Exercise 5.7.1. Return to Section 3.3 and verify that, in the minimal logic,
[L1-L9, MP]: |- Ax(B(x)&C(x)) <-> Ax1Ax2[B(x1)&C(x2)].
What does it mean?
[L1-L9, MP]: |- Ax(B(x)&C(x)&D(x)) <-> Ax1Ax2Ax3[B(x1)&C(x2)&D(x3)]. Etc.
At first glance, this step may seem "redundant". Still, note that, in fact, it allows complete separation of clauses "by the meaning", and this separation will greatly simplify processing of clauses by means of substitution (see below).
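Renaming apart is easy to mechanize. A minimal Python sketch (our names): the variables of each clause are replaced by fresh ones drawn from a global counter, so that no two clauses ever share a variable.

from itertools import count

fresh = count(1)

def rename_apart(variables):
    # Return a renaming substitution (a dict var -> fresh var) for the
    # set of variables occurring in one clause. Here all variables are
    # renamed, which is safe, although renaming only the shared ones
    # (as in the example above) would also do.
    return {x: 'u%d' % next(fresh) for x in sorted(variables)}

print(rename_apart({'u'}))            # {'u': 'u1'}
print(rename_apart({'u'}))            # {'u': 'u2'}
print(rename_apart({'u', 'y', 'z'}))  # {'u': 'u3', 'y': 'u4', 'z': 'u5'}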
Will Robinson's resolution rule remain a universal tool for deriving contradictions also from inconsistent sets of predicate formulas (i.e. sets of non-quantified clauses, consisting of atomic formulas)?
Let us imagine, we have derived the following two formulas (p is a unary predicate letter, 0 - a constant letter):
p(x1) v F(x1, y1), ~p(0) v G(x2, y2).
To apply Robinson's resolution rule, we must first, in p(x1), substitute 0 for x1:
p(0) v F(0, y1), ~p(0) v G(x2, y2).
Now, we can apply the resolution rule, obtaining the formula
F(0, y1) v G(x2, y2).
Surprisingly, this simple idea of "unification by substitution" appears to be sufficient to make Robinson's resolution rule a universal tool for deriving contradictions also from inconsistent sets of predicate formulas! And, in general, the necessary substitutions are not much more complicated than in the above simplest example.
Theorem 5.7.1 (J.A.Robinson). In the classical predicate logic [L1-L11, L12-L15, MP, Gen], a set of predicate clauses is inconsistent, iff Robinson's resolution rule (together with permutation, reduction and substitution rules) allows deriving a contradiction from it.
Note. In some other texts, this fact is called "the refutation-completeness of the resolution rule".
Proof. 1. All the formulas, derived from a set of clauses K1, K2, ... , Ks by using permutation, reduction, substitution and resolution rules, are consequences of K1, K2, ... , Ks. Hence, if these rules allow deriving a contradiction from this set of clauses, then it (the set) is inconsistent.
2. Now, let us assume that the set of clauses S = {K1, K2, ... , Ks} is inconsistent. Then it is unsatisfiable (Exercise 4.3.6). And then, by Herbrand's Theorem, it allows a finite unsatisfiable set of ground clauses C1, ..., Cn. Each Ci of these ground clauses is obtained from some clause in S by means of some substitution subi (of terms from the Herbrand universe HUS), i.e. by applying the substitution rule.
By Lemma 5.6.5, the set C1, ..., Cn is unsatisfiable, iff the conjunction C1& ... &Cn is unsatisfiable under the classical truth tables, i.e., iff the set C1, ..., Cn is inconsistent. And, by Theorem 5.5.1, a finite set of propositional clauses is inconsistent, iff Robinson's resolution rule (together with permutation and reduction rules) allows deriving a contradiction from it.
Q.E.D.
Refinements - Step 1 (First of the Two Smart Ideas)
Let us examine once more part two of the proof of Theorem 5.7.1, where a specific (hopeless!) "proof strategy" is used.
First, since no two clauses Ki contain common variables, we can think that each of the substitutions subj is applied to a single clause, i.e. we can think, in fact, of a (finite) set of substitutions subij, where each subij is applied only to the clause Ki. Let us denote by F.sub the result of application of the substitution sub to the formula F.
Second, to derive a contradiction from {K1, K2, ... , Ks}, we may apply, first, all the necessary substitutions (stage 1 - substitutions only!), and, after this, all the necessary permutations, reductions and resolutions (stage 2 - no more substitutions!). This is exactly the above-mentioned specific (hopeless!) "proof strategy". Why hopeless? Because, before applying the substitutions subij, we must find them among all the possible substitutions of terms from the infinite set HUS. This is a performance problem that does not affect our above theoretical considerations, but could make their result useless. The smart ideas #1 and #2, introduced below, allow us to restrict the substitution search area considerably.
Imagine one of the resolutions of stage 2, where C1 is an atomic formula:
F1vC1, ~C1vG1
----------------,
F1vG1
If both premises F1vC1, ~C1vG1 are coming directly from stage 1, then they have been obtained from some initial clauses FvC, ~DvG by two substitutions sub1 and sub2 such that:
F1 is F.sub1, C1 is C.sub1, ~C1 is ~D.sub2, G1 is G.sub2.
We can call such a pair of substitutions a unifier, because C.sub1 and D.sub2 represent the same atomic formula (compare the example before the text of Theorem 5.7.1).
If one (or both) of the premises does not come directly from stage 1, then it is either an initial clause, or the result of a previous resolution. By putting an empty substitution (which does not change formulas) instead of sub1 or sub2 (or both), we can still think of the premises as obtained by a unification.
And, finally, if, to derive a contradiction B, ~B from K1, K2, ... , Ks, we do not need resolution at all, then we need, nevertheless, unifying substitutions, converting two clauses B1 and ~B2 into B and ~B.
Thus (smart idea #1), to derive contradictions, we can do with one specific kind of the substitution rule - the unification rule:
a) Take two clauses, mark a positive atom C in the first clause, and a negative atom ~D in the second one. Thus, we are considering two clauses: FvC and ~DvG.
b) Try to find two substitutions sub1 and sub2 such that C.sub1 and D.sub2 represent the same atom C1 (you do not need to introduce the variables of the other clause!). If you succeed, you have obtained two clauses: F1vC1, ~C1vG1, where C1 is C.sub1 (=D.sub2), F1 is F.sub1 and G1 is G.sub2. Since clauses do not contain common variables, the union sub1Usub2 is a substitution (a unifier of C and D).
c) Apply resolution, obtaining the clause F1vG1.
We have proved the following refined version of Theorem 5.7.1:
Theorem 5.7.2 (J.A.Robinson). In the classical predicate logic [L1-L11, L12-L15, MP, Gen], a set of predicate clauses is inconsistent, iff Robinson's resolution rule (together with permutation, reduction and unification rules) allows deriving a contradiction from it.
Why is this refinement important? Because now, instead of trying out all the possible substitutions (of terms from HUS for clause variables), we can concentrate on substitutions that unify two clauses. This allows us to restrict the substitution search area considerably.
Refinements - Step 2 (Second of the Two Smart Ideas)
Substitution "Algebra"
In general, each substitution involves a list of distinct variables x1, ..., xk and a list of terms t1, ..., tk. All occurrences of the variable xi are replaced by the term ti. Thus, this operation can be most naturally represented by the set of pairs { t1 / x1, ..., tk / xk }. The order of the pairs ti / xi is irrelevant because of the following "anti-cascading" condition: the new occurrences of the variables x1, ..., xk created by the substitution are not replaced. The result of application of a substitution sub to an expression (term or formula) F is usually denoted by F.sub.
For example, if F is p(x, f(y)) and sub = { f(z) / x, z / y }, then F.sub is p(f(z), f(z)).
The empty set of pairs {} represents the so-called empty substitution. Of course, F.{} = F, for any expression F.
If the variable sets of two substitutions sub1 and sub2 do not intersect, and the terms of sub1 do not contain the variables of sub2, and the terms of sub2 do not contain the variables of sub1, then the union sub1Usub2 (of two sets of pairs) defines a substitution.
Still, the most important operation on substitutions is composition. If sub1 and sub2 are two substitutions, then sub1.sub2 denotes the composed substitution "apply first sub1, and after this, apply sub2". For example, if sub1 = { f(z) / x, z / y } and sub2 = { f(w) / z }, then
sub1.sub2 = { f(f(w)) / x, f(w) / y, f(w) / z }.
Exercise 5.7.2. a) Verify that the substitution composition is associative and non-commutative (provide a counter-example), and that the empty substitution is the only "unit element" (i.e. {}.sub = sub.{} = sub for any substitution sub). b) Is there any algebraic correlation between composition and union of substitutions?
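For experimenting with Exercise 5.7.2, here is a minimal Python sketch of the substitution "algebra" (the representation is ours: a variable is a plain string, a compound term f(t1, ..., tn) is the tuple ('f', t1, ..., tn), and a substitution is a dict mapping variables to terms):

def apply(term, sub):
    # Apply the substitution sub to a term. The "anti-cascading" condition
    # holds automatically, because replacement is done in a single pass.
    if isinstance(term, str):                        # a variable
        return sub.get(term, term)
    return (term[0],) + tuple(apply(a, sub) for a in term[1:])

def compose(sub1, sub2):
    # The composition sub1.sub2: apply sub1 first, then sub2.
    result = {x: apply(t, sub2) for x, t in sub1.items()}
    for x, t in sub2.items():
        result.setdefault(x, t)
    return result

The example above, sub1 = { f(z) / x, z / y }, sub2 = { f(w) / z }:

sub1 = {'x': ('f', 'z'), 'y': 'z'}
sub2 = {'z': ('f', 'w')}
print(compose(sub1, sub2))
# {'x': ('f', ('f', 'w')), 'y': ('f', 'w'), 'z': ('f', 'w')}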
Most General Unifiers
How do unifiers behave in the substitution "algebra"? Assume sub1 and sub2 are two different unifiers of the same pair of expressions F and G, i.e.
F.sub1 = G.sub1, F.sub2 = G.sub2.
If there is a substitution sub such that sub2=sub1.sub, then we say that sub1 is a no less general unifier than sub2. For example, let us try to unify the first members of the following two formulas:
p(x1) v F(x1, y1), ~p(f(x2)) v G(x2, y2).
It would be natural to use the substitution sub1 = { f(z) / x1, z / x2 }, obtaining
p(f(z)) v F(f(z), y1), ~p(f(z)) v G(z, y2).
But, in principle, one could use also the substitution sub2 = { f(f(z)) / x1, f(z) / x2 }, obtaining
p(f(f(z))) v F(f(f(z)), y1), ~p(f(f(z))) v G(f(z), y2).
Of course, sub1 is "better", because sub2 = sub1.{ f(z) / z }. Why? If our purpose was unifying p(x1) with p(f(x2)), then sub1 performs this (as well as sub2), but it "leaves more space" for subsequent substitutions (than sub2). Indeed, to continue after sub1, instead of sub2 = sub1.{ f(z) / z }, we can choose also sub3 = sub1.{ g(z) / z } etc. Thus, using a more general unifier is preferable.
So, let us call a unifier sub of two expressions F and G a most general unifier (mgu) of F and G, iff it is no less general than any other unifier of F and G (i.e. iff, for any other unifier sub' of F and G, there is a substitution sub'' such that sub' = sub.sub'').
Lemma 5.7.3. If two expression lists FF and GG are unifiable, then there exists an mgu of FF and GG.
Proof (long, but easy). Let us define the total length of an expression list as follows: a) (atomic expressions) the total length of a constant or of a variable is 1, b) the total length of the expression list e1, ..., en is the sum of the total lengths of the members e1, ..., en, c) (composite expressions) the total length of the expression f(t1, ..., tn) (where f is function letter or predicate letter), is the total length of the expression list t1, ..., tn plus 1.
Let us prove our Lemma by induction using min(total_length(FF), total_length(GG)) as the induction parameter.
1) Induction base. The total length of FF or GG is 1. Let us assume total_length(FF)=1.
a) FF is a constant c. Then FF and GG are unifiable, iff GG is the same constant c. Then, the empty substitution is the only possible mgu (verify).
b) FF is a variable x. Then, FF and GG are not unifiable, if: b1) GG is a list of more than one expression, or, b2) GG is a composite expression that contains x (then any substitution of t for x makes GG longer than t). And, FF and GG are unifiable, iff GG is a variable, or GG is a composite expression that does not contain x.
If GG is the variable x, then the empty substitution is the only possible mgu (verify).
If GG is a variable y (other than x), then all unifiers of FF and GG have the form { t / x, t / y, ... }, where t is any term. Among them, mgu-s are { z / x, z / y }, where z is any variable (verify).
If GG is a composite expression that does not contain x, then all unifiers of FF and GG have the form { GG.sub / x, ... } U sub, where sub is any substitution that does not substitute for x (verify). Among them, mgu-s are { GG.sub / x } U sub, where sub substitutes distinct variables for distinct variables (verify).
This completes the induction base.
2) Induction step. Assume min(total_length(FF), total_length(GG))=n, where n>1. If FF and GG are unifiable, then, as lists, they contain the same number of members.
2a) FF and GG are single expressions. Since min(total_length(FF), total_length(GG))>1, both are composite expressions - suppose, FF is f(s1, ..., sm) (where f is a function letter or a predicate letter, and s1, ..., sm are terms), and GG is g(t1, ..., tn) (where g is a function letter or a predicate letter, and t1, ..., tn are terms). FF and GG are unifiable, iff a) f and g represent the same letter, and b) the lists s1, ..., sm and t1, ..., tn are unifiable. Thus, the unifiers of FF and GG coincide with the unifiers of the lists. Since min(total_length(s1, ..., sm), total_length(t1, ..., tn))<n, by the induction assumption, Lemma 5.7.3 holds for the lists, i.e. it holds also for FF and GG.
2b) FF and GG contain two or more members. If FF and GG are unifiable, then so are their first members ("heads") F1 and G1. Let us denote by FF2 and GG2 the rests of lists ("tails"). Since min(total_length(F1), total_length(G1))<n, by the induction assumption, there exists at least one mgu of F1 and G1. The same is true also for FF2 and GG2.
Let us denote by mgu1 an arbitrary mgu of F1 and G1.
Now, let us consider an arbitrary unifier u of FF and GG. It must unify also F1 with G1, and FF2 with GG2. Hence, u = mgu1.sub1, where sub1 is some substitution. We know that F1.mgu1 = G1.mgu1.
But what about FF2.mgu1 and GG2.mgu1? Let us apply sub1 to both:
FF2.mgu1.sub1 = FF2.u
GG2.mgu1.sub1 = GG2.u
Since u unifies FF2 with GG2,
FF2.mgu1.sub1 = GG2.mgu1.sub1,
i.e. sub1 unifies FF2.mgu1 with GG2.mgu1. Let us denote by mgu12 an arbitrary mgu of FF2.mgu1 and GG2.mgu1. Then, sub1 = mgu12.sub12, where sub12 is some substitution, and
mgu1.mgu12.sub12 = mgu1.sub1=u.
Thus, we have established that for an arbitrary unifier u of FF and GG there is a substitution sub12 such that mgu1.mgu12.sub12 = u. Of course, the composition mgu1.mgu12 unifies FF with GG (since it unifies F1 with G1, and FF2 with GG2). Hence, mgu1.mgu12 is an mgu of FF and GG.
Q.E.D.
Unification Algorithm
How can we determine whether two atomic formulas C and D can be unified? This problem can be solved by the following simple pseudo-code GetMostGeneralUnifier, which follows the above proof of Lemma 5.7.3, and where expression lists are defined in LISP style:
1) Each variable, constant, function letter and predicate letter is an expression list (consisting of a single member).
2) If s1, ..., sn are expression lists, then the list of s1, ..., sn is an expression list (consisting of members s1, ..., sn). The first member s1 is called the head of the list, and the list of s2, ..., sn - the tail of the list.
Thus, instead of, for example, f(t1, ..., tn), we use simply the (LISP style) list f, t1, ..., tn. This simplifies the recursion interface.
This program detects whether two expression lists are unifiable and, if they are, returns one of their most general unifiers.
function GetMostGeneralUnifier (expression_list1, expression_list2)
begin
  if length(expression_list1) > 1 and length(expression_list2) > 1 then
  begin
    h1 = head(expression_list1);
    h2 = head(expression_list2);
    subH = GetMostGeneralUnifier(h1, h2);
    if subH = false then return false; { unification impossible }
    t1 = tail(expression_list1).subH;
    t2 = tail(expression_list2).subH;
    subT = GetMostGeneralUnifier(t1, t2);
    if subT = false then return false; { unification impossible, note that subH is an mgu of the heads! }
    return subH.subT; { this composition unifies expression_list1 and expression_list2 }
  end
  { now, expression_list1 or expression_list2 consists of a single member: m1 or m2 }
  if length(expression_list1) = 1 and m1 is a variable then
  begin
    if m1 = expression_list2 then return {}; { empty substitution }
    if m1 occurs in expression_list2 then return false; { unification impossible - verify! }
    return { expression_list2 / m1 }; { substitute expression_list2 for m1 }
  end
  if length(expression_list2) = 1 and m2 is a variable then
  begin
    if m2 = expression_list1 then return {}; { empty substitution }
    if m2 occurs in expression_list1 then return false; { unification impossible - verify! }
    return { expression_list1 / m2 }; { substitute expression_list1 for m2 }
  end
  { now, expression_list1 or expression_list2 consists of a single member that is not a variable }
  if expression_list1 = expression_list2 then return {}; { empty substitution }
  return false; { unification impossible - verify! }
end
Exercise 5.7.3. Verify that this program detects whether two expression lists are unifiable and, if they are, returns one of their mgu-s. (Hint: repeat the proof of Lemma 5.7.3.)
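A minimal runnable Python counterpart of GetMostGeneralUnifier, reusing apply and compose from the substitution sketch above (the term representation - tuples with the letter as the head - differs slightly from the LISP-style lists, but follows the same recursion):

def occurs(x, term):
    # Does the variable x occur in the term?
    if isinstance(term, str):
        return term == x
    return any(occurs(x, a) for a in term[1:])

def unify(t1, t2, sub=None):
    # Return an mgu of t1 and t2 extending sub, or None if unification
    # is impossible.
    sub = {} if sub is None else sub
    t1, t2 = apply(t1, sub), apply(t2, sub)
    if t1 == t2:
        return sub                       # covers the empty substitution cases
    if isinstance(t1, str):              # t1 is a variable
        return None if occurs(t1, t2) else compose(sub, {t1: t2})
    if isinstance(t2, str):              # t2 is a variable
        return None if occurs(t2, t1) else compose(sub, {t2: t1})
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None                      # different letters or arities
    for a, b in zip(t1[1:], t2[1:]):     # unify the heads, then the tails
        sub = unify(a, b, sub)
        if sub is None:
            return None
    return sub

Unifying p(x1) with p(f(x2)) from the example above: unify(('p', 'x1'), ('p', ('f', 'x2'))) returns {'x1': ('f', 'x2')}, i.e. the mgu { f(x2) / x1 }.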
Smart idea #2:
To derive contradictions, we can do with an even more specific kind of the unification rule - the mgu-rule:
a) Take two clauses, mark a positive atom C in the first clause, and a negative atom ~D in the second one. Thus, we are considering two clauses: FvC and ~DvG.
b) Try to find any mgu of C and D. If you succeed, you have obtained two clauses: F.mgu v C1, ~C1 v G.mgu, where C1 is C.mgu (=D.mgu).
c) Apply resolution, obtaining the clause F.mgu v G.mgu.
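A minimal Python sketch of steps a)-c), built on unify and apply from the sketches above (the representation and names are ours; atoms are terms whose head is a predicate letter, and a clause is a pair (negatives, positives) of frozensets of atoms):

def mgu_resolve(clause1, c_atom, clause2, d_atom):
    # Resolve FvC with ~DvG on the marked atoms: find an mgu of C and D
    # and return (F v G).mgu, or None when C and D are not unifiable.
    # c_atom is positive in clause1, d_atom is negative in clause2.
    mgu = unify(c_atom, d_atom)
    if mgu is None:
        return None
    n1, p1 = clause1
    n2, p2 = clause2
    neg = frozenset(apply(a, mgu) for a in n1 | (n2 - {d_atom}))
    pos = frozenset(apply(a, mgu) for a in (p1 - {c_atom}) | p2)
    return (neg, pos)   # the set unions perform the reductions automatically

For the example before Theorem 5.7.1, unifying p(x1) with p(0) applies the mgu { 0 / x1 } and yields F(0, y1) v G(x2, y2).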
Theorem 5.7.4 (J.A.Robinson). In the classical predicate logic [L1-L11, L12-L15, MP, Gen], a set of predicate clauses is inconsistent, iff Robinson's resolution rule (together with permutation, reduction and mgu-rules) allows deriving a contradiction from it.
Why is this (second!) refinement important? Because now, instead of trying out all the possible unifications, we can concentrate on mgu-s that unify two clauses. This allows us to restrict the substitution search area even more (when comparing with Theorem 5.7.2).
The hard part of the proof is inventing the following
Lemma 5.7.5. Any proof K1, K2, ... , Ks |- K (all K-s are clauses), where only permutation, reduction, substitution and resolution rules are used, can be converted into a proof K1, K2, ... , Ks |- K' such that: a) in the proof, only permutation, reduction, mgu and resolution rules are used; b) K can be obtained from K' by a single (possibly empty) substitution, followed by a chain of permutations and reductions.
Proof of Theorem 5.7.4. Assume the set of clauses K1, K2, ... , Ks is inconsistent. Then, by Theorem 5.7.1, there are two proofs K1, K2, ... , Ks |- B, K1, K2, ... , Ks |- ~B, where only permutation, reduction, substitution and resolution rules are used. From clauses, these rules allow deriving only of clauses. Hence, B is an atomic formula.
By Lemma 5.7.5, both proofs can be converted into proofs K1, K2, ... , Ks |- B1, K1, K2, ... , Ks |- ~B2 such that: a) in the proofs, only permutation, reduction, mgu and resolution rules are used; b1) B can be obtained from B1 by a single (possibly empty) substitution (permutations and reductions do not apply to atomic formulas), b2) B can be obtained from B2 by a single (possibly empty) substitution.
Thus, B1 and B2 are unifiable. Let us take their mgu, and apply it. As the result, we obtain a contradiction B', ~B', where B' is B1.mgu (= B2.mgu). And we have obtained this contradiction from the clauses K1, K2, ... , Ks by using only permutation, reduction, mgu and resolution rules. Q.E.D.
Proof of Lemma 5.7.5. Induction by the "height of the resolution tree" (see below).
1. Induction base - no resolutions applied in the proof K1, K2, ... , Ks |- K. Then K is obtained from some Ki by a chain of permutations, reductions and substitutions. Add to this fact an "empty" proof K1, K2, ... , Ks |- Ki. And let us compose all the substitutions into a single substitution. Q.E.D.
2. Induction step. Assume we have the proof K1, K2, ... , Ks |- K, containing at least one resolution. Imagine the last resolution in this proof (C is an atomic formula):
FvC, ~CvG
----------------.
FvG
Then K is obtained from the formula FvG by a chain of permutations, reductions and substitutions.
The proofs of the formulas FvC, ~CvG possess a "height of the resolution tree" less than the one of the proof K1, K2, ... , Ks |- K. Thus, by induction assumption, we can convert these proofs into permutation-reduction-mgu-resolution proofs of some formulas F1vC1vF2 and G1v ~C2vG2 such that:
a) FvC can be obtained from F1vC1vF2 by a single (possibly empty) substitution sub1, followed by a chain of permutations and reductions. Under sub1, the atomic formula C1 is converted into C.
b) ~CvG can be obtained from G1v ~C2vG2 by a single (possibly empty) substitution sub2, followed by a chain of permutations and reductions. Under sub2, the atomic formula C2 is converted into C.
Since the clauses F1vC1vF2 and G1v ~C2vG2 do not contain common variables, the substitutions sub1 and sub2 do not intersect, hence, their union sub1Usub2 is a substitution sub (a unifier of C1 and C2) such that:
a1) F can be obtained from (F1vF2).sub by a chain of permutations and reductions.
b1) G can be obtained from (G1vG2).sub by a chain of permutations and reductions.
As we know from the above, the atomic formulas C1 and C2 are unifiable. Let us take their mgu, and apply it to the formulas F1vC1vF2 and G1v ~C2vG2. Let us denote by C' the formula C1.mgu (it is equal to C2.mgu). Thus, we have two formulas F1.mgu v C' v F2.mgu and G1.mgu v ~C' v G2.mgu, and, by permutation and resolution, we can obtain the formula (F1vF2).mgu v (G1vG2).mgu.
Thus, for the formula (F1vF2).mgu v (G1vG2).mgu, we have a permutation-reduction-mgu-resolution proof. It remains to show that, from this formula, FvG can be obtained by a single substitution, followed by a chain of permutations and reductions.
Since the substitution sub is a unifier of C1 and C2, then, by the definition of mgu, sub=mgu.sub', where sub' is some substitution. Hence,
a2) F can be obtained from (F1vF2).mgu by the substitution sub', followed by a chain of permutations and reductions.
b2) G can be obtained from (G1vG2).mgu by the substitution sub', followed by a chain of permutations and reductions.
Thus, FvG can be obtained from (F1vF2).mgu v (G1vG2).mgu by the substitution sub', followed by a chain of permutations and reductions. Q.E.D.
...
To be continued.
Further reading:
Resolution by Giorgio Ingargiola
Rajjan Shinghal. Formal Concepts in Artificial Intelligence. Fundamentals. Chapman&Hall, 1992, 666 pp.
Handbook of Automated Reasoning, ed. by A.Robinson and A.Voronkov, Elsevier Science, 2001, vol. I, II