Goedel’s Incompleteness Theorem and the Emergence of AI

Artificial Intelligence Pioneer, AI Advisor, Key-Note Speaker, Author


In 1931 at the age of just 25 years, the young Austrian mathematician Kurt Goedel proved an astonishing mathematical theorem that made him instantly famous and a celebrity in mathematical circles around the world (see picture; in the Anglo-Saxon literature Goedel is usually referred to as “Godel” skipping the Umlaut in his German name).

Despite its very abstract nature and the lack of any every day practical use, Goedel’s theorem – the so called Incompleteness Theorem – has had a dramatic and deep impact on mathematics itself and its foundations. It also had a substantial impact on the philosophy of the 20th century and our understanding of the general limitations of computers and algorithms.

I will explain here how Goedel’s theorem actually caused the emergence of the new science of Artificial Intelligence (AI) and theoretical computer science in the 1940s and 1950s and how it motivated such key AI pioneers like Alan Turing to get involved. The birth of AI and the course AI has taken ever since can only be appreciated and understood well when one takes Goedel’s theorem and its impact into account. Goedel also made many other breakthrough discoveries even in Einstein’s General Relativity Theory where he produced a solution to Einstein’s differential equations which allows time travel – at least in theory (see picture above; Einstein and Goedel became close friends during their time together at Princeton).

Some leading scientists like Sir Roger Penrose even argue that Goedel showed with his Incompleteness Theorem that today’s computers can never reach human level intelligence or consciousness, that humans will always be smarter than current computers or any computer algorithm can ever be and that computers will never in the true sense of the word “understand” anything like higher level mathematics, especially not mathematics that deals with trans-finite sets and numbers.

What Goedel’s theorem actually says in detail and how it impacts AI even today will become clear and easier to understand after we have had a quick look into the history of mathematics and logic and how and why we got to the point of Goedel’s involvement and theorem.

Georg Cantor – A Universe of Infinities

As a consequence and by-product of the industrial revolution in the 19th century, theoretical physics and mathematics had a kid of “cumbrian explosion” period in the second half and especially in the last quarter of the 19th century. Many famous mathematicians (and physicists) created fascinating new theories and discovered deep and far reaching mathematical results. Among them was the genius German mathematician Georg Cantor.

Cantor (pic on left) invented the so called “set theory” which later on was shown to be able to build a basis and foundation for all branches of mathematics. Set theory is now still taught even in primary schools all over the world to teach children the basic concepts of abstract mathematics. However, Cantor was not considered a genius at first (that only happened about 30 years later) but he was rather strongly attacked by his peers at the time in the 1870s and 1880s.

This was because Cantor single handedly extended our concept of finite sets to infinite sets and far beyond. He famously discovered that once one allows infinite sets to exist as mathematical objects, one can create a whole universe of more infinite sets. He discovered especially that infinity is not the same as infinity. There can be many different kinds of infinities. That was a very surprising result for the mathematicians of his time (and still is for most people today).

He proved for example that the natural numbers “N”: 1, 2, 3, 4, 5, 6, 7… and so on, when taken as a complete set, is the smallest kind of infinite sets possible. This set N has some strange properties that are not possible for finite sets, for example: N has exactly as many members as just half of the set N ! This is easy to see. Take only the subset of even numbers from N: 2, 4, 6, 8, 10…and so on and call this set N2. Then one can put the numbers of N into a 1-1 matching relation with the members of N2 like this : 1 -> 2, 2 -> 4, 3 -> 6, 4 -> 8 and so on, mapping every number n from N to 2n from N2. Therefore, every number of N will be mapped exactly to one number of N2 and vice versa, so both sets have exactly the same number of members. As a matter of fact, that is true to any infinite subset of N, not just N2 !

This feature of N can actually be used to define infinity. A set is infinite if it can be shown that there is a 1-1 mapping between itself and a proper subset of itself. Cantor introduced the intuitive concept of 1-1 mappings for determining and comparing the size of infinite sets (called the cardinality of a set).

This is just a trivial example. Cantor proved much more interesting and surprising results. For example he proved that the rational numbers (the set of all quotients of natural numbers like 3/4, 9/7 or 145/11) can still be counted with the natural numbers N and hence are just as many as N (see pic on left that shows how the rational numbers can be counted). He also proved, that the set of irrational numbers, these are the numbers that cannot be expressed as fractions of natural numbers (like Pi, e or the square root of 2), cannot be counted with the natural numbers. So, the set of irrationals numbers (and hence also the set of real numbers R) has more elements (a higher cardinality) than N and hence cannot be counted.

He proved this by using his famous “diagonal” construction (see pic below) that showed that any supposedly complete enumerated list of irrational or real numbers R will always miss some irrational numbers, thereby proving that a complete enumeration of the real numbers by the natural numbers is impossible. This is somehow surprising as intuitively there seem to be less irrational numbers than rational ones as they are more cumbersome to generate.

The uncountability of the irrational numbers also means, that there are in fact at least two different kinds of infinities: the countable and the uncountable. Cantor has actually shown that there are even an infinite number of ever bigger infinities by showing that the set of all subsets of any given infinite set is always substantially bigger (cannot be put into a 1-1 relation) than the set itself.

These results were all quite disturbing and counter intuitive at first and made his colleague mathematicians feel very uncomfortable about this new realm of mathematics. Cantor had no methods that would construct infinite sets with finite methods that could at least in principle be verified by the existing methods of the time. Never in the history of math had infinities as completed objects been studied seriously. But Cantor had introduced in one stroke of genius a whole new mathematical universe of different infinities.

That was all hard to swallow by his peers even though they realised and had to admit that the new realm of infinitely many infinite sets was fascinating and a beautiful construct with astonishing new mathematical laws governing them. They seemed logically consistent and an extension of finite mathematics into the trans-finite world (although nobody had any idea how this could ever be applied to the real world we live in).

As a religious man, Cantor did not make it easier for his struggling peers to accept his new infinities, new methods, new types of proofs and concepts by claiming that God himself gave him the intuitions for his set theoretic work and the discovery of the world of infinite sets. This “justification” of Cantor for his new set theory and the trans-finite sets was just too much for lots of his peers and unacceptable. A more solid and objective justification was urgently needed. Since there are no finite methods to construct and to verify properties of trans-finite sets, how could one make sure that these sets their properties do not lead to inconsistencies ?

Gottlob Frege – the Godfather of modern Logic and the Philosophy of Language

The fight and arguments between Cantor and his mathematical peers caught the attention of another genius German mathematician named Gottlob Frege (see pic below). Frege was excited on one side about the major progress mathematics had made in many areas in his time but was also appalled by the obvious lack of rigour and exactness and the absence of a solid scientific foundation of mathematics at the time. The creativity the mathematicians had shown in many areas was not matched by the required rigour that should be associated with such an important and fundamental science as mathematics. After all, it was mathematics that has introduced the concept of “proof” and rigour to science since the days of the ancient Greeks.

He was most concerned about the concept that mathematics was something that was created and guided by some person’s “intuitions” like Cantor claimed or other mental activities. His stance was very strongly platonistic and pointing in the opposite direction. For him mathematical truths had to be objective and mathematical statements had to be true or false independent of a mathematicians mental activities or intuitions.

To substantiate his stance, Frege developed his own philosophy of language and mathematics. He wrote a series of brilliant and influential philosophical articles about language and mathematics and especially about the meaning and sense of words, predicates, sentences and concepts that are still considered classics and must-reads today in philosophy classes. His philosophy of language impressed and strongly influenced even such great philosophers like Ludwig Wittgenstein and Betrand Russell and made Frege the father of modern philosophy of language which was especially popular in the Anglo-Saxon world in the 20th century.

Frege realised that there were two major problems in mathematics responsible for all the arguments. First, there was no concise and precise scientific language to describe complex mathematical objects and proofs. Without a precise language there could not be any precise proofs or mathematical concepts. Mathematicians in his days often mixed up undefined and vague expressions from every day natural language with mathematical language and complex mathematical concepts. This in his eyes was a major cause of confusion and errors in mathematical statements and proofs.

Secondly, he realised that mathematics needed a more profound basis upon which one could build all the new concepts and even the trans-finite objects of Cantor. Frege identified logic as this more basic foundation. After all, logic was in a sense much simpler than mathematics and only had a few clear rules and concepts.

He believed therefore that logic was easier to understand and to agree upon and more fundamental than mathematics. In addition, he believed that all mathematical concepts and methods could actually be derived from more basic logical concepts and methods. Whenever mathematicians need to prove a mathematical statement, they need to follow valid logical rules and principles, otherwise the proof could and often would be flawed.

However, the known logic of Frege’s time was trivial and of no much use and consisted essentially still of the thousands of years old syllogisms of the Greek philosophers. Their logic was not sufficient and not expressive enough to handle the complex constructs and proofs of modern science and mathematics.

So, Frege set out to solve these two problems and spent the rest of his academic life and several decades developing a more refined and more powerful logic and a precise scientific language able to describe and express all of modern mathematics.

For this Frege created a completely new 2-dimensional symbolic language (see picture on the right) which he called “Begriffsschrift” in German, or “Concept Language” in English.

He then showed how set theory and arithmetics, the theory of the natural numbers (which lies at the core of pretty much all of mathematics), can be described and formulated using only his new 2-dimensional scientific language combined with his new extensions of classical logic.

He worked several decades on this big task, mostly alone in complete solitude with no other mathematicians co-operating. As a matter of fact, he had major difficulties to publish his book with this Begriffsschrift as it was very expensive at his time to print and publish such a unique 2-dimensional script. He only got it published at the end by paying for the print himself. Despite all his efforts, Frege’s 2-D symbolism and scientific language with its unique and unmatched rigour did not survive him.

What did survive him however, were his new ways to extend classical logic and these are all still used in mathematics and especially in mathematical logic everywhere today.

Classical logic as formulated first by the great ancient Greek philosophers was mainly a propositional logic. It was introduced to allow judgments about the correctness and validity of philosophical arguments and mainly used to define the rules on how to correctly derive propositions from other propositions. One could formulate simple rules like: If A is true and if B is true, then so is “A and B”. Or conclusions like: If A is true and if “A implies B” is true, then B is true. Classical logic also allowed simple judgments about subsumptions like: If all X are Y and if all Y are Z, then all X are Z (for example: if all Greeks are Europeans and all Europeans are humans, then all Greeks are humans as well).

Frege saw that these simple logical rules and laws needed to be extended and refined to formulate even very simple mathematical constructs (like “2+2=4”) and objects like mathematical functions. To solve this problem, he introduced the concept of logical variables (usually denoted as x, y, z) and logical predicates as properties of these variables (usually denoted by P(x), Q(y) or R(x,z)). He gave a lot of thought to the question what would be the meaning and semantics of such variables and predicates, how could they be interpreted correctly (which became the central questions of his philosophy of language). Finally he defined the semantics and meaning of logical predicates as Cantor like sets, as their “extensions” as Frege would say and the meaning of the variables as potential members of such sets.

An extension of a predicate (or its meaning) is therefore simply the set of all objects that fulfil the predicate. For example, he could define a predicate M with one variable x written as M(x). M(x) is true if x fulfils the predicate M. Let’s say M is the predicate “being male”. Then for any x, M(x) is true if and only if x is male (and false if x is female).

If a predicate had more than one variable, it was called a relation. For example the predicate B defined as “bigger than” has two variables x and y with B(x,y) being true if and only if x is bigger than y. So, if B(x,y) is true, then B(y,x) must be false. He also defined the semantics of whole sentences. A well formed mathematical statement must be either true or false (principle of the excluded third).

The meaning of a sentence for Frege is simply its truth value, true or false. To differentiate between the meaning of an expression (what it denotes and stands for) and the more refined aspects of how we understand an expression, he used the term “sense”. So, the meaning of “4” and “2+2” is the same object, but the sense of the two expressions are different. That us why for Frege equations like “a=a” and “4=2+2” have the same meaning, but very different senses and cognitive values. “a=a” is a true statement for any a and does not tell us anything beyond that fact. However, “4=2+2”, as trivial as it seems to be, tells us something about the results of applying the “+” operation to some numbers. “a=a” is a pure tautological statement which is universally true, whereas “4=2+2” is a statement of arithmetics that needs to be proven to be true.

With these new logical concepts of variables and predicates Frege had introduced the equivalent of functions in mathematics to logic using a similar notation as we still use today like when we write “f(x)” to denote the value of a function when applied to a variable x.

His other major innovations were notations for so called universal and existential quantifiers that connected the logical variables and predicates to formulate more intricate nested statements than just the simple propositions of the Greek syllogisms. The quantifiers would allow statements like “for all x with the property P” or “if there exists any x with property P” (see the picture above that shows the currently used one- dimensional logical notations as compared to Frege’s original Begriffsschrift notations). In this extended notation of logic with the new concepts of predicates, logical variables and quantifiers any statement of mathematics can be formulated even today like: “For all x there exists one y, so that: if x /=0 then x*y=1”.

Bertrand Russel – The set of all sets that are not members of themselves

Frege was quite proud of his Begriffsschrift and its power and rigour so he finally published it in 1879. He then worked on formulating and deriving the basic statements of the arithmetic of natural numbers from only logical concepts and principles in his Begriffsschrift. He published his results in a second book about the foundations of arithmetic in 1893 and everything seemed to run well for him and he slowly got recognised as a leader in his field.

However, just about 10 years later in 1902, shortly before he could finish and publish the second volume of his foundations of arithmetic, he received a short notice on a simple postcard (for the younger readers, postcards were the paper equivalents of sms or chat messages today) from the young mathematician Bertrand Russell (see pic below). Russell, a gifted mathematician and logician and later famous philosopher and Nobel Prize winning writer, alerted Frege that he constructed a contradictory set within Frege’s framework of the foundations of arithmetic!

Frege was completely shocked. However, he saw and admitted immediately that Russell was correct. Russell’s short postcard message destroyed Frege’s life’s goal and work of nearly 30 years and after it became known caused a major crisis for all of mathematics in the first half of the 20th century !

What had happened that a simple and short postcard message could have such a dramatic effect ?

The irony was that Russell admired Frege for the great work he had done to extend classical logic and his Begriffsschrift and clarification he had brought to mathematics and the philosophy of language. While studying Frege’s Begriffsschrift and working on his own opus “Principia Mathematica“, however Russell had found a simple to construct logical contradiction within Frege’s framework.

Russell constructed a set which we will just call “R”. Let R be defined as: the set of all sets, that do not contain themselves as elements.

Even though sounding a bit odd, it’s a legitimate set and easy to define in Frege’s Begriffsschrift. There are plenty of examples of sets that are members of themselves and sets that are not members of themselves. For example: let X be the set of all things that are cars. Then obviously X is not a member of itself because X is a set and not a car. However, if we define Y as the set of all things that are not cars, then Y is a member of itself as it is not a car !

When one now asks the question: “is R an element of itself or not ?”, then trouble occurs. Let’s first assume that R is indeed a member of itself. Then it follows that R cannot be a member of itself as R only contains sets as members, that are NOT members of themselves ! However, now, if one then assumes the opposite that R is not a member of itself, then simply by definition R must belong to itself as it is defined as just the set of all sets that are not members of themselves ! So, whatever we assume, we come up with the opposite and hence a contradiction.

Such contradictions or paradoxes, as they are often called as well, were nothing new. The Greek philosophers knew some of them already like the Cretan liar paradox (when a Cretan says “all Cretans are liars” can he be right ?). Russell himself gave an example in plain English with no recourse to mathematics called the barber paradox. Assume there is a village with lots of men that need to be shaved regularly. Let’s assume there is only one barber in the village who has a license to shave. Now, according to the license the barber must shave all and only the men that do not shave themselves.

What happens to the barber when he needs a shave himself while sticking exactly to the terms of his license? Is the sentence: “the barber shaves himself” true or false ? If it is true, then the barber shaves himself, but that cannot be true as he can only shave men that do not shave themselves. If it is false, then the barber does not shave himself, but then he must shave himself as his duty is to shave all the men that do not shave themselves ! Similar paradoxes can easily be constructed in natural language but used to be seen as oddities and irrelevant brain teasers only of relevance for philosophers.

However, that is not the case when such contradictions and paradoxes occur in logic or mathematics ! Any contradiction like the one Russell has found has catastrophic consequences for any classical logical and mathematical system and theory. That is because it can easily be shown that if indeed a mathematical theory contains even a single contradiction or allows the proof of even a single false statement, then any other statement can be proven as well, no matter whether it is true or false ! This makes the whole system useless at once, as it cannot differentiate anymore between true and false statements and theorems of the theory.

Frege could not figure out what went wrong here despite all the decades long and hard thinking and progress he had made with his Begriffsschrift. The key goal of all his work was to make sure that something like this could never happen in his framework as otherwise what would be the purpose of all the effort and rigour if it cannot even prevent such simple contradictions to occur !

Russell however, wasn’t so shocked and paralysed as Frege was and went right into trying to solve the problem. He soon figured out what went wrong within Frege’s system – so he believed. He noticed correctly that in Frege’s system any well defined predicate P gave rise to a set of all things x that fulfil the predicate P(x) (Frege’s extension of the predicate). Let’s say P is the predicate “being a prime number” and x any natural number, then P(x) is true if and only if x is a prime number.

Now, the role of x here as a variable to which the predicate is applied is crucial in Russell’s view and needs to be defined exactly and without any ambiguities. If x can take on only numbers as values, then everything is fine and well defined. However, if x can take on also other values of different object type other than numbers then P(x) may not just be false but should be considered as simply undefined or utter nonsense in such cases.

Say x denotes cars as well as numbers, then P(x), with P defining prime numbers, does not make sense when x denotes a car (say a certain BMW) because cars cannot be prime numbers, so P(BMW) is neither true nor false but an inappropriate use of the variable and predicate. The variable x must be considered as part of another logical domain when it denotes cars rather than when x denotes say the number 7.

As a consequence Russell came up with his famous theory of types for variables and his theory of descriptions for predicates that avoided the construction of sets that are members of themselves by defining increasing logical levels of variables and predicates so that predicates could only be applied to variables of certain lower levels but not to themselves. Russell wrote some influential philosophical papers about these topics to justify his approach. He also worked closely with the young Ludwig Wittgenstein on these topics who supported his approach at first, however became a strong critic of Russell’s and similar views of mathematics and logic in his later years.

Russell worked out this theory over about 10 years and published it in his monumental and highly appreciated 3 volume masterpiece book Principia Mathematica between 1910 and 1913 in which he successfully defined and derived the basic laws of arithmetics (similar to Frege) but within his logical theory of types. There have not been found any contradictions within his theory of types and his Principia Mathematica until today.

David Hilbert – Save the Infinite and the Hilbert Program

Russell’s Principa Mathematica was highly regarded among the mathematicians and logicians of his time and a kind of relief for the foundation efforts, but it still did not really solve the key issue of avoiding contradictions. Even if there were no contradictions ever found in the Principia Mathematica that were similar to the one he constructed, that does not give any guarantee that there are no other, different types of contradictions hidden somewhere that could suddenly pop up discovered by some bright mind – just like Russell’s own paradox did. There was no proof that Russell’s theory was free of contradictions itself – and as long as there was no such proof, anything could still happen at any time in the future.

To avoid such a situation once and for all while still keeping Cantor’s universe of trans-finite sets untouched, another influential German mathematician, David Hilbert, stepped up to the plate and got involved. Hilbert was world renowned already at the time as he had first axiomatised classical geometry and build the basis of non Euclidean geometry (the geometry of non flat surfaces – as it is needed and assumed in Einstein’s general relativity theory for example). Hilbert was a highly respected authority and probably the most influential mathematician in the first 30 years of the 20th century.

Hilbert thought that the situation of the lack of a proven consistent foundation of mathematics was an embarrassment for all mathematicians. He wanted to make sure that mathematicians can define and can rely on an agreed finite set of basic axioms (like for set theory) which would allow to proof any true mathematical statement that logically follows from the basic axioms and which would allow a clear and flawless consistency proof for mathematics in general.

He therefore formulated the so called Hilbert program around 1920 demanding to focus on the following from his fellow mathematicians:

1) provide a finite and complete axiomatisation for mathematics (similar as Hilbert did himself for geometry before) or at least for arithmetics which is the basis for many mathematical theories that can generate all true arithmetical mathematical statements;

2) provide a proof that these axioms are consistent with only finite proof methods (to avoid a vicious circle with infinite sets); consistent here means that there is no mathematical statement “P” known to be false (like 0=1) that could be proven

3) provide an effective finite procedure (algorithm) that can decide whether any given mathematical statement P is true or false (this part of the program was called the “Entscheidungsproblem” or “decision problem” which was resolved by Alan Turing some 15 years later – see below). Hilbert added this requirement in 1928.

Hilbert himself devoted most of his mathematical research after the 1920’s to these demands. He created the so called meta-mathematics along the way. This was an ingenious way to approach the above problems: he considered and investigated mathematical axioms, statements and proofs as mathematical objects themselves.

He looked at a mathematical proof for example as if it was just a finite series of symbols starting with some initial symbols (the axioms) and then transforming them only by admissible well defined rules into other symbols in a finite number of steps – thereby disallowing and eliminating any kind of “intuition” or even any semantic interpretation of the meaning of the symbols used.

The symbols and theorems of mathematics would just be like chess pieces on a chess board and the logical rules like the rules for playing chess. Proving the consistency of a mathematical theory with a given set of axioms would then be similar to proving that given a certain initial position of the chess pieces on the chess board, a certain final position (check mate) could not be reached no matter what moves were used and applied to the pieces on the board. Although Hilbert denied seeing his approach like the above chess example, it is still the best way to describe in non mathematical terms his new approach of meta-mathematics.

Many Logics, many Mathematics ?

Hilbert was kind of forced to set up his program, actually. He wanted to keep and defend set theory and all the beautiful infinities created by Cantor. However, Russell’s paradox and similar paradoxes derived from set theory and the trans-finite methods used by Cantor led to a revolt of some key mathematicians of the time (even one of his own most gifted pupils left his camp). The opponents attempted to create a non classical logic and mathematics that would produce results and theorems contradicting the mathematics and logic of the time. Just a decade before the Second World War the mathematicians entered into a period of “cold war” among themselves about the future of mathematics.

Hilbert’s most verbal and prominent critic was the young Dutch algebraic topologist L.E.J. Brouwer. He founded the so called Intuitionismmathematics and philosophy of mathematics. Brouwer believed that any mathematical object (and proof) had to be constructive and finite. He believed that the natural numbers where basic concepts that could not be further explained or reduced and were given via intuition, pretty much like the infinite sets were to Cantor. The only difference between Brouwer’s intuition and Cantor’s was, that Brouwer referred to the intuition of an “ideal mathematician” rather than to God as Cantor did.

But Brouwer did not believe as Cantor in factual infinite sets – only in sets that could be constructed by accepted finite mathematical procedures from the ground up. Brouwer (picture on Dutch post stamp above) was not shy about his views despite his youth. He had already proved some famous theorems in algebraic topology but over time with the development of his intuitionism he distanced himself later even from his own results and proofs because they were not constructive in the intuitionistic sense.

Brouwer denounced some key logical rules and principles that were accepted since thousands of years. He argued that the paradoxes of Russell and set theory come up because they use some logical rules and dogmas that were intended only to be used for finite objects. He considered them no longer valid when applied to infinite sets. His famous attack was on the logical principle and rule of the so called “excluded 3rd”. This rule means that for any properly formulated mathematical statement P, P must be true or false, there is no 3rd option !

Brouwer said this is not true when applied to certain mathematical statements for which there is no constructive proof or disproof known. Hence for such statements A, the classical tautology “A or non A” is not true. If a statement cannot be proven or disproven then we have no basis to claim that this statement is either true or not true – it is undecided that’s the best we can say, but not true or false !

Mathematical truth for Brouwer is therefore much closer to the concept of proof than in classical mathematics where there is a clear distinction between “truth” and “proof”. As a matter of fact, the whole purpose of Hilbert’s meta-mathematics is to analyse the relation between the truth and the proof concept for a given axiomatic system.

Brouwer also assumed that mathematical statements can change their “truth-status” over time. That is unthinkable in classical mathematics. Once a statement is true, it cannot later get false (as long as it has no reference to time in its terms). But for Brouwer, if and when a so far undecided statement becomes proven or its negation becomes proven, then the statement finally is decided and becomes either true or false. As long as no proof has been found however, the truth status of such statement cannot be decided.

Brouwer’s intuitionistic mathematics and logic had some strong impact on math and logic as it was constructive and seemingly consistent. Intuitionism had its high time in the 1920’s and 1930’s but did not really find many followers at the end because the adoption drastically reduced the mathematical and logical methods one could use. That was not in the interest of most mathematicians. Today intuitionist and constructivist methods are seeing some sort of revival due increasing and massive role of computer technology which inherently has to be finite and constructive (but still usually makes use of the excluded 3rd logic principle).

Goedel’s Incompleteness Theorem – The Limits of Math and Logic

Now, we finally get to Goedel and his Incompleteness Theorem. Geodel was no friend of intuitionism. He was a strong believer in Platonism like Frege. For him, doing mathematics was not creating something in ones mind (like Brouwer thought) but rather like discovering something – eternal truths. Mathematical objects existed for him independent of humans. We could grasp mathematical truths with our minds and discover proofs but we do not create these truths or any mathematical objects. Hence, he also had no problem of dealing with completed infinite sets. He wanted to support Hilbert’s program by trying to prove some of the goals Hilbert had previously posted and demanded for mathematics.

However, this unintentionally backfired. Goedel’s Incompleteness Theorem actually proved that Hilbert’s goals were impossible to achieve ! It changed our understanding of axiomatic mathematical and logical systems for ever.

Without going too much into the technical details and exact logical formulations, this is essentially what Goedel’s Incompleteness Theorem says (it’s actually 2 theorems):

1) Incompleteness Theorem 1: Any formal axiomatic mathematical system expressive enough to formalise the axioms and basic properties of the natural numbers (arithmetic) will be incomplete in the sense that there will always be true sentences in this system that cannot be proven within the system (meaning, only with the axioms and rules allowed within the system)

2) Incompleteness Theorem 2: The consistency of any such system cannot be proven within the system itself !

The more important theorem is obviously theorem 2 as it talks about the impossibility of consistency proofs, a result that nobody before Goedel had ever imagined to be true.

However, let’s first quickly discuss theorem 1 and its consequences before we try to understand theorem 2. Theorem 2 is a consequence of theorem 1.

Theorem 1 says that, no matter how well we define our axioms and methods in a mathematical/logical theory, if the systems allows to define and describe the natural numbers and their properties (arithmetics) there will always be unprovable but still true statements that can be formulated in the system. The theorem hence says essentially that in a consistent and decently descriptive formal system of arithmetics the subsets of all possible statements formalisable within the system that define provable and true statements are not identical.

A formal system or theory is said to be complete if a) only true sentences can be derived and b) all sentences that are true in the system can actually also be proven. a) secures consistency of the axioms and the methods that allow deriving new statements from the axioms, this is a must, otherwise the whole system is useless. b) is the real completeness requirement. If b) is not fulfilled then the underlying axiom system does not yield all the true facts and hence would not allow us to have a “complete” picture of all the things the theory describes and all true statements that follow from the axioms.

So, Goedel’s theorem 1 already deals a major blow to Hilbert’s program. It essentially says that we will never find a complete finite set of axioms and rules for mathematics (actually, not even for small parts of mathematics like basic arithmetics) ! However precise we set up an axiom system for mathematics, there will always be some true mathematical statements that this axiom system cannot yield and we cannot prove. In a sense theorem 1 says: there is no solid and complete foundation of mathematics possible.

In hindsight, the consequences of Goedel’s first theorem weren’t so bad and even opened up several new fields of mathematics like: non-standard analysis and arithmetics, proof theory, model theory and higher level set theory. It turns out that many mathematical statements can be found and constructed that are independent of the classical sets of axioms (like the Peano axioms for the natural numbers or the Zermelo-Frenkel axioms of set theory).

A statement P is independent of an axiom set if neither P nor its negation non P can be derived from the axioms. Important statements of set theory like the axiom of choice or the continuum hypotheses are independent of the classical set theoretical axioms (also proven by Goedel later).

This gives rise to competing axiomatic mathematical theories with different sets of provable theorems – or one could simply say: different mathematics. This is because if an independent statement P (axiom) is found, then P (or non P for that matter) can be added to the basic axioms without causing any contradictions, therefore allowing different axiomatic systems for math.

That was already a big shock to Hilbert and the whole mathematical world. But Goedel’s theorem 2 made it even much worse. Theorem 2 is a special case of theorem 1 and says that an absolute consistency proof for mathematics is impossible. Never in the history of science has a more profound statement been made – and even been proven!

Theorem 2 came as a big surprise because it seemed like mathematics stood on solid grounds again after Russell had published his Principia and no further inconsistencies had been found in his system even until today. Russell and the whole world thought that with the construction of logical types of variables and predicates inconsistencies were made impossible and a thing of the past. But Russell – like Frege before him – had overseen an essential detail: the natural numbers.

The natural numbers 1, 2, 3, 4, 5, 6…and so on, seem so “innocent” and “natural” and so familiar to us, that we call them the natural numbers. They don’t seem to deserve any justification. However, it turns out, they are not as “innocent” as they seem at first. Goedel’s most genius insight and key part of his proof was to discover and realise that one can actually use the natural numbers for the encryption/encoding of any finite axiomatic theory and any finite number of series of statements about numbers – hence even to encode all possible mathematical proofs !

Consequently, in his proof of theorem 2, Goedel constructed a function G that encrypted any statement and possible proofs of statements as a specific (usually very large) natural number (called the Goedel number). A proof within the system is defined as a certain finite sequence of statements of the system that begins with some axioms and ends with the statement itself and any two statements following each other in the sequence had to be generated only by the allowed rules of the system.

These natural numbers and their Goedel numbers had now 2 possible interpretations: first, they could be seen as just normal albeit large natural numbers and as the objects of the usual underlying mathematical theory and axioms (of arithmetics). On the other side, on a meta level, every such number could also now be read and understood as representing and encoding a statement about natural numbers or even an encoded proof!

Using a very clever variation of Russell’s paradox and the 2 parallel possible interpretations of the natural numbers, Goedel was then able to construct self referring statements within the system and encoding them again as natural numbers.

He defined a specific relation between natural numbers n and m, Proof(n,m), that is true if and only if n encodes a proof of the statement that is encoded by m. He then used Proof(n,m) to construct a self referring statement “S” that says on the meta level: “I am not provable”. Goedel then proved that neither S nor non S are provable within the system. The argument goes like this: First we assume that the underlying axiomatic system for which we want to proof the consistency is actually consistent. For, if it were not consistent, then we could proof anything. So, we assume consistency and now want to prove, that this is indeed the case.

So, let’s first assume that S is provable within the system, then S would also need to be true (because the system is assumed to be consistent, no false statements can be proven), but then it would be true that S is not provable as this is what S says about itself. Hence, the assumption that S is provable leads to a false statement, which means the assumption that S is provable cannot be true.

Now, let’s see if non S, the negation of S, could be provable. Let’s assume non S is indeed provable. Since the underlying system is consistent, then S cannot be provable as well. However, since S says “I am not provable”, it then is true. But now we would have both true S and non S, which again cannot be as we have assumed that the underlying system is consistent and hence not both statements can be true.

In summary, S (and non S) are not provable within the system when we assume its consistency. However, S expresses the consistency of the underlying system because it says that S itself is not provable. If we have even jus one single sentence in the underlying system that is indeed not provable, then the system MUST be consistent as it means that not every statement is provable. Hence, the unprovability of S is equivalent with the unprovability of the consistency of the underlying system.

For a more exact and formal sketch outline of Goedel’s proof that does not need a lot of mathematical knowledge, please see: https://en.wikipedia.org/wiki/Proof_sketch_for_Gödel%27s_first_incompleteness_theorem

With these astonishing theorems Hilbert’s program was practically dead, something that Goedel had not intended.

Goedel’s Incompleteness theorems, however did not say that consistency proofs are impossible, but they showed for such a proof one would always need to use additional unproven axioms or non admissible methods from outside of the system – and that would defeat the whole purpose of a consistency proof and would be a kind of cheating.

Pretty much at the same time as Goedel was working on his theorems, the Polish mathematician Alfred Tarski worked on a similar problem and had similar results. He showed that in any decently expressive formal system that allows the definition of the natural numbers that is consistent, the concept of “truth” could not be defined without creating contradictions within the system. That was another major blow to Hilbert’s program and goals. How could mathematics even be possible without being able to define the concept of “truth” within its axiomatic framework ?

Alan Turing – what is an Algorithm, what is Intelligence ?

The final problem posed by Hilbert in 1928 and still unsolved after Goedel’s results and Hilbert’s last hope was the so called Entscheidungsproblem: can there be an algorithm that decides of any given mathematical statement whether it is true or not in a finite number of steps ? After all, the Incompleteness theorems – as devastating as they were -did not exclude the possibility of the existence of such a decision procedure.

However, as you may guess by now, the answer to the Entscheidungsproblem is also negative. There can be no such algorithm and the person who proved this was the young and brilliant British mathematician Alan Turing (on left), who is today widely considered as one of the fathers of AI.

Turing was fascinated by Goedel’s work and knew about Hilbert’s program and his quest to find a decision algorithm. However, knowing Goedel’s theorems it was straight forward for Turing to prove that there cannot be a finite algorithm that correctly checks mathematical statements for their truth.

To prove the Entscheidungsproblem and to define what an algorithm actually is, he first developed his famous and conceptually very simple model of a general purpose computer – the so called (universal) Turing Machine, named after him. Instead of using complex mathematical proof techniques like Goedel or Alonso Church, who at the same time and independently also solved the Entscheidungsproblem, Turing used his much simpler and intuitive Turing machine model to solve the problem.

Turing first solved another problem the so called Halting Problem for Turing machines. The Halting Problem is the problem to decide, given any program simulated on a Turing machine and an input to it, whether the program will stop after finite steps at the end or not (disregarding time and memory constraints).

Using a similar logic as Goedel did in his Incompleteness theorem, Turing proved that if any such algorithm existed and were to be run on a Turing machine, the machine could be shown to run forever and halt at the same time which is a contradiction and hence the assumption could not be true. Using this negative result for the Halting problem, Turing then deduced that there also cannot be an algorithm for the more specific Entscheidungsproblem for mathematical theorems.

Other than the very theoretical results of Goedel, the Turing machine and Turing’s results have very practical every day consequences for computer scientists and programmers. For example, it’s an easy consequence of the unsolvability of the Halting and Entscheidungsproblem that there also cannot be a program in any of the common high level programming languages used today that could check whether any given program will get stuck in an endless loop (meaning it crashes) or not when run with certain inputs. That means, there is no universal program checker or debugger. So, don’t waste your time trying to develop such a program. But that does not mean that Microsoft and the other big software firms shouldn’t keep working on improving their products.

How intelligent can Computers be ? In Defence of the Turing Test

Turing understood the strong limitations Goedel’s theorems imposed on what we can do and achieve with computers and algorithms. He himself proved that computers cannot solve the Halting problem (and today we know of many other problems, that cannot be computed). So he asked himself the question, how intelligent can computers eventually become ? Are real intelligent computers possible ? He even got interested in the philosophy of mathematics and logic about these subjects and visited and discussed his views intensely with the philosopher Ludwig Wittgenstein, who had a very different opinion than Turing about all these issues.

Contrary to his fellow mathematicians, Turing was very familiar and even professionally involved in the development of the first major programmable computers. It is well known that Turing was part of the efforts cracking the Enigma codes of the German Nazis in World War II at Bletchley Park using and developing one of the first large scale computers. So, he had a much deeper and better understanding of early computer technology than Russell, Goedel, Hilbert and most of the mathematicians and logicians of his time.

It is therefore not surprising that Turing was fascinated about the early computers and what they could do and quite optimistic about the levels of intelligence machines can achieve despite the known theoretical limitations stemming from Goedel’s results. He even developed one of the first chess programs to see how well computers can perform on such tasks and whether they could reach human level skills. Turing asked himself, how intelligent can even potentially much more powerful computers of the future get.

As a result he published a famous paper titled “Computing Machinery and Intelligence” in 1950. In this paper he introduced the infamous Turing Test.

Turing argued that the question whether computers will ever be able to “think” or become “intelligent” could not be answered because the terms “thinking” and “intelligence” are not clearly defined and will remain hard to define going forward. To nevertheless somehow get a grip on how to judge the intelligence of machines and to have at least some agreeable practical criteria, Turing suggested his Turing test, a kind of “computer intelligence” test.

He suggested a machine or program should be considered “intelligent” if it could at least mimic and emulate well the natural language understanding and communication skills of humans at a level that makes it practically impossible to distinguish between humans and machines just by judging their respective language skills.

Turing chose a language test because, even though it is quite limited in terms of the intelligence it would measure, it still would need a lot of skills for a machine or program to ever master such as test as the program/machine would have to: “understand” language to a very high degree and comparable to humans, would need to be able to generate language in a human like quality, would need to have lots of knowledge and especially common sense knowledge about the world to have any type of chit-chat with a human. It would even have to have a “theory of mind” of humans as it would have to somehow be able to trick the human that performs the test into believing that itself is also a human. The machine would need to “understand” and be able to predict what a human would expect of it, it would need to “understand” how to “lie” about itself, pretending to be alive and conscious without being caught. It would need to have contextual understanding, memory, humour, social skills, “understand” politics, basic science, culture and so on and so on.

Although the Turing test proposal as a criterium for “general intelligence” has been criticised from many sides for good and bad reasons, a serious Turing test has still not been successfully passed by any computer or program even today. This shows that the test is actually very hard to pass and somehow matches and supports our intuition that machines today are still far from general human intelligence levels. AI systems can play chess and GO and soon also drive cars better than we humans do, but passing a serious Turing test is still a major milestone for AI waiting to be achieved. It shows, how far sighted Turing was with proposing this test model.

The AI Conference of 1956 – the Birth of AI ?

After WWII the center of research in many scientific areas and in mathematics and logic especially shifted from Central Europe to the USA due to the lack of funds after the war but mostly due to the many researcher immigrants that had left Europe fleeing from the Nazis and moved to America (among them Einstein, Goedel, von Neumann, Fermi and many others). Somebody should explain this to Donald Trump some day.

Most people in the AI community, especially the younger generation, don’t seem to be aware of the high level of pre WWII scientific culture and hence often date the birth of AI back to the famous Dartmouth conference of 1956 where the term “artificial intelligence” was formed. This is still the prevailing US centric view of the emergence of AI today. As I have tried to show here, this view is wrong and could not be more ignorant of the history of AI and computer science in general.

There would be no AI if it were not for the foundational crisis of mathematics and logic at the end of the 19th and the beginning of the 20th century and the tremendous efforts, achievements and genius insights of people like Frege, Russell, Hilbert, Goedel and Turing, to name but a few.

The Dartmouth conference was organised just 2 years after the tragic and early death of Alan Turing in 1954. A group of brilliant people met in Dartmouth to push the agenda of machine intelligence forward and each of them made remarkable and well known contributions to AI and some to science in general (like Minsky and Shannon) in the decades after and during the second half of the 20th century and got us to where we stand today in AI. The conference was an important milestone in the development of AI, no doubt, but it should not be considered its birth date – like our world culture should also not be considered to have started with the discovery of America.

Since the progress and the wide range of results in AI that have been achieved after this conference are very well known and documented, there is no need for me to repeat it here and I conclude my little historic journey into the beginnings and the emergence of AI.


Even after more than 75 years the relevance and the impact of Goedel’s theorems are still topics of discussions in AI circles and in the philosophy of science and mathematics today. Some scientists argue that Goedel’s theorems show principle unsurmountable hurdles for current machines to ever become really intelligent (or conscious for that matter) and that the theorems dealt a deadly blow to all efforts to build for example AGI (artificial general intelligence) and human level intelligence systems in AI.

I would rather draw another conclusion from Goedel’s theorems: the impact of his theorems can still be felt by the current dominance of mathematics in AI models and most AI approaches. However, all intelligence we know so far emerges only in biological, living systems, not in machines that are based on mathematical, logical or engineering principles like all the modern AI models are.

In hindsight, the theorems show us that we may have taken a wrong path to AI. The way forward in AI therefore rather lies in what I would call the “biology turn”.

The key question for AI in the 21st century therefore should be: how do biological systems achieve intelligence ? (see also my recent post: Cracking the Neural Code).

If current mathematics and computational logic cannot lead us to human level (artificial) intelligence and machine consciousness as it seems, then other sciences like systems-biology, systems-chemistry, computational biology, bio-physics, modern (theoretical) cell biology and modern cognitive (neuro) science should take over the lead instead.


Hong Kong, May 8, 2017



Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: