# “Singularity” is improperly used in AI

As I believe that artificial systems can do whatever we are able to do, developing AI systems able to perform AI research is particularly interesting. If such a system is realized, its performance will increase without our help: one day, we will have an artificial AI researcher that advances AI much better than we do. I have been working for many years on such a system: CAIA (Chercheur Artificiel en Intelligence Artificielle). If we succeed, AI will improve much more efficiently than it does now, when we are painfully trying to develop it ourselves. This step is usually called the “singularity”. I have sometimes used this word myself, but it was a mistake because it implies two false ideas: that this transition will lead to an abrupt change in performance, and that it will have horrendous consequences.

The word “singularity” is often used in mathematics. It can refer to a point where a mathematical object is not well-behaved; there is also singularity theory, one branch of which is catastrophe theory, where sudden and dramatic changes appear in the behavior of a system. Such a word suggests that a disaster will happen at this step. I think that neither aspect, suddenness or disaster, will be present when we have AI systems developing AI much better than ourselves.

I begin with the suddenness. It is not new that AI systems perform better than us in some domains, and this did not happen quickly. Let us consider Chess programs: the first work in this area was done in the 1950s. These programs played very badly: human players called a silly move “a computer move”. Then the programs were improved, reaching the level of a good player. In 1983, a system developed by Ken Thompson showed that two Bishops always win against one Knight; before that, all Chess players believed that it was a draw. In 1997, Deep Blue won a match against the world champion, Garry Kasparov. However, Kasparov was not crushed: he lost two games, but won one, and there were three draws. At present, the Elo rating of the best program is 3450, whereas it is only 2880 for the world champion. With such a difference, the probability of winning a game for the program is 98%; therefore, human chess players do not play against the best programs unless the programs are restricted.
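As a rough cross-check of the rating gap quoted above, here is a minimal sketch of the standard logistic Elo formula. The function name `expected_score` is only illustrative. The quantity it returns is an expected score (wins plus half of the draws), not strictly a probability of winning, and for a 570-point gap it comes out at roughly 0.96, in the same range as the figure given in the text.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the text: best program versus world champion.
program, champion = 3450, 2880
print(f"expected score of the program: {expected_score(program, champion):.3f}")
```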

All in all, thirty years were necessary before we could be confident of the superiority of Chess programs. Yet it is much easier to develop a Chess program than a program that develops AI! Moreover, to evaluate this kind of system, one must consider its performance in all the domains where AI can be used: game playing, theorem proving, medicine, automatic translation, and so on. We will have to measure the performance of the general AI system in each of them! This progression will take many decades; there will not be a particular moment when these systems become better than us in every domain. Even for a particular domain, as with Chess, the period during which the performance of AI systems overtakes ours will be spread over several years. The situation is much more complex than for Chess: instead of finding, for one domain, whether the AI system performs better than human beings, we must find whether an artificial AI researcher performs better than human AI researchers in every domain. To do that, one must compare the results obtained in many domains. For a long period, the artificial researcher will be better in some domains, worse in several others, and of equal strength in the rest. We do not have a single general algorithm whose performance will steadily increase with time, but a huge amount of data, and many programs that create other programs. The situation is very difficult to assess.

For a rather long time, it will be difficult to decide whether artificial systems are better than human beings. However, the artificial ones will already provide some useful results. This will last until, some day, it becomes clear that artificial AI researchers are better than ourselves, as is now the case for Chess. It is possible that this day will never come because we humans are not clever enough to develop such AI systems. In any case, it is evident that the suddenness of a mathematical singularity will not occur.

Let us now consider the other aspect: when AI is much better than ourselves, a disaster will occur. I have already considered this problem in another blog; I believe this fear is based on too restrictive an idea of intelligence. On Earth, and also on zillions of planets, intelligent life may appear. It will be created by evolution, a very efficient method when a huge number of individuals interact for a huge number of years. This competition results in very aggressive beings: if there are aliens, we must be extremely cautious. Their kind of intelligence would be so unlike our own that we could not communicate. If they discover our planet, they will have no hesitation in destroying us, just as we do when we wipe out an anthill.

However, for AI systems, we must not take an approach similar to evolution: it would require too much time. Personally, I am trying to bootstrap AI: what already exists helps me to improve the system. I systematically take a module and replace it with a new module that will create, among other things, something similar to the initial module. This leads to modules that can improve parts of themselves. In the same way, at present, our computers are designed with the help of computers: if they did not exist, we would be unable to conceive them.

Naturally, this may produce good and bad results. It depends on what we do with it; sometimes, computers have led to questionable consequences. However, the future of AI will certainly not be what is implied by the word “singularity”: there will not be a discontinuity, the change will happen over a considerable period of time, and a disaster will not necessarily occur.

# CAIA as a mathematician (Part 3)

All in all, the explanation given for CAIA’s solution of the Saint-Exupéry problem is very simple. It includes only two important choices: which disjunction to use for the backtrack, and which value to use for L. For all the other steps of its proof, CAIA uses a combinatorial method: it applies every one of its mathematical rules that can be executed, and it examines the result. Either it discards this result because it does not seem interesting (and this step never appears in the explanation), or it keeps it for later use with its mathematical rules (and it appears in the explanation only if it is used). For the simplest kind of meta-explanation, it is sufficient to indicate the deduction rules appearing in the explanation: their conditions indicate why they could be executed.

For instance, in the proof given in Part 1, one applies to constraint [3] a rule that leads to the disjunction [22]. To do that, CAIA seeks information on the possible power of a prime N in the factorization of every variable: V is not a multiple of N, or V may be a multiple of N, of N², but not of N³, and so on. P, which is prime, is in our problem either odd or a multiple of 2 but not of 4. The rule considers the combinations of these values, for all the variables of the constraint, that do not give a false result. CAIA tries it for N=2 (which gives [22]), 3, and 5. The conditions of this rule indicate that it is applied to equality constraints with at most 3 variables and a degree greater than one. For N=3 or 5, this gives no useful result for the Saint-Exupéry problem; therefore, these applications do not appear in the explanation.

The same rule is also applied with N=2 to constraint [2], which satisfies the necessary conditions. In the various branches, it shows that P=2, in [34], [70], and [101].
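The way such a rule can produce [22] from constraint [3] may be sketched as follows. This is only an illustration for N=2, not CAIA's actual rule: it assumes that each unknown is described merely as "even" or "odd", and it keeps the parity combinations that remain compatible with the sum of two squares being a square, working modulo 4. The function names are hypothetical.

```python
from itertools import product


def square_residue(parity: str) -> int:
    """Residue of V*V modulo 4 when only the parity of V is known."""
    return 0 if parity == "even" else 1


def parity_cases():
    """Parities of (A, B, C) compatible with A*A + B*B = C*C modulo 4."""
    cases = []
    for a, b, c in product(("even", "odd"), repeat=3):
        if (square_residue(a) + square_residue(b)) % 4 == square_residue(c):
            cases.append((a, b, c))
    return cases


for case in parity_cases():
    print(case)
```

Working modulo 2 alone would not be enough: it would also accept A odd, B odd, C even, which modulo 4 excludes.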

The generation of disjunctions, and the choice of one disjunction when CAIA decides to backtrack, are important for the success and the simplicity of the proof. However, they often affect only the quality of the proof, rather than its existence. For Saint-Exupéry, CAIA found twelve disjunctions before backtracking. Whichever of them CAIA chooses, it always finds the solutions. The difference is in the size of the tree: with the chosen disjunction, the tree has 4 leaves. It has 11 leaves in the worst case, when it uses the following disjunction:

P≡4 (mod 5) OR P≡3 (mod 5) OR P≡2 (mod 5) OR P≡1 (mod 5) OR P=5

Another aspect could be added to meta-explanations: why was the result found quickly? Many derivations can be made, and it is better to make first those that are most likely to succeed. CAIA knows why it has given a high or a low priority to a deduction, but I did not include its reasons in the present meta-explanation: this would complicate it too much, and it would not be easy to give CAIA the ability to create this kind of meta-explanation.

One could also add another possibility to a meta-explanation: presenting some derivations that were made, that were reasonable to try, and that failed. For other problems, they could lead to the solution. One would then have “Why not?” meta-explanations. This could be good for students, encouraging them to add such methods to their tools, and helping them to define more accurately when they are useful.

Therefore, it is possible to associate with each step of a solution a meta-explanation that indicates why one has considered this step. This is essential for teaching mathematics: the student must learn how one can find a proof. Unfortunately, this step is often missing in the teaching of mathematics. As a result, many students believe that one must have a gift to do mathematics. As they believe that they are not gifted, they do not even try to find the solution.

However, it is a difficult task for mathematicians to indicate why they have found a proof. Poincaré, for instance, had an illumination as he boarded an omnibus in Coutances, but he did not know why. The unconscious plays a significant role in discovery, and we do not know why we have thought of the key idea of a proof.

AI systems may have a huge advantage over us. Certainly, as for the Saint-Exupéry problem, an AI system does not always find a solution as subtle as the one found by a mathematician. However, its solution may be easier to understand, and one can meta-explain it more easily. With little effort, CAIA could be made to create the meta-explanations by itself.

AI systems have a huge advantage over ourselves: one can design them so that they are conscious of the reasons for their acts far better than we are. Consciousness is an important reason why we are more clever than animals; therefore, meta-explanation is an extraordinary source of improvement for future AI systems.

However, solving problems is only a part of the activities of a mathematician. At present, CAIA is not able to do most of the others, for instance receiving the description of a new theory and trying to find interesting results in this theory. I began working on this problem 60 years ago, for my thesis; CAIA still cannot do it.

# CAIA as a mathematician (Part 2)

Let us consider the beginning of the proof given on the Diophante site. Three new unknowns appear, defined by:

M/N=(C-A)/B=B/(C+A), the parities of M and N being different. This gives:

A/(N²-M²)=B/(2*M*N)=C/(M²+N²)=D

D is an integer because no two of the denominators share a common factor.

Therefore, one has to solve:

2*M*N*(N-M)*(N+M)*D²=311850*P

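As either M or N is even, the left-hand side is a multiple of 4; since 311850 is twice an odd number, P must be even, and the only even prime is 2. Therefore P=2:

M*N*(N-M)*(N+M)*D²=2*3⁴*5²*7*11, thus D is a divisor of 3²*5=45.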

The end of the proof is given on the site. After several skilled deductions, and some local backtracks, both solutions are found. Clearly, the two proofs are completely different.
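The end of the human proof is not reproduced here, but the parametrization above can be explored by a small exhaustive search. The following is a sketch under stated assumptions, not the Diophante argument itself: it takes P=2 as already established, assumes that M and N are coprime with M < N, and simply tries every admissible D, M, and N. The name `parametrized_solutions` is hypothetical.

```python
from math import gcd

TARGET = 311850  # value of M*N*(N-M)*(N+M)*D**2 once P=2 is known


def parametrized_solutions():
    """Triples (A, B, C) obtained from A=D(N²-M²), B=2DMN, C=D(M²+N²)."""
    found = []
    for d in (1, 3, 5, 9, 15, 45):              # the divisors of 45
        if TARGET % (d * d):
            continue
        rest = TARGET // (d * d)
        n = 2
        while n * (n * n - 1) <= rest:           # smallest product for this n (m=1)
            for m in range(1, n):
                coprime_mixed_parity = gcd(m, n) == 1 and (m + n) % 2 == 1
                if coprime_mixed_parity and m * n * (n - m) * (n + m) == rest:
                    found.append((d * (n * n - m * m),
                                  2 * d * m * n,
                                  d * (m * m + n * n)))
            n += 1
    return sorted(found)


print(parametrized_solutions())
```

The search space is tiny because the product M*N*(N-M)*(N+M) grows like the cube of N.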

This is an excellent proof. The value of P is found without backtracking; the mathematician who discovered it is very clever. However, I had some difficulty understanding it, and more importantly, I cannot see how I could have found it. By contrast, I easily understood CAIA’s proof, and I see how it could be possible to generate a meta-explanation, which indicates how it can be found. At present, CAIA can automatically create explanations, but it cannot yet create meta-explanations: I have more important problems to solve for bootstrapping AI, and I cannot spend my time on a problem that will not help to advance the bootstrap.

However, I will indicate some elements of a meta-explanation, easily taken from CAIA’s explanation. When it solves a problem, CAIA may have to choose one of the three following actions:

To decide to backtrack, and to choose a disjunction.

To choose a deduction that will enable it to create a new constraint from the already known constraints.

To normalize a new constraint. To do that it applies the simplification rules. For instance, it replaces by 0 a product where one of the factors has a zero value. No meta-explanation is given for this step: the simplification rules are always applied to a new constraint.

CAIA backtracks when no deduction is left in its to-do list. Then it selects the most interesting disjunction. The interest of a disjunction depends on the number of its choices and on the interest of its least interesting element: an equality with one unknown is better than one with several unknowns; an equality is better than a congruence; when there is an AND inside the ORs, creating several new constraints is better than creating only one; etc.

During the initial phase of the Saint-Exupéry problem, CAIA does not find the value of any unknown, although it tries all the possible deductions. At that time, it has generated 12 disjunctions. One of them considers the parity of two unknowns; another one, [22], defines the parity of three unknowns; therefore, it prefers the second disjunction. Most of the other disjunctions are not so interesting, for instance: A is a multiple of 11 OR B is a multiple of 11, which adds a congruence for only one unknown. However, another promising candidate was: P is odd OR P=2. It was not chosen, but it would have given a completely different proof.

When a rule has been executed, the conditions of this rule indicate why it has been chosen. Naturally, many constraints have been created, which are not used in the explanation. In that situation, I add nothing to the meta-explanation.

Let us consider some steps of the explanation given by CAIA, described in the preceding blog.

After creating [39] where a product is equal to a constant, one considers the rule that creates a disjunction indicating all the possible correspondences between the elements of the product and the factors of the constant. This is also used after [73] and [104]. No meta-explanation is necessary: when the conditions of this rule are satisfied, one executes it.

When a constraint such as [48], [82], or [113] is a disjunction of the values of N unknowns, and if another constraint gives the value of another unknown from the values of some of the N unknowns, one creates a new disjunction which defines the values of N+1 unknowns. Naturally, one checks that each element satisfies the other constraints of the problem that may exist on these N+1 unknowns, for instance B<=A. For the first two cases, one has a contradiction: for every element, there is no possible value for the new unknown. For [113], this gives the two solutions of [115]. In this case, there is no meta-explanation to give: one considers all the disjunctions satisfying the conditions, and for each one, one applies successively all the constraints that could define the value of a new unknown.

The use of the rule leading to the proof of [62] and [94] is more difficult to meta-explain, because CAIA has to make a choice. This rule indicates that, if one has a constraint A=B, one also has A ≡ B (mod L) for any integer L greater than 1. The difficulty is that there is an infinite number of possible values for L. In the definition of this rule, an algorithm generates a set of possible values for the modulus L. Then, one only tries these moduli, and one keeps a result when it is interesting.

For instance, if an equality constraint includes an unknown V to the power 5, and if V ≡ 3 (mod 15), then one considers L=45. The new constraint will be simplified since V⁵ ≡ 18 (mod 45) when V ≡ 3 (mod 15). This table, giving the moduli to consider depending on the power of an unknown and its congruences, was automatically generated. In the present proof, when B is odd, one considers for L the values 2, 4, and 8. With L=8, and when B and C are odd, we have B² ≡ 1 (mod 8) and C² ≡ 1 (mod 8); after simplification, A²+B² ≡ C² (mod 8) becomes A² ≡ 0 (mod 8), simplified into A ≡ 0 (mod 4).
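The congruences used in this paragraph are easy to check numerically. The snippet below is only a verification sketch, not the way CAIA generates its table of moduli; the variable names are illustrative.

```python
# V ≡ 3 (mod 15) leaves three residues modulo 45: 3, 18 and 33.
fifth_powers = {pow(v, 5, 45) for v in range(3, 45, 15)}

# Squares of the odd residues modulo 8.
odd_squares = {pow(x, 2, 8) for x in range(1, 8, 2)}

# Residues A modulo 8 whose square is 0 modulo 8.
roots_of_zero = sorted(a for a in range(8) if (a * a) % 8 == 0)

print(fifth_powers, odd_squares, roots_of_zero)
```

The three results are a single value for the fifth powers, a single value for the odd squares, and only multiples of 4 for the last set, which is what allows A² ≡ 0 (mod 8) to be simplified into A ≡ 0 (mod 4).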

In the next blog, I will add more comments on meta-explanations, based on CAIA’s solution for the Saint-Exupéry problem.

# CAIA as a mathematician (Part 1)

It is interesting to compare the solutions found by CAIA with those found by a human mathematician. Diophante is a very useful site, which proposes many mathematical problems, with one, and sometimes several, solutions. I will consider the solution found by CAIA, and the one given by Diophante for the same problem. One finds it by clicking in sequence on « problèmes par thème », then « A. Arithmétique et algèbre », « A1. Pot pourri », and finally « A10168. Le défi de Saint-Ex ». This problem was found by Antoine de Saint-Exupéry a few days before he was shot down with his plane during World War II. Rated 3 out of 5 by the site, it is of medium difficulty.

CAIA has to solve the following problem: Find the value of four unknowns A, B, C, and P, which are positive integers satisfying 3 constraints:

[1] P is prime

[2] A*B=311850*P

[3] A² + B² = C²

CAIA begins by looking for symmetries, which are permutations of the unknowns that lead to the same set of constraints. This meta-problem, finding symmetries, has been defined for CAIA; it solves it using the same methods as for the other problems. It finds, very easily, that there is only one symmetry, which transforms A, B, C, P into B, A, C, P. As it does not want to generate symmetrical solutions, CAIA adds the constraint:
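To make the notion of symmetry concrete, here is a numerical sketch. It is not CAIA's method, which solves the meta-problem with its own constraint techniques. The heuristic assumed here is that a permutation of the unknowns is accepted when, on a small grid of sample points, the collection of constraint residuals is unchanged. The helper names `fingerprint` and `symmetries` are hypothetical.

```python
from itertools import permutations, product

NAMES = ("A", "B", "C", "P")


def is_prime(n: int) -> bool:
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def fingerprint(a, b, c, p):
    """Tagged residuals of constraints [1], [2] and [3] at one sample point."""
    return sorted([
        ("prime", int(is_prime(p))),
        ("equality", a * b - 311850 * p),
        ("equality", a * a + b * b - c * c),
    ])


def symmetries(sample=range(1, 6)):
    """Permutations of the unknowns that leave every fingerprint unchanged."""
    points = list(product(sample, repeat=4))
    found = []
    for perm in permutations(range(4)):
        if all(fingerprint(*pt) == fingerprint(*(pt[i] for i in perm))
               for pt in points):
            found.append(tuple(NAMES[i] for i in perm))
    return found


print(symmetries())
```

Besides the identity, the only permutation that survives is the exchange of A and B, in agreement with the symmetry reported in the text.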

[4] B<=A

CAIA performs many deductions when solving a problem. When the problem is solved, it extracts the deductions that are necessary to justify the solution: this is the explanation. I will now give the explanation found by CAIA for the Saint-Exupéry problem. Before each new constraint, I indicate, between brackets, its number as given by CAIA. The missing numbers belong to constraints that are not necessary for the explanation. The first useful constraint that it generates is:

[22] [A even AND B even AND C even] OR [A even AND B odd AND C odd] OR [A odd AND B even AND C odd]

Then, it decides to backtrack, successively considering the three possibilities of [22].

Case 1. A even AND B even AND C even

In [2], A and B are even, so A*B is a multiple of 4, whereas 311850 is not a multiple of 4. Therefore, P is even; the only even prime is 2.

[34] P=2

[39] A*B=623700

[48] [A=3850 AND B=162] OR [A=2310 AND B=270] OR [A=1650 AND B=378] OR …

There are 30 possibilities in this disjunction. CAIA finds them from the factors of 623700, keeping only the values that satisfy the three conditions: A and B are even, and B<=A.

From [48], CAIA creates a new constraint where, for each pair of values for A and B, it adds the value of C obtained from [3]. If A² + B² is not a perfect square, the pair contributes nothing to this new constraint. Here, this happens for all 30 pairs of values for A and B. Therefore, the new constraint is FALSE: there is a contradiction.

Case 2. A even AND B odd AND C odd

From [3], CAIA generates the constraint A² + B² ≡ C² (mod 8). CAIA simplifies it, first using the fact that if x is odd, then x² ≡ 1 (mod 8). The constraint becomes A² ≡ 0 (mod 8), which becomes:

[62] A ≡ 0 (mod 4)

Then, A*B is a multiple of 4, so:

[70] P=2

[73] A*B=623700

[82] [A=623700 AND B=1] OR [A=1540 AND B=405] OR [A=1100 AND B=567] OR …

There are 36 possibilities in this disjunction, coming from the factors of 623700 and checking for each one that A is even, B is odd, and B<=A. For each pair, one determines, if possible, the value of C, using [3]. Here again, A² + B² is never a perfect square. Therefore, the new constraint is FALSE: there is a contradiction.

Case 3. A odd AND B even AND C odd

CAIA again considers A² + B² ≡ C² (mod 8), which this time becomes:

[94] B ≡ 0 (mod 4).

Following the same steps as in the preceding case, CAIA finds:

[101] P=2

[104] A*B=623700

[113] [A=1925 AND B=324] OR [A=1155 AND B=540] OR [A=825 AND B=756] OR …

There are 24 pairs in this disjunction. Computing C for each pair, only two remain:

[115] [A=825 AND B=756 AND C=1119] OR [A=1155 AND B=540 AND C=1275]

This gives the two basic solutions for this problem. There are also two symmetrical solutions, where one swaps the values of A and B.
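The three cases above can be replayed by brute force. The following sketch is not CAIA's proof: it takes P=2 for granted, enumerates the factor pairs of 623700 with B<=A, sorts them into the three parity cases, and keeps the pairs for which A² + B² is a perfect square. The function names are illustrative.

```python
from math import isqrt

PRODUCT = 311850 * 2  # constraint [2] with P=2


def case_of(a: int, b: int) -> int:
    """Case number used in the text, from the parities of A and B."""
    if a % 2 == 0 and b % 2 == 0:
        return 1
    return 2 if a % 2 == 0 else 3


def explore():
    counts = {1: 0, 2: 0, 3: 0}
    solutions = []
    for b in range(1, isqrt(PRODUCT) + 1):     # B<=A forces B*B <= A*B
        if PRODUCT % b:
            continue
        a = PRODUCT // b
        counts[case_of(a, b)] += 1
        c = isqrt(a * a + b * b)
        if c * c == a * a + b * b:              # constraint [3]
            solutions.append((a, b, c))
    return counts, sorted(solutions)


counts, solutions = explore()
print(counts)
print(solutions)
```

The counts reproduce the 30, 36 and 24 possibilities of [48], [82] and [113], and the two surviving triples are those of [115].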

In the next blog, I will compare a human method for solving this problem with CAIA’s.

# Thinking, Fast and Slow

At least two winners of the Nobel Prize in Economics have made important contributions to the understanding of human intelligence. Naturally, Herbert Simon is one of them; later, Daniel Kahneman wrote Thinking, Fast and Slow, in which he shows the coexistence of two systems in our brain.

System 1 is fast, and it operates automatically without voluntary control: it jumps to the result. Unfortunately, in return for this immediate response, its answer may be wrong. Moreover, we cannot justify our solution. We have already seen an example of a result found by our system 1: the question was about a beverage and an animal. We had often used a beverage linked to this animal, hence the mistake. System 1 is very efficient in situations where there are many examples, and where one can evaluate the results accurately. In that case, all the conditions are met for learning to go well.

When the situation is not favorable to learning, we do not see our mistakes, and we often accept them, although they contradict the laws of logic. The author describes an experiment where Linda is a thirty-one-year-old single woman, outspoken and very bright. She majored in philosophy, was deeply concerned with issues of discrimination, and participated in antinuclear demonstrations. Then, he asks which alternative is the more probable:

Linda is a bank teller.

Linda is a bank teller and is active in the feminist movement.

More than 85% of the students chose the second option, although this choice is obviously contrary to the laws of probability: a conjunction cannot be more probable than one of its parts. Many students are convinced that their answer is the right one: when, very angry, the author told his students that they had violated a logical rule, one of them shouted “So what!” Anyway, we can understand the answer of those students: from the description of Linda, their system 1 immediately tells them that she is a feminist. Naturally, they chose the alternative that mentions she is a feminist.

The fast system often gives excellent results. However, one must have seen a huge number of examples before competency can be obtained. Therefore, an expert may have excellent performances. We are using system 1 when we are driving a car, when we are playing speed chess, when a man watches a woman (and vice versa), etc. When we have a result, we are convinced that it is right, although we cannot explain it. We often call this mechanism “intuition”.

With the slow system 2, introspective consciousness allows us to know a very small part of what happens in our brain, and to use it. The tasks where we must keep some intermediate results in our working memory are also performed with system 2. For instance, we use it for computing products in our head, such as 47×28. When the fast system is in a situation where it cannot give an answer, because the situation is new, or when it knows that it is not good at some kind of problem, it turns on the slow system. Unfortunately, it does not always start its colleague, even when it would be necessary.

It is interesting to compare the operation of our brain with that of an AI system. Neural networks have similarities with the fast system. They have led to several recent successes of AI, such as self-driving cars and Go playing. Many examples are necessary to define a network, and it cannot explain its solutions. Neural networks are particularly powerful for applications where perception is important, which are also those where we use system 1.

However, AI systems also have to solve problems where the preceding methods cannot be used because there are not enough examples with a correct evaluation. A method widely used in AI is to develop a tree, which can be done when a finite set of possible actions is known. Far better than us, AI, using fast computers, can develop huge trees, where many positions are considered. Then, one has only to choose sequences of actions that surely lead to a solution. For us, it is a slow method, performed by system 2. However, it should be supplemented by a fast method, which has to evaluate the value of the leaves. For game playing programs, one uses an evaluation function, which has the characteristics of system 1: fast and no explanation. System 1, used by both humans and artificial systems, has been improved: it knows that some situations are correctly evaluated, and that some are not. In that way, when the tree is too large, one can stop the generation of the tree only at correctly evaluated positions. A very important improvement of system 1 would be that, as in this example, it gives a value, but also a value of this value (from totally accurate to very dubious). Unfortunately, humans often assess this meta-value inaccurately; a chapter of the book is about the illusion of validity.
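The cooperation between a tree search and a fast evaluation that also reports how much it can be trusted may be sketched on a toy game. Everything below is an assumption made for illustration: the game (remove 1 to 3 stones, taking the last stone wins), the function names, and the rule that the evaluation is trusted only on positions that are multiples of 4. The search stops at its depth limit only where the evaluation is declared reliable, and keeps developing the tree elsewhere.

```python
from typing import Tuple

MOVES = (1, 2, 3)  # toy game: remove 1 to 3 stones, taking the last stone wins


def evaluate(stones: int) -> Tuple[int, bool]:
    """Fast guess for the player to move: (value, is this value trustworthy?)."""
    if stones % 4 == 0:
        return -1, True    # well-known pattern: the player to move loses
    return 0, False        # no reliable opinion about this position


def negamax(stones: int, depth: int) -> int:
    """Value (+1 win, -1 loss) of the position for the player to move."""
    if depth <= 0:
        value, reliable = evaluate(stones)
        if reliable:
            return value   # stop only at correctly evaluated positions
    best = -1              # with no stones left, the player to move has lost
    for move in MOVES:
        if move <= stones:
            best = max(best, -negamax(stones - move, depth - 1))
            if best == 1:
                break
    return best


def best_move(stones: int, depth: int) -> int:
    """Move with the highest negamax value for the player to move."""
    return max((m for m in MOVES if m <= stones),
               key=lambda m: -negamax(stones - m, depth - 1))


print(best_move(10, depth=2))
```

With 10 stones, the search recommends removing 2 stones, which leaves the opponent on a multiple of 4.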

AI systems may also have more possibilities: for instance, they can analyze the formulation of the problem, and build modules that can solve it. I did it fifty years ago for a General Game Playing program. I had also implemented, in a learning system, an Explanation-Based Learning module that first generated an explanation of what happened in a game. In that way, it was possible to generalize from only one example, and to apply an analogous method in new positions. In both cases, a huge number of situations is no longer necessary, as it is with system 1. We humans use system 2 for such activities. For us, system 1 and system 2 cooperate, in the same way that weak AI and strong AI must also cooperate.

# Bootstrapping CAIA Part III

For the realization of a General Problem Solving system, a number of meta-problems arise. The main idea of bootstrapping is that these meta-problems will be solved by the system itself, in the same way as it solves the problems for which it was designed. CAIA is quite an elaborate system; however, it is not yet able to solve every type of problem, and particularly many kinds of meta-problems. I gradually increase its capacities so that it can solve a wider range of problems. For instance, a useful feature for solving more mathematical problems has been the introduction of unknowns with an infinite set of possible values.

It turns out that two families of meta-problems can be solved by CAIA, using its current formalism:

To find the symmetries from the study of the formulation of a problem. We have already considered several times why this capacity is interesting.

To add new elements to a particular family of problems. A researcher has to test his/its system with problems at various levels of difficulty. Unfortunately, there are often too few of them, or they are not difficult enough: as they are meant to be entertaining, very complex problems, necessary for checking the system, may be absent. Therefore, I have associated with several families of problems the definition of a meta-problem that creates new problems in this family.

Unfortunately, many meta-problems have a definition completely different from the problems currently solved by CAIA, and also by most of the general systems. Some of the meta-problems encountered when one wants to solve a problem are:

How to cope with a new result: keep it or eliminate it?

Is it better to backtrack immediately, or to wait for a little, hoping to find a useful result?

If one decides to backtrack, which set of choices will be considered?

How to select the next rule and the elements to which it applies?

What is the probability that a particular step will succeed?

What would be the interest of the result of a derivation if it succeeds?

For a bootstrap to succeed, these meta-problems must also be solved by CAIA. At present, it solves them by using methods that I have found, and it can neither create nor modify them. Therefore, the next step is to give it the capacity to solve meta-problems such as those I have just mentioned.

Fortunately, one can use methods similar to those already successful for more usual problems. For instance, one can consider two levels of backtrack. At the lowest level, one successively examines the situation after adding one constraint from the set that contains the various possibilities. At the highest level, one successively examines the sets of constraints that are known at this step of the resolution: a meta-backtrack is a backtrack on the backtracks that can be considered.

Looking again at the Triplets problem with N=12, one quickly finds two possibly interesting backtracks. They are defined by the set of constraints that one has to consider successively:

On the one hand: 12Q+12C=24A or 23A or 22A….or A, 24 choices in all.

On the other hand: (B even and C even and R even, and P even) or (B even and C odd and R odd and P even) or…(B odd and C odd and R odd and P odd), 11 choices in all (5 of the 16 possible choices have been eliminated).

Choosing the best way to backtrack is a meta-problem, which can be solved by opting for the first one (a linear equality is better than a parity constraint) or for the second one (only 11 branches are better than 24, one unknown is also better than three; moreover, at each step, one adds four new constraints rather than only one). However, it is also possible to meta-backtrack: one considers both, then one focuses one's efforts on the one whose results are more promising (in this problem, the equality constraints). This makes it possible to solve the problem successfully, even when the evaluation of these backtracks is unsatisfactory.

Many meta-problems may be solved by a set of judgements, whose synthesis gives a value that makes it possible to rate and rank the candidates. This is often sufficient when some uncertainty is allowed because there are not too many possibilities to consider, and computers are very fast. When the decision is very important, such as choosing a particular backtrack, these judgements are sometimes too approximate. Either one tries to improve the quality of these judgements, or one meta-backtracks.

Naturally, there remains another meta-problem: to find the judgements that one has to use for solving each kind of meta-problem. For the preceding example, the basic elements for these judgements were general: it is better that there are not too many branches in a backtrack, it is better to add each time several constraints rather than only one, an equality constraint is more selective than a parity constraint, adding a constraint with one unknown is better than adding one with several unknowns. They allow a pre-selection which will be improved by the meta-backtrack. In this particular case, one is not very far from the end of the ascent of the meta-levels necessary for the successful completion of the bootstrap. Many other kinds of meta-problems have to be solved, but bootstrapping AI is perhaps not as hard as it sounded.
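The conflict between these judgements can be made explicit with a small sketch. It is an illustration only, not the way CAIA synthesizes its judgements. Each candidate backtrack is described by the four criteria listed above, using the figures quoted for the Triplets problem, and a candidate is discarded only when another one is at least as good on every criterion. When several candidates survive, no single choice is safe and a meta-backtrack is called for. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backtrack:
    name: str
    branches: int                  # fewer branches is better
    constraints_per_branch: int    # more new constraints per branch is better
    is_equality: bool              # an equality beats a parity constraint
    unknowns_per_constraint: int   # fewer unknowns is better

    def judgements(self):
        """Criteria oriented so that a larger value is always better."""
        return (-self.branches, self.constraints_per_branch,
                int(self.is_equality), -self.unknowns_per_constraint)


def dominates(x: Backtrack, y: Backtrack) -> bool:
    jx, jy = x.judgements(), y.judgements()
    return jx != jy and all(a >= b for a, b in zip(jx, jy))


def undominated(candidates):
    """Candidates that no other candidate beats on every criterion."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


linear = Backtrack("linear equality", 24, 1, True, 3)
parity = Backtrack("parity", 11, 4, False, 1)
survivors = undominated([linear, parity])
print([c.name for c in survivors])
```

Both candidates survive here, which matches the situation described for N=12: the judgements only pre-select, and the meta-backtrack decides.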

# The Singularity Part III

Overall, it does not seem that these arguments show that the existence of AI systems with a super-human intelligence is impossible. However, I am not totally sure that human beings will some day realize such systems, for two reasons, both depending on the limitations of human intelligence.

Firstly, are we intelligent enough to succeed? We must create systems that can create systems better than those that we have created. This is very difficult: we have to write rules that write rules, which is far from obvious. I cannot do it directly: I begin by writing something that looks satisfactory. Then, I run it on the computer; usually, it does not work. I improve the initial version, taking into account the observed failures. It is possible that, over the years, we will become better at defining meta-knowledge that creates new meta-knowledge, but it will always be a very difficult activity.

Secondly, the scientific approach is excellent for research in most domains: physics, computer science, and even AI as long as we do not try to bootstrap it. Usually, the reader can observe an improvement of the performances. When one is bootstrapping AI, the progress is not an improvement of the performances, but an increase of the meta-knowledge that the system is capable of generating. Unfortunately, this does not immediately lead to better results. It is difficult for a reader to check this improvement for a system that contains 14,000 rules, such as CAIA.
Moreover, this meta-knowledge has only a transitional interest: it will soon end up tossed into the wastebasket. Indeed, in the next step of the bootstrap, it will be replaced by meta-knowledge generated by a system such as CAIA: its goal is to replace everything I gave to CAIA by meta-knowledge that CAIA has itself created, with a quality at least equal. We must not aim for perfection: we have no time to waste on elements for single use only. The success of a bootstrap can only be assessed at its end, when the system runs itself, without any human intervention: when it has reached the singularity.

To sum up, I think that AI systems much more intelligent than ourselves could exist: there is no reason why human intelligence, which results from evolution, could not be surpassed. However, it is not obvious that our intelligence has reached a level of excellence sufficient to achieve this goal. We need external assistance, and AI systems are the only intelligent beings that can help us; this is why it is necessary to bootstrap AI.
Unfortunately, we are perhaps not clever enough to realize this bootstrap: we have to include a lot of intelligence for designing the initial version, and for the temporary additions during the following stages. We also have to evaluate and monitor the realization of this bootstrap with methods different from those rightfully used in all the other scientific domains.

It seems that people outside AI have more confidence in the possibility of a singularity than those inside AI, which looks like a church whose priests have lost their faith. A recent report, One Hundred Year Study on Artificial Intelligence, defines many interesting priorities for weak AI. However, its authors do not strongly believe in strong AI, since they have included this self-fulfilling prophecy:
“No machines with self-sustaining long-term goals and intent have been developed, nor are they likely to be developed in the near future.”
Naturally, I disagree. Moreover, during the search for singularity, we will develop a succession of systems, which will be more general, and could sometimes be more efficient, than those obtained with weak AI.

Even if we are not sure of succeeding, we must try before our too limited intelligence leads our civilization to a catastrophic failure.

# The Singularity Part II

Walsh's interesting paper considers six arguments against the singularity.

The fast thinking dog argument
Computers are fast. I agree that this is not fundamental for achieving our goal. Intelligence is more than considering many possibilities as fast as possible. If one handles them badly, one can waste a lot of time. However, speed can be very useful.

The anthropocentric argument
Many suppose that human intelligence is something special, and they assume that it is enough to design a system which could reach the singularity. Here again, I completely agree with Walsh: our intelligence is only a particular form of intelligence, which evolution allows us to have. Why would this state allow us to realize systems much cleverer than ourselves? And even if we create them, it will perhaps not be enough to reach the singularity.

The meta-intelligence argument
The capacity to accomplish a task must not be confused with the capacity to improve the capacity for accomplishing tasks. With present methods, excellent results have been obtained in several domains; however, the systems have always been realised by teams of many experts; it is not an AI system that solves the problem. Therefore, if a system is learning to play Go, it does not learn to write better game playing programs. An improvement at the basic level, solving a particular problem, does not lead to an improvement at the meta-level, solving large families of problems.
However, there are exceptions: CAIA uses the same methods for solving a problem as for solving some meta-problems. For instance, it finds symmetries in the formulation of a particular problem. Finding the symmetries of a problem (which is a meta-problem) will improve CAIA's performances for solving this problem. In this case, it is bootstrapping.
Unfortunately, this situation happens rarely. The reason is that most meta-problems are not defined like the problems solved by AI systems, which have a well-defined goal. Usually, the goal of a meta-problem is vague: can we tell that the monitoring of the search for a solution is perfect? We are glad to have solved the problem and we feel that we have not wasted too much time, but is it possible to do it better? Such goals cannot be defined as well as checkmate in chess. For achieving a bootstrap successfully, one must solve many meta-problems, where one is interested in the way problems are solved. They are often very different from the problems for which AI researchers have developed efficient methods. However, learning to monitor the search for a solution would be useful for many problems, including this meta-problem itself: a virtuous circle would be closed. This is a part of the singularity.

The diminishing return argument
It often happens that we have very good results when we begin the study of a family of problems. This explains the hyper-optimistic predictions made in the beginning of AI: we did not see that, for progressing just a little more, a huge amount of work is necessary. Here, I do not completely agree: it may happen that discontinuities suddenly entail an impressive progress. For instance, the appearance of the reflexive consciousness brought an enormous discontinuity in the intelligence of living beings. It is one of the main reasons for the existing gap between the intelligence of the smartest animals and that of man. Other kinds of discontinuities may exist, which can also lead to an extraordinary increase of the performances. We cannot predict when they are going to arrive, any more than a dog can understand our reflexive consciousness.
Self-consciousness is precisely a domain where we can predict a discontinuity in the performances of AI systems, without any idea of when it is going to occur. Indeed, for us, it is a wonderful tool, but it is very limited: the largest part of what takes place in our brain is done unconsciously. Moreover, we have difficulty observing what is conscious because we do not manage to store it. Yet, we can give our AI systems many possibilities in this domain: CAIA can study all of its knowledge, it can observe any step of its reasoning that it wants to, and it can store any event. Naturally, it is impossible to observe everything constantly, but it is possible to choose anything among what happens. The difficulty is that I do not know how CAIA could use these capacities efficiently: I have no model because humans cannot do this. Therefore, I am only using them for debugging. Super-consciousness is an example of what could someday be given to future AI systems; for the present time, the instructions for use are still missing. This is one of the improvements that could lead to AI systems with behavior as incomprehensible for us as ours is incomprehensible for dogs.

The limits of intelligence argument
The intelligence of living and artificial beings has limits. This has been well known since the limitation theorems, such as Gödel's incompleteness theorem: some sentences are true, and yet there does not exist a proof showing that they are theorems. It is possible that this is the case for a sentence as simple as Goldbach's conjecture. However, this does not mean that it is impossible to go considerably further than what we achieve now.

The computational complexity argument
For some problems, even very much faster computers would never be able to solve them with the combinatorial method: there are too many branches. This is true, but it is possible that these problems could be solved by a non-combinatorial method. Let us consider the magic squares NxN, with N odd. When N is very large, we cannot use the combinatorial method: there are 2N+2 constraints, each of them has N+1 unknowns, which can take any value among N² possible values. If N=100,001, there are 200,004 constraints, each of them with 100,002 unknowns with 10,000,200,001 possible values. This is a very hard problem, even if we are using heuristics for reducing the size of the tree.
Nevertheless, by 1700, a Belgian canon discovered a non-combinatorial method that directly generated the values for all the unknowns. I wrote a small C program (only 26 lines) that generated a solution in 333 seconds. Therefore, is it impossible that, for many problems apparently insoluble with the combinatorial approach, a super-intelligent system would discover a method for finding solutions without any combinatorial search? Complexity is related to an algorithm, but one may solve this problem without using a combinatorial algorithm.
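To make the idea of a non-combinatorial construction concrete, here is a minimal Python sketch. It is not the 26-line C program mentioned above, and the post does not say which method that program uses: as an assumption, the sketch swaps in the well-known Siamese (de la Loubère) rule for odd N. The names `siamese_magic_square` and `is_magic` are illustrative only.

```python
def siamese_magic_square(n):
    """Build an n x n magic square (n odd) with the Siamese rule.

    Assumption: this classical rule stands in for the unnamed method of the text.
    Each cell is filled exactly once; there is no search and no backtracking.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    square = [[0] * n for _ in range(n)]
    row, col = 0, n // 2                      # start in the middle of the top row
    for value in range(1, n * n + 1):
        square[row][col] = value
        up, right = (row - 1) % n, (col + 1) % n   # move up and right, wrapping around
        if square[up][right]:                 # target already filled: go down one cell
            row = (row + 1) % n
        else:
            row, col = up, right
    return square


def is_magic(square):
    """Check the 2N+2 sum constraints and that the values 1..N*N are each used once."""
    n = len(square)
    target = n * (n * n + 1) // 2
    sums = [sum(r) for r in square]
    sums += [sum(square[i][j] for i in range(n)) for j in range(n)]
    sums.append(sum(square[i][i] for i in range(n)))
    sums.append(sum(square[i][n - 1 - i] for i in range(n)))
    values = sorted(v for r in square for v in r)
    return all(s == target for s in sums) and values == list(range(1, n * n + 1))


print(siamese_magic_square(3))          # [[8, 1, 6], [3, 5, 7], [4, 9, 2]]
print(is_magic(siamese_magic_square(101)))
```

The cost of the construction grows with the number of cells, N², instead of with the size of a search tree, which is the point of the argument.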

# The Singularity Part I

In the Fall issue of AI Magazine, Toby Walsh has written an excellent paper on the singularity, that is, the time when an AI system could improve its intelligence without our help. I have been trying for more than 30 years to bootstrap AI, that is, to realize such a system, being helped by the limited intelligence of the system itself, even when it has not yet achieved its final goal. Therefore, I am very much interested in this paper. I disagree on a few points; as progress comes from discussion, I will give my personal view of the arguments presented in Walsh’s paper. I agree with its conclusion: the singularity might not be close, and personally I am not even sure that we will some day witness it. However, I believe this for reasons which are not always those of the author.

I will start with two points: how can we reach the singularity, and can we measure intelligence with a number? Then I will consider the six arguments presented in Walsh’s paper.

Toby Walsh does not indicate how this singularity could be reached. I have the feeling that he thinks, like many other AI researchers, that it is enough to bring together many clever and competent AI researchers for many years: perhaps they would be able to achieve their goal. With this method, outstanding programs, such as those for Go and Jeopardy!, were realized. I do not think that we could reach the singularity by this method, even if we gather many researchers, very intelligent on our rating scale: I am afraid that their intelligence might not be high enough. In the same way, billions of rats would never be able to play chess. To achieve singularity, we need help, and the only clever systems outside of us are AI systems themselves. The more they progress, the more helpful they will be. Bootstrapping is an essential method in technological development: we could not build the present computers if computers did not exist. Bootstrapping is an excellent method for solving very difficult problems; in return, it takes a very long time.

Implicitly, it seems that those who believe in the singularity think that intelligence can be measured by a number; some day, there will be an exponential growth of its value for AI systems. Could such a measure exist? Even for humans, with very similar intelligence, the IQ is not satisfactory. When the intellectual capacities are very different, such a measure makes no sense: it is difficult to compare our intelligence with that of a clever animal, such as a cat. We have possibilities which do not exist in cats, such as the reflexive consciousness. It is extraordinarily useful for us, although we can observe only a small part of what occurs in our brain when we are thinking. Therefore, we cannot compare the intelligence of two beings when one has capacities that the other has not. When there is a discontinuity, the intelligences of those before and after this discontinuity are completely different: new mechanisms appear. If the more intelligent being is an AI system, we cannot just consider that it is only super-intelligent. It is something fundamentally new: its intelligence could not be measured on the same scale as ours. We cannot speak of an exponential growth, but of something so different that we cannot use the same words.

# II. Symmetries help to solve the problem

Sometimes, finding symmetries enables CAIA to solve problems that it could not solve otherwise. In particular, this happens when a proof by cases is necessary. We will try to illustrate that with a family of problems, called TRIPLETS. For these problems, one must find three positive integers A, B, and C, such that the remainder of the division of the product of two of these numbers by the third one is always the integer N. This problem is formulated for CAIA in the following way:

LET CTE N

LET SET E=[1 to PLINF]

(PLINF stands for plus infinity)

FIND VARIABLE A IN E

followed by 5 similar orders for variables B, C, P, Q, R. Finally, we have the six following constraints:

[1] WITH N<A

[2] WITH N<B

[3] WITH N<C

[4] WITH A*B=P*C+N

[5] WITH A*C=Q*B+N

[6] WITH B*C=R*A+N

Giving the value of N defines one of the problems of the family, N=12 for the present example.
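As a reading aid, constraints [1] to [6] can be restated as a small Python checker. This is only a sketch and not CAIA's own representation; the function name `triplet_quotients` is hypothetical. It returns the quotients P, Q, R when a candidate triple satisfies the six constraints.

```python
def triplet_quotients(a, b, c, n):
    """Return (P, Q, R) if (a, b, c) satisfies constraints [1]-[6] for N = n, else None."""
    # Constraints [1]-[3]: N is smaller than each of the three numbers.
    if not (n < a and n < b and n < c):
        return None
    # Constraints [4]-[6]: each product leaves remainder N when divided by the third number.
    p, rem_ab = divmod(a * b, c)   # A*B = P*C + N
    q, rem_ac = divmod(a * c, b)   # A*C = Q*B + N
    r, rem_bc = divmod(b * c, a)   # B*C = R*A + N
    if rem_ab == n and rem_ac == n and rem_bc == n:
        return p, q, r
    return None


# The two solutions quoted later in this post, for N = 12:
print(triplet_quotients(24, 18, 14, 12))          # (30, 18, 10)
print(triplet_quotients(293892, 1884, 156, 12))   # (3549311, 24335, 1)
```

The checker only verifies a candidate; finding candidates over the unbounded set E is the hard part that the rest of the post addresses.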

CAIA looks for the symmetries in this formulation, and it finds 5 of them: they map A, B, C to the five other permutations of these three variables: A, C, B and B, A, C and B, C, A and C, A, B and C, B, A. To avoid generating symmetrical solutions, CAIA adds the three following constraints:

[7] C<=A, [8] C<=B, and [9] B<=A.

In that way, it defines one case of a proof by cases: the values of the other cases are automatically defined by the five symmetries. Contrary to what happens in usual proofs by cases, it is sufficient to solve one case.
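The effect of constraints [7] to [9] can be observed with a brute-force sketch. This is not how CAIA works: the upper bound of 40 is a hypothetical restriction added here so that an exhaustive search terminates, whereas the real problem has an unbounded domain. The names `is_solution` and `search` are illustrative.

```python
from itertools import product


def is_solution(a, b, c, n):
    """Constraints [1]-[6]: all three numbers exceed N and all three remainders equal N."""
    return (min(a, b, c) > n
            and (a * b) % c == n
            and (a * c) % b == n
            and (b * c) % a == n)


def search(n, bound, break_symmetry):
    """Enumerate solutions with n < A, B, C <= bound (bound is an assumption of this sketch)."""
    found = []
    for a, b, c in product(range(n + 1, bound + 1), repeat=3):
        # Constraints [7]-[9]: keep only the representative with C <= B <= A.
        if break_symmetry and not (c <= b <= a):
            continue
        if is_solution(a, b, c, n):
            found.append((a, b, c))
    return found


ordered = search(12, 40, break_symmetry=True)
everything = search(12, 40, break_symmetry=False)
print(len(everything) == 6 * len(ordered))   # True: each ordered solution stands for 6 permutations
```

The factor is exactly 6 because two of the three numbers can never be equal: if A=B, then A*C is a multiple of B and the remainder would be 0 instead of N.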

Then CAIA solves a particular problem, here N=12. I will only give the critical steps of the proof. Eliminating B from constraints 5 and 6, one has:

[10] A*C²=A*R*Q+12*Q+12*C

therefore [11] A divides 12*Q+12*C.

From constraints 5 and 8, one has Q*B<A*C<=A*B, therefore [12] Q<A. From 12 and 7, one finds 12*Q+12*C<24*A. As, from 11, A divides 12*Q+12*C less than 24*A, there are only 23 possibilities:

A=12*Q+12*C or 2*A=12*Q+12*C ... or 22*A=12*Q+12*C or 23*A=12*Q+12*C

After that, CAIA develops 23 branches of the tree, each with one of the preceding constraints. For some of them, there is no solution; for others, there are one or more solutions. This step could be made only with constraints 8 and 7, added to avoid generating symmetrical solutions. Without them, CAIA finds constraint 11, but cannot use it; finally, it stops without finding any solution.
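The bounding argument can be checked numerically on known solutions. The sketch below, for N=12 only, computes for an ordered solution the multiplier k such that k*A=12*Q+12*C, and confirms that it lies between 1 and 23. The function name `branch_index` is hypothetical, and the code only verifies the argument; it does not reproduce CAIA's reasoning.

```python
N = 12  # the value used in this example


def branch_index(a, b, c):
    """For a solution with C <= B <= A, return k such that k*A = 12*Q + 12*C."""
    if not (N < c <= b <= a):
        raise ValueError("expected an ordered triple with N < C <= B <= A")
    q, rem = divmod(a * c, b)            # constraint [5]: A*C = Q*B + N
    if rem != N or (a * b) % c != N or (b * c) % a != N:
        raise ValueError("not a solution of the TRIPLETS problem")
    k, rest = divmod(N * q + N * c, a)   # constraint [11]: A divides 12*Q + 12*C
    if rest != 0 or not 1 <= k <= 23:    # constraint [12] and C <= A give k < 24
        raise ValueError("bounding argument violated")
    return k


print(branch_index(24, 18, 14))          # 16
print(branch_index(293892, 1884, 156))   # 1
```

So the small solution quoted later in this post sits in the branch 16*A=12*Q+12*C, and the large one in the branch A=12*Q+12*C.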

Let us consider what happens in one of these branches, for instance when one adds constraint [13] 7*A=12*Q+12*C.

Removing A from constraints 13 and 4, one gets:

[14] 12*Q*B+12*C*B=7*C*P+84

As constraint 5 can be written: Q*B=A*C-12, one has:

[15] 12*A*C+12*C*B=7*C*P+228

Therefore C divides 228. As C>12, it has one of the following values:

19, 38, 57, 76, 114, 228

That makes six new branches for the tree, and CAIA easily finds solutions or contradictions for each of them.
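The list of candidate values for C is easy to reproduce. The following lines are a simple check, with a hypothetical helper name, that the divisors of 228 greater than 12 are exactly the six values above.

```python
def divisors_above(value, lower_bound):
    """Return the divisors of value that are strictly greater than lower_bound."""
    return [d for d in range(lower_bound + 1, value + 1) if value % d == 0]


print(divisors_above(228, 12))   # [19, 38, 57, 76, 114, 228]
```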

All in all, the tree has 218 leaves. 132 of them are solutions, for instance: A=24, B=18, and C=14, or A=293,892, B=1884, and C=156.

We can check that A*B=293,892*1884=553,692,528=P*C+12=3,549,311*156+12, and the same verifications hold for B*C and C*A.

With the symmetries, there are 132*6=792 solutions. One need not consider the other cases: they are already solved, since they correspond to symmetrical solutions. Symmetries are very useful when they define a proof by cases: it is sufficient to solve one of them.

The constraints created to avoid generating symmetrical solutions have been used twice for proving the key constraint: 12*Q+12*C<24*A. If CAIA does not look for the symmetries, it is not able to find even one of the 792 solutions.