# Bootstrapping CAIA Part III

For the realization of a General Problem Solving system, a number of meta-problems arise. The main idea of bootstrapping is that these meta-problems will be solved by the system itself, in the same way as it solves the problems for which it was designed. CAIA is quite an elaborate system; however, it is not yet able to solve every type of problem, and particularly many kinds of meta-problems. I gradually increase its capacities so that it can solve a wider range of problems. For instance, a useful feature for solving more mathematical problems has been the introduction of unknowns with an infinite set of possible values.

It turns out that two families of meta-problems can be solved by CAIA, using its current formalism:

To find the symmetries from the study of the formulation of a problem. We have already considered several times why this capacity is interesting.

To add new elements to a particular family of problems. A researcher has to test his/its system with problems at various levels of difficulty. Unfortunately, there are often too few of them, or they are not difficult enough: as they have an entertaining goal, very complex problems, necessary for checking the system, may be absent. Therefore, I have associated to several families of problems, the definition of a meta-problem that creates new problems in this family.

Unfortunately, many meta-problems have a definition completely different from the problems currently solved by CAIA, and also by most of the general systems. Some of the meta-problems encountered when one wants to solve a problem are:

How to cope with a new result: keep it or eliminate it?

Is it better to backtrack immediately, or to wait for a little, hoping to find a useful result?

If one decides to backtrack, which set of choices will be considered?

How to select the next rule and the elements to which it applies?

What is the probability that a particular step will succeed?

What would be the interest of the result of a derivation if it succeeds?

In order to succeed a bootstrap, these meta-problems must also be solved by CAIA. For the present time, it solves them by using methods that I have found, and it can neither create, nor modify them. Therefore, the next step is to give it the capacity to solve meta-problems such those I have just mentioned.

Fortunately, one can use methods similar to those already successful for more usual problems. For instance, one can consider two levels of backtrack. At the lowest level, one successively examines the situation after adding one constraint from the set that contains the various possibilities. At the highest level, one successively examines the sets of constraints that are known at this step of the resolution: a meta-backtrackis a backtrack on the backtracks that can be considered.

Looking again at the Triplets problem with N=12, one quickly finds two possibly interesting backtracks. They are defined by the set of constraints that one has to consider successively:

On the one hand: 12Q+12C=24A or 23A or 22A….or A, 24 choices in all.

On the other hand: (B even and C even and R even, and P even) or (B even and C odd and R odd and P even) or…(B odd and C odd and R odd and P odd), 11 choices in all (5 of the 16 possible choices have been eliminated).

Choosing the best way to backtrack is a meta-problem, which can be solved by opting for the first one (a linear equality is better than a parity constraint) or for the second one ( only 11 branches are better than 24, one unknown is also better than three; moreover, at each step, one adds four new constraints rather than only one). However, it is also possible to meta-backtrack: one considers both, then one focuses its efforts on the one whose results are more promising (in this problem, the equality constraints). This allows to solve the problem successfully, even when the evaluation of these backtracks is unsatisfactory.

Many meta-problems may be solved by a set of judgements, whose synthesis will give a value allowing to rate and rank the candidates. This is often sufficient when some uncertainty is allowed because there are not too many possibilities to consider, and computers are very fast. When the decision is very important such as choosing a particular backtrack, these judgements are sometimes too approximate. Either one tries to improve the quality of these judgements, or one meta-backtracks.

Naturally, there remains another meta-problem: to find the judgements that one has to use for solving each kind of meta-problem. For the preceding example, the basic elements for these judgements were general: it is better that there are not too many branches in a backtrack, it is better to add each time several constraints rather than only one, an equality constraint is more selective than a parity constraint, adding a constraint with one unknown is better than adding one with several unknowns. They allow a pre-selection which will be improved by the meta-backtrack. In this particular case, one is not very far from the end of the ascent of the meta-levels necessary for the successful completion of the bootstrap. Many other kinds of meta-problems have to be solved, but bootstrapping AI is perhaps not as hard as it sounded.

# II. Symmetries help to solve the problem

Sometimes, finding symmetries enable CAIA to solve problems that it could not solve otherwise. In particular, this happens when a proof by cases is necessary. We will try to illustrate that with a family of problems, called TRIPLETS. For these problems, one must find three positive integers A, B, and C, such that the remainder of the division of the product from two of these numbers by the third one is always the integer N. This problem is formulated for CAIA in the following way:

LET CTE N

LET SET E=[1 to PLINF]

(PLINF stands for plus infinite)

FIND VARIABLE A IN E

followed by 5 similar orders for variables B, C, P, Q, R. Finally, we have the six following constraints:

[1] WITH N<A

[2] WITH N<B

[3] WITH N<C

[4] WITH A*B=P*C+N

[5] WITH A*C=Q*B+N

[6] WITH B*C=R*A+N

Giving the value of N defines one of the problems of the family, N=12 for the present example.

CAIA looks for the symmetries in this formulation, and it finds 5 of them: A, B, C corresponds to the five other permutations of these three variables: A, C, B and B, A, C and B, C, A and C, A, B and C, B, A. To avoid to generate symmetrical solutions, CAIA adds the three following constraints:

[7] C<=A, [8] C<=B, and [9] B<=A.

In that way, it defines one case of a proof by case: the values of the other cases are automatically defined by the five symmetries. Contrarily to what happens in usual proofs by case, it is sufficient to solve one case.

Then CAIA solves a particular problem, here N=12. I will only give the critical steps of the proof. Eliminating B from constraints 5 and 6, one has:

[10] A*C2=A*R*Q+12*Q+12*C

therefore [11] A divides 12*Q+12*C.

From constraints 5 and 8, one has Q*B<A*C<=A*B, therefore [12] Q<A. From 12 and 7, one finds 12*Q+12*C<24*A. As, from 11, A divides 12*Q+12*C less than 24*A, there are only 23 possibilities:

A=12*Q+12*C or 2*A=12*Q+12*C…..or 22*A=12*Q+12*C or 23*A=12*Q+12*C

After that, CAIA develops 23 branches of the tree, each with one of the preceding constraints. For some of them, there is no solution, for others one or more solutions. This step could be made only with constraints 8 and 7, added for avoiding to generate symmetrical solutions. Without them, CAIA finds constraint 11, but cannot use it; finally, it stops without finding any solution.

Let us consider what happens in one of these branches, for instance when on add constraint [13] 7*A=12*Q+12*C.

Removing A from constraints13 and 4, one gets:

[14] 12*Q*B+12*C*B=7*C*P+84

As constraint 5 can be written: Q*B=A*C-12, one has:

[15] 12*A*C+12*C*B=7*C*P+228

Therefore C divides 228. As C>12, it has one of the following values:

19, 38, 57, 76,114, 228

That makes six new branches for the tree, and CAIA easily finds solutions or contradictions for each of them.

All in all, the tree has 218 leaves. 132 of them are solutions, for instance: A=24, B=18, and C=14, or A=293,892, B=1884, and C=156.

We can check that A*B=293,892*1884=553,692,528=P*C+12=3,549,311*156+12, and the same verifications for B*C and C*A.

With the symmetries, there are 132*6= 792 solutions. One must not consider the other cases, they are already solved since they correspond to symmetrical solutions. Symmetries are very useful when they define a proof by cases: it is sufficient to solve one of them.

The constraints created for avoiding to generate symmetrical solutions have been used twice for proving the key constraint: 12*Q+12*C<24*A. If CAIA does not look for the symmetries, it is not even able to find one of the 792 solutions.

# Symmetries Part I

Finding symmetries is often essential for solving a problem. However, it is necessary to specify what is meant by “symmetry”, to indicate how one can find them, and finally to show what they are used for.

There are several kinds of symmetries. I am only interested here in symmetries that are associated to a permutation of the unknowns of a problem: when each unknown is replaced by the corresponding unknown in the permutation, both sets of constraints are identical. For every solution, one has immediately a symmetrical solution when one gives to each unknown the value of its corresponding unknown in the permutation. Therefore, for each solution, one also has as many new solutions as there are symmetries. In CAIA surprised me Part II, we have seen that CAIA had found 47 symmetries for a magic cube problem: every new solution generates 47 symmetrical solutions. Now, we will see how finding symmetries is useful for solving problems.

The first reason is that it considerably decreases the number of solutions that one has to find: for the magical cube, the number of solutions we are seeking is divided by 48. Moreover, symmetrical solutions are usually no longer interesting when one has found one of them, the user considers them equivalent; one only indicates their number. However, when there is no evident geometrical interpretation, it is sometimes difficult to see that two solutions are symmetrical: giving them is not always a complete waste of time. Naturally, CAIA adds constraints to the problem formulation so that it does not generate its symmetrical solutions. This is useful since the more a problem is constrained, the easier it is easy to solve.

The second reason is that it makes the search for new constraints easier: when one has found a constraint, all the constraints generated by the symmetries are also true. For example, with the magic cube problem, the main step in its successful resolution was the discovery on one constraint with only three unknowns: F(13)+F(14)+F(15)=42. A combinatorial search is more efficient with constraints with three unknowns rather than with constraints with 10 unknowns, which happens for all the constraints on the definition of this problem. Therefore, it is possible to apply the 47 symmetries already found. Two of them immediately give a constraint: F(11)+F(14)+F(15)=42 and F(5)+F(14)+F(23)=42. Unfortunately, one has not 47 new constraints: all the other symmetries give one of these three constraints. Without this, it is not evident that the system could also find the last two constraints: it does not consider all the possible derivations and, in any case, that would require much more time.

The third reason was a pleasant surprise. I did not expect it: CAIA could solve some problems after finding their symmetries, while it did not find any solution when it did not search for them before. The explanation is that it happens that the only (or the easiest) way to prove the goal is using a proof by cases. However, difficult decisions have to be made in order to define the constraints that define subsets of the possible values of the unknowns. Fortunately, symmetries offer these constraints on a plate: they have been added so that CAIA finds only one symmetrical solution. Moreover, usually, for a proof by cases, one must successively consider all the cases; it is sufficient to consider one of the cases! I have no room to explain this here, but I will give an example in the following post: CAIA easily finds all the solutions of a particular problem when it has discovered its symmetries; otherwise, it finds nothing.

Unfortunately, besides these positive points, sometimes there is a drawback: too many symmetries. One can waste a lot of time for finding them. This may happen when a problem has billions of solutions; it may also have millions of symmetries. In that situation, finding symmetries have no interest; the difficulty is to find at least one solution.

Searching for symmetries is a small, but important, part of CAIA. This is a meta-problem, that is a problem helping to solve other problems. Therefore, it is an important step in the bootstrap: CAIA finds symmetries, and this improves its performances. It was not necessary to write new modules for finding symmetries. It was sufficient to define the problem of finding symmetries in the same way as the other problems submitted by the users.

# A mathematician is more than a theorem prover

I always believed that AI systems will become excellent mathematicians. Until now, AI focused its efforts on theorem proving, the most visible part of the mathematical activity. However, there are a lot of things to do other than proving theorems; one of them is to make conjectures, then he/she/it usually tries to prove it. This is not always the case, the main activity of EURISKO, developed by Douglas Lenat, was to make conjectures, which it did not prove. Ramanujan, a mathematician of genius, worked in an analogous manner: for many years after his death, distinguished mathematicians were still trying to prove some of his conjectures. Several conjectures are famous, such as Fermat’s last theorem that has been proved more than three centuries after it was stated. Goldbach suggested another well-known conjecture: any even integer greater than 2 can be written as the sum of two primes. Some conjectures, such as Riemann hypothesis, played a great role in the history of mathematics, theories have been developed in order to prove them.

In the February 2016 issue of Artificial Intelligence, C. Larson and N. Van Cleemput describe a system able to make conjectures in many mathematical domains. This system is not associated to a particular theory. Their work is based on a heuristic found and experimented since 1986 by S. Fajtlowicz. I did not know it, and this is not surprising, little was published about it: before this paper, it had never been referenced in Artificial Intelligence.

This heuristic may be used in many domains but not in all: the theory must include objects that have real number invariants, that are numbers linked to objects. For a graph, the invariants may be the number of vertices, of edges, the maximum degree of any of the vertices, etc. For a natural number, there are the number of factors, the number of distinct primes in its factorization, the number of its representations as sum of two primes, etc.

The system stores some objects of the theory under study; their number may be very low, for instance less than 10. For each object, it can find the value of all its invariants; therefore, it can check whether a new conjecture is true for all the given objects. The user indicates the name of an invariant I, for which he would like to conjecture the value of its lower or higher bound, using the other known invariants I1, I2, I3, …., In . The system knows a set of functions that can be applied to numbers such as addition, product, logarithm, etc. It will generate many conjectures such as C1 for I>I1+I2, C2 for I<=I1*I3, C3 for I<(I2-I4)², C4 for I>=log(I5+3),…The user may suggest other functions, especially those that are known to be useful in the present theory. In the different domains considered in the paper, it uses the following functions: frobenius norm, euler phi, mertens, next prime, and so on. This is the only situation in which the system is adapted to a particular domain.

When it has found a new conjecture, the system accepts it if it passes two tests:

Truth test: the conjecture is true for all the stored objects.

Significance test: for at least for one of the stored objects, the conjecture gives a stronger result than all the other known conjectures for the same object.

To clarify the interest of the second test, let us assume that we are looking for conjectures about the invariant G(n) when n is even: G(n) is the number of representations of n as sum of two primes. At the start, one has stored only two objects: 16 and 24. G(16)=2 (3+13 and 5+11) while G(24)=3 (5+19, 7+17, and 11+13). Conjecture C1 is: G(n) is greater than or equal to the number of digits in the binary representation of n less the number of its divisors. It is true for both objects: G(16)>=2 (5-3) and G(24)>=-1 (5-6). Now, conjecture C2 is: G(n) is greater than or equal to the number of digits of its decimal representation minus 1; this is also true since G(16)>=1 and G(24)>=1. However, C1 is stronger for 16 since it indicates that G(16)>=2 instead of 1 for C2; on the contrary, C2 is stronger for 24 where it predicts at least 1 instead of -1. Therefore, so far, one keeps both conjectures.

Naturally, finding a conjecture is not enough: one must prove it, and the conjecture must also be useful. For proving Goldbach conjecture (CG), that is G(n)>0 when n is even, one must find a new conjecture, which can be proved as a theorem, such that its value for G(n) is always at least 1. Even if C1 is true, it is not sufficient, since it only gives G(24)>-1. Therefore, it is certainly not a useful conjecture for proving CG, although it could give (if true) an interesting lower bound for some values of n. We are more lucky with C2: G is positive for both stored objects, and it has been also checked for many even numbers, but it has not yet been proved that the number of representations of n as sum of two primes is greater than or equal to the number of digits of n minus one. I give here only two very simple conjectures; the system found more of them, and checked them with much more than two stored objects. Perhaps, one day, it will find a conjecture predicting G(n)>0 that could be proved; then CG will also be proved.

The initial step in the research for a useful conjecture can be made with very few objects, then the system checks carefully that there is no trivial exception. For instance, after being discovered with few objects, conjecture C2 has been checked for n up to 1,000,000. After finding a useful and reasonably credible conjecture, a human or artificial mathematician has to prove it. Unfortunately, it may be impossible, even for a very clever artificial mathematician, to prove a conjecture because no proof exists although it is always true.

Anyway, I am impressed by the results from this system, particularly with conjecture C2. Firstly, this gives other ideas for proving CG; although it is much stronger, this could lead to consider new pathways. Moreover, if it is proved, it would be a satisfactory (although not very useful) result. For instance, C2 immediately states that:

G(1234567899876543210)>=18

which means that there exists at least 18 representations of this huge number as sum of two primes. With CG, we would only know that there is at least one representation. If I was a mathematician, I would be very proud if I could prove C2!

We can experiment this system, available at nvcleemp.github.io/conjecturing . Their program is implemented in the Sage open-source mathematical computing environment. Sage has a lot of built-in invariants for various mathematical objects. Therefore, it is easy to check with many instances, whether it is most likely that a promising conjecture is true.

This work does not solve all the problems linked to conjectures, but it shows that it is possible to add a new brick to the realization of what will someday be an artificial mathematician.

# Bootstrapping CAIA Part II: Choosing a new domain

There are many ways to increase the range of problems that CAIA could solve. For instance, one can introduce continuous variables, or define unknowns with an infinite number of possible values. It would also be interesting to consider meta-problems, that can be convenient to CAIA when it is solving a problem, for instance to monitor the search for solutions. Meta-problems are particularly useful in a bootstrap; however, when I made my last choice, I was afraid that it was too early to include such meta-problems: I didn’t have enough experience for monitoring the search for a solution. For this reason, I chose to consider problems where the unknowns may have an infinite number of possible values.

Firstly, this allows to have a large number of problems, mainly in arithmetic. More importantly, this necessitates to find new methods, different from those used for solving constraint satisfaction problems. Indeed, for these problems, the combinatorial method always gives all the solutions; it may be practically unusable, especially when the problem is intractable. However, this is a starting point, which can be improved.

Naturally, when an unknown may have an infinite number of possible values, one can no longer use the combinatorial method. Other methods must be used, some of them similar to theorem proving. This does not prevent to develop a tree, for instance one can consider several possibilities for an unknown: x<0, x=0, and x>0, or also x is even and x is odd. For many problems, looking at all the possible values, even very cleverly, is not always the best method; CAIA is obliged to experiment with other approaches in this new domain.

When the system stops, the result from these problems may be:

The proof that there is no solution: find two integers a and b such that 4a+3b²=34.

The discovery of all the solutions: find three integers a, b, et c, greater than a given integer k, such that the three remainders of the division of a.b by c, of b.c by a, and of c.a by b are all equal to k. For instance, if k=12, one solution is a=293892, b=1884, and c=156; in all, there are 792 solutions.

The discovery of one or several families describing an infinite number of solutions, which include all the possible solutions: find two integers a and b such that 4a+3b²=36 (any integer value for x, a=9-3x², b=2x).

There is a solution for every combination of the possible values of the unknowns: find n such that 2006 divides 2005n-1887n-1954n+1836n ; this constraint is true for any positive integer value of n.

No solution, or a finite, or an infinite number of solutions, have been found, but one has not proven that there is no other solution. This happens when one is stuck for at least one leaf of the tree.

For the combinatorial search in constraint satisfaction problems, sometimes one has not all the results because the resolution may require too much time, especially when the problem is intractable. Here, there is a new possibility: one does not see what could be done, all the known methods have been unsuccessfully used. In the tree, we have not only as value for the leaves: solution, or contradiction, or not enough time, but also: one does not know what to do.

When a system results from a bootstrap, too often it does not find some solutions, although it believes that it has completed its task. When CAIA finds a solution, it is always correct; however, as the search is not as systematic as when the set of possible values are finite, sometimes it erroneously concludes that there is a contradiction, or it wrongly eliminates a possible value for an unknown. When a system builds itself its method for finding solutions, a misguided modification in the knowledge that creates these methods can improve its performances in some situations, but may introduce particularly vicious mistakes in parts that led before to excellent results. I have experimented with systems where I gave knowledge, and with others where I also gave meta-knowledge creating knowledge. Undeniably, developing the second approach is tremendously more difficult. One reason is that, if AI systems are brittle, systems that result from a bootstrap are much more brittle. In another post, we have seen the problems coming from meta-bugs.

Naturally, to proceed further, it is necessary to introduce new kinds of problems, until it can solve any problem. As of now, I do not want to consider other families of basic problems, such as games or various mathematical theories. I prefer to add families of meta-problems that will be useful at the meta level, where one has to build new methods for solving problems, or to monitor efficiently the search for solutions. It is much more necessary for quickly moving forward in this bootstrap, and particularly to have a less brittle system: experience has shown that, when CAIA does something for me, it does it better than myself.

# Bootstrapping CAIA Part I: The initial domain

I can’t just say “one must bootstrap AI”, how to do it must also be explained. Naturally, I will use my own experience in CAIA’s development, started in 1985.

I shall quickly go through the first step, where I defined a language and knowledge for translating itself into C programs. This is well known by those who write a compiler in the language of this compiler. It was interesting to define a new language rather than using an existing language for two reasons:

1. This language must change over time, so that it becomes more and more declarative. At the beginning, in order to facilitate the compilation, it has many procedural aspects. They are gradually replaced by more declarative possibilities. Declarativity is essential because it is easier for the system to create declarative rather than procedural knowledge; it is also easier to study its own knowledge when it is given in a declarative formalism. Very important elements in this language are sets and bags, which do not imply an order to be followed, contrarily to the lists. Expertises are sets of rules, and rules have sets of clauses.

2. It is important that CAIA and myself could thoroughly examine any module of CAIA when it is executed. An unrestricted access to the present state of CAIA and its knowledge by CAIA itself is essential if we want to give it a kind of consciousness. It is easier to have something that suits us when the system is specially built for this purpose. Black boxes are great foes of intelligence: they restrict consciousness since one does not know what happens when one executes them. There are still two main black boxes for CAIA: the operating system and the C compiler.

Using its knowledge, CAIA translates all its knowledge, either given in the initial formalism or in more declarative formalisms that I later introduced. Since thirty years, CAIA does not contain a single line of C that I have written. All in all, there are 500,000 lines of C, and 13,500 rules. Many rules have not be created by myself, but from rules that create rules.

AI essential goal is to create a general problem solving system; all the human activities are, in fact, problems that we have to solve. The most important problem for AI researchers is to realize a system that could solve every problem, including this last one. It is foolish to begin with the most difficult problem, it is better to consider simpler problems, then to extend this domain gradually. This is one of the main directions of a bootstrap, the other one being to improve the performances.

It is important to choose the initial domain well. CAIA started to solve problems defined by a set of constraints. Firstly, this domain is interesting because many problems may be defined in that way; but it is even more interesting because some of these problems may be used by the solver itself. Since a long time, I have added two such problems:

1. To find symmetries from the study of the formulation of a problem can be stated as a constraint satisfaction problem. This is useful because it reduces the size of the search space, it facilitates the search for new constraints, and it enables to find a decomposition of the search space so that proofs will be easier in each area. I will explain this last point in a future blog, I completely underestimated it: it was a pleasant surprise.

2. For experimenting with a system, it needs many problems. Often, there are not enough problems in the literature, or I have to give them myself; moreover, there is a lack of very, very difficult problems, which are far too hard for human beings. However, finding new problems is a problem that can be defined with constraints; therefore, many CAIA’s problems have been found by CAIA itself.

In the next blog, we will see how a first step has been made to extend this initial domain. When CAIA will be able to solve any problem, one direction of the bootstrap will be completed; however, it will also be necessary to solve these problems more and more efficiently.

# Should we disclose everything? Part I

When a human or an artificial being has solved a problem, must he indicate all that he has done? Obviously, one must not describe everything, too much information would be worthless. However, is it sufficient to indicate only what is enough to check that a solution is correct? For my part, I don’t think so; we will see some aspects of this problem in the following posts. Here, we will only show that it is possible and useful to explain why one has chosen the steps taken when solving a problem, even those that are not necessary to justify the solution.

Usually, people tell nothing about this kind of knowledge, often because the solver does not know it: these mechanisms are largely unconscious. A chess player knows what moves he has considered, but he rarely knows why he has considered them. A pity, because de Groot’s work has shown that the strength of the strongest chess players derives from their ability to consider the best moves. In an experiment, world champion Alekhine had to think aloud while he was finding the best move in a complex position; he considered only four moves, among them the two best ones. Unfortunately, he did not say why he thought to consider them, probably because he did not know it: looking at a chess board automatically gave him the moves to consider. On the contrary, a good player had considered nine moves for the same position, but the best one was not among them.

Likewise, mathematics teachers rarely indicate why the steps of a proof have been chosen, and also why ways that seem promising do not lead toward the solution. This is unfortunate, this cause some misgivings about the method used by mathematicians: it is normal to wander, the best mathematicians have not directly found the results that have made them famous. Some students do not even try to find the solution of a problem because they believe that they are too incompetent: indeed, it is impossible to find only the good steps, in the way the teacher presents the proof of a theorem. What is important is to try, to fail, and to try again.

I was most impressed by the description, given by Martin Gardner, of the stages that have resulted in the resolution of the squaring the square problem by four maths students. The goal was to find a square covered with squares, the sizes of the small squares being all different. When we are looking for a solution, it is easy to check whether it is correct; this does not indicate how this square has been found. In fact, the students began to find a rectangle covered by small squares, all different. They enjoyed it a lot, and they built a rectangular box containing all these wooden squares. One of them, who lived at his parents, let it on his desk, and his mother was not entitled to clean this desk. Naturally, she could not resist cleaning it and, in doing so, the puzzle fell to the ground. Her disobedience being obvious, she tried to put the squares back into the box, and she succeeded, hoping to escape her son’s reproaches. However, her son discovered that his mother has found a solution different from the preceding one. Immediately, he called his friends, and they tried to understand why this puzzle had several solutions. That gave them the idea of an analogy with Kirchoff’s laws, the size of a square corresponding to the value of a resistance in an electrical circuit. They realized that the circuit must have some characteristics, and that led them to the discovery of the first squared square.

Knowing why some rules have or have not been executed would also be useful for CAIA: it could be used to improve its meta-knowledge used for selecting, prohibiting, or ordering the possible actions. In its explanation of a solution, as CAIA can completely observe any of its actions, it knows the reasons for the elimination of a result, for its choice, for the priority allocated to each action. Unfortunately, if it can know how meta-knowledge was used, it is not so easy to improve it: one must define meta-meta-knowledge, which is always very difficult to create and to use.

When an AI system uses an algorithm that tries everything possible, there is no need to explain the choice of its steps. However, CAIA dynamically determines whether it will keep or eliminate a result, or whether it will use it immediately or save it for difficult times. It could produce an interesting explanation of its actions such as: it did not keep this new constraint because it contained many unknowns, or because it was an inequality and it already had too many of them, or it delayed using this rule because its execution is time-consuming, and so on.

For an explanation to a human being, it is possible to include some indications on the reasons why CAIA has chosen to try an action; the difficulty is to model this human for giving him only what he needs to know. On the other hand, such explanations given by CAIA to itself would allow it to improve its own meta-knowledge. Both changes could be made; unfortunately, as they are difficult to implement, so far they are only in my to-do list.

# 55 years of Artificial Intelligence

It was in October 1960 that I started to work on my thesis. One year ago, I was appointed deputy chief of a computer department, which existed since almost ten years, in a military laboratory. It had an impressive documentation on computers, and it was a very supportive environment: the same year, the head of the department started working on automatic translation from Russian to French. I was thrilled by the first AI realizations, such as Newell and Simon’s Logic Theorist, Samuel‘s machine learning using the game of checkers, Gelertner‘s geometry-theorem proving machine, and so on.

For my thesis, I wanted to implement a general theorem prover that received as data any of the many propositional logic axiomatizations. It had to discover interesting theorems in each theory, without any information on its existing theorems.

The first difficulty to overcome was to find a thesis director: at that time, Artificial Intelligence, and even computer science, were not taught at Paris University. Luckily, Professor Jean Ville was interested in computers, although he essentially worked on probability, statistics, and economics. He was very kind to accept that I registered at the University for this thesis.

Looking at the results of the initial version of my program, I was surprised to see that it had discovered proofs different from those given in logic books. These original proofs showed me that it could be interesting to use meta-theorems, that is new ways for proving theorems. Therefore, I gave my program the possibility to prove meta-theorems, and the modified program found more results, and also proofs that were easier to understand. The results of this program were not bad; for a particular axiomatization, it proved results for which Lukasiewicz said: “One must be very expert in performing such proofs.” Results found for one of these axiomatizations can be found at page 134 of Laurière’s book (page 125 for the French version).

I was greatly impressed by these results: since then, I have always tried to realize systems that have the ability to work at the meta-level. This is a challenging task, since their results are compared with systems where the work at the meta-level has been made by their author, and not by the system itself. For the present time, the performances of a system working at the meta-level are not better than those found by other programs, but human intelligence is less important, they have a larger degree of autonomy. The primacy of men over animals mainly comes from our capacity to work at a meta-level, consciousness is one example of this superiority. I cannot believe that it is possible to create a superintelligence without this ability.

Moreover, this ability allows to bootstrap AI: existing AI systems will help us to implement more powerful systems. I believe that AI is the most difficult science, perhaps far too complex for human intelligence. The ideal would be to have an Artificial Artificial Intelligence Researcher; this is CAIA’s long-term goal.

Since 30 years, I am developing CAIA. At the moment, it has 13,000 rules that transform themselves into 500,000 lines of C; I have not written a single line of the present programs, and many rules have also be written by CAIA. I continue to replace the expertises created by myself, by meta-expertises that create the preceding expertises. The bootstrap will be completed when CAIA includes a set of meta-expertises that could generate all its expertises and meta-expertises, including themselves.

I am trying to create more and more declarative, and more and more general knowledge: I prefer to give CAIA meta-knowledge for creating expertise E rather than writing myself expertise E. It is difficult but, when I succeed, CAIA’s version is usually better than my initial version.

There is still a very long way before this bootstrap is completed. I have not 55 more years for completing it, but I hope that other researchers will pursue this tremendously difficult task.

# When a Belgium canon meets a French mathematician

Let us consider again magic squares, I want to show that it is possible to find solutions for very large squares, using methods completely different from the improved combinatorial method used by CAIA in the preceding post.

In 1703, canon Poignard, from Brussels, found an algorithm that generates solutions for magic squares when N is odd. This method was improved by mathematician Philippe de la Hire, who issued in 1705 a communication to the Académie Royale des Sciences. This method is well known; it was already described in Diderot’s Encyclopedie, entry ‘quarrés magiques’.

For describing this method, I will consider N=5. Let us choose a permutation of the numbers from 1 to N, for instance: 2,3,5,1,4. We write it on the first line of the square. For the second line, we put this permutation shifted by P elements; P is a number less than N such that P and P-1 must be relatively prime numbers with N. This implies that the method cannot be used for even squares: as either P or P-1 is even, N cannot be even. If one chooses P=3, the second line will be 5,1,4,2,3. The process continues, and we get the square just down left.

In the same way, we consider a permutation of the integers from 0 to N-1, multiplied by N, for instance: 5,20,0,15,10 which is written on the first line of another square. This new square is completed as the preceding one, with a shift Q, which must also be different from P. With Q=4, we generate the square on the right above.

Both squares have exactly once each number on each line, each column, and each diagonal; moreover, the sums on these lines are the same, since we have the same permutation. Naturally, there is no bijection, since there are only N different numbers for the N2 little squares of each square.

We create the magic square by adding these two squares. It has always all the preceding characteristics, and each value between 1 and N2 is present exactly once.

An interesting remark: if opposite sides of the square are considered as next to one another, all the diagonals have also the common value for their sum. We do not generate only magic squares, but a kind of ‘hyper-magic’ square that has 4N constraints, including N diagonals to the right and N diagonals to the left. The new problem is much more constrained than the original one. This explains why there is no hyper-magic square for N=3, where one cannot find a couple P and Q.

This way of generating magic squares is extraordinary: we do not need to write the constraints, the way the solution is created automatically satisfies them. It is no more necessary to find the value of SUM. It is also unnecessary to backtrack.

Without using a computer, I found a magic square for N=11 in 16 minutes. Furthermore, I wrote a 25 lines C program that generates such squares. In only 333 seconds, it generated (but not stored) a solution for N=100,001. It had to find a value for 10,000,200,001 little squares! It is unthinkable to solve such a problem with the methods used by CAIA, and many other constraint satisfaction systems: there are 200,002 constraints, each one has 100,002 unknowns, and the number of possible values for each unknown is greater than ten billion. Studying the formulation of this problem, canon Poignard transformed a constraint satisfaction problem into an algorithm building a large permutation from simpler permutations.

CAIA’s problem solver is very far from such performances. This does not mean that it is an impossible task for an AI system, but that our systems must have a greater capacity for working at the meta level, studying and transforming the formulation of the problem: they must be able to do what canon Poignard did more than 300 years ago.

# CAIA is not a good magician

Too often, IA researchers describe the successes of their system, but they keep a lower profile when they get poor outcomes. I do not want to comply to this bad habit. I describe here CAIA’s behavior when success does not come, in the present case, the magic square problem. An order N Magic Square has N horizontal lines, and N vertical lines. Therefore, there are N2 squares, and each one contains a number between 1 and N2, so that each number appears only once. Moreover, the sum of the verticals, the horizontals, and the two diagonals, must have the same unknown value, SUM.

First, CAIA finds symmetries, which are permutations of the unknowns which lead to an identical set of constraints. Seven symmetries are evident for all magic squares: three rotations and four reflections with respect to horizontal, vertical, and both diagonals. CAIA knows nothing about geometry, but it can find all symmetries, both geometric and non-geometric. To do that, it uses the general component that solves constraint satisfaction problems. I had only to specify one particular problem: finding all the permutations of the unknowns leading to the initial set of constraints.

I hoped that it would find the seven geometrical symmetries. However, CAIA was far beyond any of my expectations: it found a huge number of symmetries that I never imagined existed. Even for N=4, it finds 31 symmetries! Some are completely unexpected, for instance, one of them transforms the first horizontal into the second vertical: both contain the same values, naturally not in the same order.

Finding this is not evident: almost 400 years ago, Bernard Frénicle de Bessy performed an outstanding study of the N=4 magic square. In particular, he defined a standard representation of the solutions, which is always used. However, he has found 880 basic solutions; multiplied by 8, this gives a total of 7040 solutions. CAIA found only 220 basic solutions; multiplied by 32, we find again 7040. When N grows, the number of symmetries increases at a startling rate: for N=10, CAIA finds 15,359 symmetries! On this point, CAIA’s behavior is flawless, although a lot of computer time is needed: 31,193 seconds for all the symmetries when N=10.

In the following step, CAIA tries to find other constraints. For the magic problem, it adds that the sum of the unknowns is equal to the sum of the first N2 integers, since there is a bijection. After that, it can find the value of SUM. For us, humans, this is easy because we are seeing the square: it is clear that the sum of the N horizontal lines is N*SUM. It is also evident that this sum contains all the squares, so its value is the sum of all the numbers between 1 et N2, that is N2*(N2+1)/2. Therefore, we have SUM=N*(N2+1)/2.

I did not give CAIA this formula, it has to find the value of SUM from the set of constraints. For this task, it is rather good: for N=16, it finds SUM=2056, after many substitutions in the 34 initial constraints, plus the bijection constraint. It would have been better to prove the general formula, but it is difficult for a blind system, which cannot see the square as we do. It does not find other constraints, and I do not believe that constraints with less than N unknowns exist for this problem. In total, for this task, CAIA has done well overall, although it could not set out the general formula.

The last part is to find the solutions, and this is the point where CAIA fails: it has been designed for finding all the solutions of a problem. For this reason, it is very interesting to find symmetries. However, it makes no sense to find all the solutions when there are billions of them. In its normal working, it considers the formulation of the problem, and particularly it finds new constraints, easier to use than the initial ones. In some cases, (as for N=3), it directly finds the solution. When there is no new constraint, it is often better to stop trying to be clever, but to write cleverly a combinatorial program that generates all the solutions.

For a magic square, this works well for the small values of N: for N=8, it generates the first 1,000 solutions in 4 seconds. Unfortunately, the search space dramatically increases with N: the number of unknowns grows as N2, and the number of possible values for each unknown also grows as N2. Very soon CAIA is completely overcome. The method still works for N=9: 10 seconds for the first 1,000 solutions. Unfortunately, there is a sharp discontinuity for N=10: after 14 hours, it has not yet found a solution.

The reason for this failure is that it is unthinkable that, given the huge number of solutions, one can ever find all of them when N is not small. It is sufficient to find one of them. We could think that all we have to do is to launch the search for solutions, and stop when one is found. In reality, it is often better to use other methods for scanning the search space, pinpointing the areas where the probability of finding a solution is high. For instance, one can find solutions for the set of equations when the unknowns may have continuous values. Then, one can look for the combinations of integer values close to the values of these solutions.

Another heuristic may be used, such as to make the initial problem more constrained, adding new constraints so that it would be easier to find solutions for this new problem. If it has solutions, they would also be solutions of the initial problem. When there are zillions of solutions, we can add constraints that remove some of them. In a future post, we will see how this could be very efficient for finding solutions for some magic square problems.

To sum up, although a problem may have many solutions, CAIA experiences serious difficulties when all its constrains has many unknowns, each unknown having many possible values; this is normal, the search space is huge. In this situation, it can, nonetheless, have excellent results when it can generate new constraints with few unknowns; this happened with the magic cube, where it found constraints with only three unknowns. If this is impossible, it must use completely different methods, good for finding a needle in a huge haystack. CAIA improves the combinatorial search with a meta-combinatorial search generating new constraints; this is sometimes not enough for solving this kind of problem, some modules must be added to CAIA.