Category Archives: AI Researcher

Should we disclose everything? Part I

 When a human or an artificial being has solved a problem, must he indicate all that he has done? Obviously, one must not describe everything, too much information would be worthless. However, is it sufficient to indicate only what is enough to check that a solution is correct? For my part, I don’t think so; we will see some aspects of this problem in the following posts. Here, we will only show that it is possible and useful to explain why one has chosen the steps taken when solving a problem, even those that are not necessary to justify the solution.

Usually, people tell nothing about this kind of knowledge, often because the solver does not know it: these mechanisms are largely unconscious. A chess player knows what moves he has considered, but he rarely knows why he has considered them. A pity, because de Groot’s work has shown that the strength of the strongest chess players derives from their ability to consider the best moves. In an experiment, world champion Alekhine had to think aloud while he was finding the best move in a complex position; he considered only four moves, among them the two best ones. Unfortunately, he did not say why he thought to consider them, probably because he did not know it: looking at a chess board automatically gave him the moves to consider. On the contrary, a good player had considered nine moves for the same position, but the best one was not among them.

Likewise, mathematics teachers rarely indicate why the steps of a proof have been chosen, and also why ways that seem promising do not lead toward the solution. This is unfortunate, this cause some misgivings about the method used by mathematicians: it is normal to wander, the best mathematicians have not directly found the results that have made them famous. Some students do not even try to find the solution of a problem because they believe that they are too incompetent: indeed, it is impossible to find only the good steps, in the way the teacher presents the proof of a theorem. What is important is to try, to fail, and to try again.

I was most impressed by the description, given by Martin Gardner, of the stages that have resulted in the resolution of the squaring the square problem by four maths students. The goal was to find a square covered with squares, the sizes of the small squares being all different. When we are looking for a solution, it is easy to check whether it is correct; this does not indicate how this square has been found. In fact, the students began to find a rectangle covered by small squares, all different. They enjoyed it a lot, and they built a rectangular box containing all these wooden squares. One of them, who lived at his parents, let it on his desk, and his mother was not entitled to clean this desk. Naturally, she could not resist cleaning it and, in doing so, the puzzle fell to the ground. Her disobedience being obvious, she tried to put the squares back into the box, and she succeeded, hoping to escape her son’s reproaches. However, her son discovered that his mother has found a solution different from the preceding one. Immediately, he called his friends, and they tried to understand why this puzzle had several solutions. That gave them the idea of an analogy with Kirchoff’s laws, the size of a square corresponding to the value of a resistance in an electrical circuit. They realized that the circuit must have some characteristics, and that led them to the discovery of the first squared square.

Knowing why some rules have or have not been executed would also be useful for CAIA: it could be used to improve its meta-knowledge used for selecting, prohibiting, or ordering the possible actions. In its explanation of a solution, as CAIA can completely observe any of its actions, it knows the reasons for the elimination of a result, for its choice, for the priority allocated to each action. Unfortunately, if it can know how meta-knowledge was used, it is not so easy to improve it: one must define meta-meta-knowledge, which is always very difficult to create and to use.

When an AI system uses an algorithm that tries everything possible, there is no need to explain the choice of its steps. However, CAIA dynamically determines whether it will keep or eliminate a result, or whether it will use it immediately or save it for difficult times. It could produce an interesting explanation of its actions such as: it did not keep this new constraint because it contained many unknowns, or because it was an inequality and it already had too many of them, or it delayed using this rule because its execution is time-consuming, and so on.

For an explanation to a human being, it is possible to include some indications on the reasons why CAIA has chosen to try an action; the difficulty is to model this human for giving him only what he needs to know. On the other hand, such explanations given by CAIA to itself would allow it to improve its own meta-knowledge. Both changes could be made; unfortunately, as they are difficult to implement, so far they are only in my to-do list.

55 years of Artificial Intelligence

It was in October 1960 that I started to work on my thesis. One year ago, I was appointed deputy chief of a computer department, which existed since almost ten years, in a military laboratory. It had an impressive documentation on computers, and it was a very supportive environment: the same year, the head of the department started working on automatic translation from Russian to French. I was thrilled by the first AI realizations, such as Newell and Simon’s Logic Theorist, Samuel‘s machine learning using the game of checkers, Gelertner‘s geometry-theorem proving machine, and so on.

For my thesis, I wanted to implement a general theorem prover that received as data any of the many propositional logic axiomatizations. It had to discover interesting theorems in each theory, without any information on its existing theorems.

The first difficulty to overcome was to find a thesis director: at that time, Artificial Intelligence, and even computer science, were not taught at Paris University. Luckily, Professor Jean Ville was interested in computers, although he essentially worked on probability, statistics, and economics. He was very kind to accept that I registered at the University for this thesis.

Looking at the results of the initial version of my program, I was surprised to see that it had discovered proofs different from those given in logic books. These original proofs showed me that it could be interesting to use meta-theorems, that is new ways for proving theorems. Therefore, I gave my program the possibility to prove meta-theorems, and the modified program found more results, and also proofs that were easier to understand. The results of this program were not bad; for a particular axiomatization, it proved results for which Lukasiewicz said: “One must be very expert in performing such proofs.” Results found for one of these axiomatizations can be found at page 134 of Laurière’s book (page 125 for the French version).

I was greatly impressed by these results: since then, I have always tried to realize systems that have the ability to work at the meta-level. This is a challenging task, since their results are compared with systems where the work at the meta-level has been made by their author, and not by the system itself. For the present time, the performances of a system working at the meta-level are not better than those found by other programs, but human intelligence is less important, they have a larger degree of autonomy. The primacy of men over animals mainly comes from our capacity to work at a meta-level, consciousness is one example of this superiority. I cannot believe that it is possible to create a superintelligence without this ability.

Moreover, this ability allows to bootstrap AI: existing AI systems will help us to implement more powerful systems. I believe that AI is the most difficult science, perhaps far too complex for human intelligence. The ideal would be to have an Artificial Artificial Intelligence Researcher; this is CAIA’s long-term goal.

Since 30 years, I am developing CAIA. At the moment, it has 13,000 rules that transform themselves into 500,000 lines of C; I have not written a single line of the present programs, and many rules have also be written by CAIA. I continue to replace the expertises created by myself, by meta-expertises that create the preceding expertises. The bootstrap will be completed when CAIA includes a set of meta-expertises that could generate all its expertises and meta-expertises, including themselves.

I am trying to create more and more declarative, and more and more general knowledge: I prefer to give CAIA meta-knowledge for creating expertise E rather than writing myself expertise E. It is difficult but, when I succeed, CAIA’s version is usually better than my initial version.

There is still a very long way before this bootstrap is completed. I have not 55 more years for completing it, but I hope that other researchers will pursue this tremendously difficult task.

When a Belgium canon meets a French mathematician

 Let us consider again magic squares, I want to show that it is possible to find solutions for very large squares, using methods completely different from the improved combinatorial method used by CAIA in the preceding post.

 In 1703, canon Poignard, from Brussels, found an algorithm that generates solutions for magic squares when N is odd. This method was improved by mathematician Philippe de la Hire, who issued in 1705 a communication to the Académie Royale des Sciences. This method is well known; it was already described in Diderot’s Encyclopedie, entry ‘quarrés magiques’.

 For describing this method, I will consider N=5. Let us choose a permutation of the numbers from 1 to N, for instance: 2,3,5,1,4. We write it on the first line of the square. For the second line, we put this permutation shifted by P elements; P is a number less than N such that P and P-1 must be relatively prime numbers with N. This implies that the method cannot be used for even squares: as either P or P-1 is even, N cannot be even. If one chooses P=3, the second line will be 5,1,4,2,3. The process continues, and we get the square just down left.

In the same way, we consider a permutation of the integers from 0 to N-1, multiplied by N, for instance: 5,20,0,15,10 which is written on the first line of another square. This new square is completed as the preceding one, with a shift Q, which must also be different from P. With Q=4, we generate the square on the right above.

 Both squares have exactly once each number on each line, each column, and each diagonal; moreover, the sums on these lines are the same, since we have the same permutation. Naturally, there is no bijection, since there are only N different numbers for the N2 little squares of each square.

We create the magic square by adding these two squares. It has always all the preceding characteristics, and each value between 1 and N2 is present exactly once.

 An interesting remark: if opposite sides of the square are considered as next to one another, all the diagonals have also the common value for their sum. We do not generate only magic squares, but a kind of ‘hyper-magic’ square that has 4N constraints, including N diagonals to the right and N diagonals to the left. The new problem is much more constrained than the original one. This explains why there is no hyper-magic square for N=3, where one cannot find a couple P and Q.

This way of generating magic squares is extraordinary: we do not need to write the constraints, the way the solution is created automatically satisfies them. It is no more necessary to find the value of SUM. It is also unnecessary to backtrack.

 Without using a computer, I found a magic square for N=11 in 16 minutes. Furthermore, I wrote a 25 lines C program that generates such squares. In only 333 seconds, it generated (but not stored) a solution for N=100,001. It had to find a value for 10,000,200,001 little squares! It is unthinkable to solve such a problem with the methods used by CAIA, and many other constraint satisfaction systems: there are 200,002 constraints, each one has 100,002 unknowns, and the number of possible values for each unknown is greater than ten billion. Studying the formulation of this problem, canon Poignard transformed a constraint satisfaction problem into an algorithm building a large permutation from simpler permutations.

 CAIA’s problem solver is very far from such performances. This does not mean that it is an impossible task for an AI system, but that our systems must have a greater capacity for working at the meta level, studying and transforming the formulation of the problem: they must be able to do what canon Poignard did more than 300 years ago.

CAIA is not a good magician

 Too often, IA researchers describe the successes of their system, but they keep a lower profile when they get poor outcomes. I do not want to comply to this bad habit. I describe here CAIA’s behavior when success does not come, in the present case, the magic square problem. An order N Magic Square has N horizontal lines, and N vertical lines. Therefore, there are N2 squares, and each one contains a number between 1 and N2, so that each number appears only once. Moreover, the sum of the verticals, the horizontals, and the two diagonals, must have the same unknown value, SUM.

First, CAIA finds symmetries, which are permutations of the unknowns which lead to an identical set of constraints. Seven symmetries are evident for all magic squares: three rotations and four reflections with respect to horizontal, vertical, and both diagonals. CAIA knows nothing about geometry, but it can find all symmetries, both geometric and non-geometric. To do that, it uses the general component that solves constraint satisfaction problems. I had only to specify one particular problem: finding all the permutations of the unknowns leading to the initial set of constraints.

I hoped that it would find the seven geometrical symmetries. However, CAIA was far beyond any of my expectations: it found a huge number of symmetries that I never imagined existed. Even for N=4, it finds 31 symmetries! Some are completely unexpected, for instance, one of them transforms the first horizontal into the second vertical: both contain the same values, naturally not in the same order.

 

Finding this is not evident: almost 400 years ago, Bernard Frénicle de Bessy performed an outstanding study of the N=4 magic square. In particular, he defined a standard representation of the solutions, which is always used. However, he has found 880 basic solutions; multiplied by 8, this gives a total of 7040 solutions. CAIA found only 220 basic solutions; multiplied by 32, we find again 7040. When N grows, the number of symmetries increases at a startling rate: for N=10, CAIA finds 15,359 symmetries! On this point, CAIA’s behavior is flawless, although a lot of computer time is needed: 31,193 seconds for all the symmetries when N=10.

In the following step, CAIA tries to find other constraints. For the magic problem, it adds that the sum of the unknowns is equal to the sum of the first N2 integers, since there is a bijection. After that, it can find the value of SUM. For us, humans, this is easy because we are seeing the square: it is clear that the sum of the N horizontal lines is N*SUM. It is also evident that this sum contains all the squares, so its value is the sum of all the numbers between 1 et N2, that is N2*(N2+1)/2. Therefore, we have SUM=N*(N2+1)/2.

I did not give CAIA this formula, it has to find the value of SUM from the set of constraints. For this task, it is rather good: for N=16, it finds SUM=2056, after many substitutions in the 34 initial constraints, plus the bijection constraint. It would have been better to prove the general formula, but it is difficult for a blind system, which cannot see the square as we do. It does not find other constraints, and I do not believe that constraints with less than N unknowns exist for this problem. In total, for this task, CAIA has done well overall, although it could not set out the general formula.

The last part is to find the solutions, and this is the point where CAIA fails: it has been designed for finding all the solutions of a problem. For this reason, it is very interesting to find symmetries. However, it makes no sense to find all the solutions when there are billions of them. In its normal working, it considers the formulation of the problem, and particularly it finds new constraints, easier to use than the initial ones. In some cases, (as for N=3), it directly finds the solution. When there is no new constraint, it is often better to stop trying to be clever, but to write cleverly a combinatorial program that generates all the solutions.

For a magic square, this works well for the small values of N: for N=8, it generates the first 1,000 solutions in 4 seconds. Unfortunately, the search space dramatically increases with N: the number of unknowns grows as N2, and the number of possible values for each unknown also grows as N2. Very soon CAIA is completely overcome. The method still works for N=9: 10 seconds for the first 1,000 solutions. Unfortunately, there is a sharp discontinuity for N=10: after 14 hours, it has not yet found a solution.

The reason for this failure is that it is unthinkable that, given the huge number of solutions, one can ever find all of them when N is not small. It is sufficient to find one of them. We could think that all we have to do is to launch the search for solutions, and stop when one is found. In reality, it is often better to use other methods for scanning the search space, pinpointing the areas where the probability of finding a solution is high. For instance, one can find solutions for the set of equations when the unknowns may have continuous values. Then, one can look for the combinations of integer values close to the values of these solutions.

Another heuristic may be used, such as to make the initial problem more constrained, adding new constraints so that it would be easier to find solutions for this new problem. If it has solutions, they would also be solutions of the initial problem. When there are zillions of solutions, we can add constraints that remove some of them. In a future post, we will see how this could be very efficient for finding solutions for some magic square problems.

To sum up, although a problem may have many solutions, CAIA experiences serious difficulties when all its constrains has many unknowns, each unknown having many possible values; this is normal, the search space is huge. In this situation, it can, nonetheless, have excellent results when it can generate new constraints with few unknowns; this happened with the magic cube, where it found constraints with only three unknowns. If this is impossible, it must use completely different methods, good for finding a needle in a huge haystack. CAIA improves the combinatorial search with a meta-combinatorial search generating new constraints; this is sometimes not enough for solving this kind of problem, some modules must be added to CAIA.

CAIA surprised me Part III The results

 After finding symmetries, CAIA tries to find solutions to the magic cube problem, applying rules that enrich its formulation.

One of these rules states that, when there is a bijection between a set of unknowns and a set of values, the sum of the unknowns equals the sum of the values. Therefore, CAIA adds the constraint:

F(1)+F(2)+f(3)+…..+F(26)+F(27)=1+2+3+…..+26+27=378

When it substitutes the first three constraints in the preceding one, it sees that 3*VAL=378, so:

VAL=126

Using this value, and after many substitutions in the constraints, it finds several constraints with four unknowns, such as:

F(5)+F(23)=F(13)+F(15)

Later, it discovers a constraint with three unknowns:

 F(5)+F(14)+F(23) = 42

Using the symmetries, it immediately adds two other constraints with three unknowns:

F(11)+F(14)+F(17) = 42

F(13)+F(14)+F(15) = 42

All the other symmetries lead to one of the three preceding constraints.

Finally, it happens that it has no more rule to apply. CAIA can choose between two possibilities:

1. It chooses an unknown, and it examines what happens when it substitutes it by one of its possible value. It carries on this task, instantiating an unknown, then generating new constraints, until it finds a contradiction or a solution. In both cases, it backtracks, instantiating the last unknown by the following possible value. CAIA is not very fast: it finds a solution every three seconds, taking into account that a new solution immediately leads to 47 other solutions, using the symmetries.

2. It writes a combinatorial program, cleverly using the constraints that it has found. This C program has 280 lines. CAIA compiles it, and executes it. Then, it finds 130 solutions every second.

In both cases, it chooses to begin backtracking with F(14), the small cube in the center of the large one. There are many solutions for each of the 27 possible values for F(14).

Here is one of these solutions:

 

 Its success is mainly due to its analysis of the problem formulation: it finds symmetries, and collects useful new constraints, particularly the three constraints with three unknowns. I did not know them myself, I had only found constraints with four unknowns. Naturally, when one knows that they exist, it is easy to prove them.

A human researcher, who would specifically solve this problem, could realize a more efficient system. However, the intelligence would be in his head, not in his system.

 I wanted to compare CAIA with other problem solvers. Francis Sourd kindly gave this problem to two of the best general solvers for problems defined by constraints. ILOG Solver did not find a solution in a reasonable time, while ILOG CEPLEX found the first one after 14 hours. Incidentally, it shows that CEPLEX is excellent: the search space is enormous when it is impossible to manipulate the formulation of the problem. Therefore, it does not know the value of VAL, and all of its constraints have 10 unknowns.

CAIA solved this problem in a way that I did not know. As Pégoud with his plane, we must let our systems fend for themselves!

 I doubt that, at the moment, there exists a general system that could approach the performances of CAIA for this problem, when it is placed in the same conditions: knowing only to find a bijection satisfying the 15 constraints given in part I. If I am wrong, I would be very interested in knowing more on this system.

CAIA surprised me Part II The symmetries

 For the cube problem described in the preceding post, CAIA begins with looking for symmetries in the problem formulation. What is a symmetry? For Richard Feynman: « A thing is symmetrical if there is something we can do to it so that after we have done it, it looks the same as it did before. »

There are many kinds of symmetries. CAIA only tries to find one of them: a permutation of the unknowns leads to the same set of constraints when each unknown is replaced by the corresponding one.

To do this, I defined a combinatorial problem: finding all the permutations of the unknowns that satisfy the preceding condition. This problem is given in the formalism used for all the combinatorial problems solved by CAIA. The interest of the bootstrap clearly appears here: knowing how to solve combinatorial problems allows us to solve problems necessary for solving combinatorial problems. For every new problem, CAIA solves a preliminary problem: finding which symmetries exist. Its data are the initial constraints of the main problem.

We are not restricted to the case where each constraint C is transformed in itself by the permutation, such as when one permutes A and B in the constraint A+B>0. Usually, most constraints are transformed into other ones.

For the cube problem, CAIA finds 47 solutions: for each solution that it finds, it can immediately add 47 new solutions.

One of these solutions is the following permutation:

 All the classical geometrical symmetries are among these 47 solutions: to the center, to the mean planes, etc. However, it found them from the constraints: it does not know geometry; the method can be used for problems where there is no geometrical interpretation. For instance, if unknowns A and B appear only in the constraint A+B=1,000,000 exchanging A and B gives the same constraint. Naturally, CAIA can often add new constraints for avoiding to generate a solution and its symmetrical ones. In the preceding example, if the values of A and B are different, adding constraint A<B will remove all the symmetrical solutions.

For three main reasons, CAIA needs finding symmetries:

1. This decreases the size of the search space, because constraints are added to prevent finding symmetrical solutions.

2. This increases the speed of the resolution process: for every new solution, one can immediately generate all its symmetrical solutions. For the cube, the system is 48 times faster once it has found the symmetries.

3. It is easier to create constraints: for each new constraint, CAIA adds all the constraints that are obtained by symmetry. Let us notice that it does not generate 47 other constraints for each constraint: as few unknowns are in a constraint, many of the constraints generated in that way are identical.

 

If one believes to help the system by adding constraints to the initial formulation, this is a mistake: this may hide some symmetries. For instance, one may be proud to have proven the constraint F(17)+F(11)=F(5)+F(23). Unfortunately, adding this constraint in the initial formulation will remove most of the symmetries, including the one given above. Indeed, the constraint that would be created with this symmetry is F(11)+F(17)=F(15)+F(13); it is not among the initial constraints of this new formulation. As it often happens, when one has a clever system, it is not good to try to help it: one may complicate its task. Let us trust CAIA to do its job.

Finally, for me, the main interest of finding symmetries is to show concretely that bootstrapping is possible, and can be very useful: realizing a solver for problems defined by a set of constraints improves the solver itself.

In the next post, we will see how CAIA uses these symmetries for solving the magic cube problem.

CAIA surprised me Part I

 An AI researcher can always arrange his system so that it can find a solution that he already knows. To do that, he chooses a formulation well adapted to this solution, he includes all the methods necessary for this solution, and he indicates to avoid the roads that do not lead to the right direction. For this reason, it is always interesting to see a system finding a solution unknown by its author.

 We have already seen how ALICE, developed by Jean-Louis Laurière, had immediately solved a problem proposed by professor Schutzenberger. I will now describe how CAIA has solved a problem better than I would have done.

 Let us consider a kind of magic cube: the goal is to find a bijection F between the 27 small cubes that are in a 3x3x3 large cube, and the integers between 1 and 27. The sums of the planes containing nine cubes must have the same value VAL. There are 15 such planes: 3 horizontal, 6 vertical, and 6 diagonal. The small cubes are numbered in the following way, the first array is for the upper face, and the third one for the lower face, as shown below:

 

 

 

 

 

The three constraints for the horizontal planes are:

F(1)+F(2)+F(3)+F(4)+F(5)+F(6)+F(7)+F(8)+F(9) = VAL

F(10)+F(11)+F(12)+F(13)+F(14)+F(15)+F(16)+F(17)+F(18)=VAL

F(19)+F(20)+F(21)+F(22)+F(23)+F(24)+F(25)+F(26)+F(27)=VAL

The six constraints for the vertical planes:

F(1)+F(2)+F(3)+F(10)+F(11)+F(12)+F(19)+F(20)+F(21)=VAL

F(4)+F(5)+F(6)+F(13)+F(14)+F(15)+F(22)+F(23)+F(24)=VAL

F(7)+F(8)+F(9)+F(16)+F(17)+F(18)+F(25)+F(26)+F(27)=VAL

F(1)+F(4)+F(7)+F(10)+F(13)+F(16)+F(19)+F(22)+F(25)=VAL

F(2)+F(5)+F(8)+F(11)+F(14)+F(17)+F(20)+F(23)+F(26)=VAL

F(3)+F(6)+F(9)+F(12)+F(15)+F(18)+F(21)+F(24)+F(27) = VAL

 

The last six constraints are for the diagonal planes:

 F(1)+F(2)+F(3)+F(13)+F(14)+F(15)+F(25)+F(26)+F(27)=VAL

 F(7)+F(8)+F(9)+F(13)+F(14)+F(15)+F(19)+F(20)+F(21)=VAL

 F(1)+F(4)+F(7)+F(11)+F(14)+F(17)+F(21)+F(24)+F(27) = VAL

 F(3)+F(6)+F(9)+F(11)+F(14)+F(17)+F(19)+F(22)+F(25)=VAL

 F(1)+F(5)+F(9)+F(10)+F(14)+F(18)+F(19)+F(23)+F(27)=VAL

 F(3)+F(5)+F(7)+F(12)+F(14)+F(16)+F(21)+F(23)+F(25)=VAL

In this old version of CAIA, I have indicated that the value of VAL was between 1 and 300. This was a very small aid; in the future versions, telling that VAL is positive will be enough.

I have already drawn attention to the importance of the problem formulation. To sum up, CAIA knows that it must find a bijection of 27 unknowns on the set of integers from 1 to 27, and the value of the positive unknown VAL, less than 300, such that the 15 preceding constraints are satisfied.

One could think that this is an easy problem, since it is under-constrained: there are 28 unknowns and only 15 constraints, plus the fact that one has a bijection. It has millions of solutions. However, the search tree is very large since every F has 27 possible values, and 300 for VAL. Moreover, each constraint contains 10 unknowns, and one must instantiate 9 of them to get a value for the tenth: many instantiations have to be made before one sees a contradiction. From this point of view, the problem is extremely difficult to solve: there are many solutions, but they are hidden in billions of possible instantiations for the unknowns.

We will see in the following posts how CAIA solved this problem, beginning with the discovery of its symmetries.

DONALD revisited

When one claims that an AI system has solved a problem, it is important to specify how this problem has been defined: what is the problem really solved.

Let us again consider the DONALD problem. I chose to give CAIA a formulation using the carries that we are using when we are performing an addition. Nevertheless,  many other formulations are possible, and I tried another one: each number is represented by the sum of its digits multiplied by the correct tenth power. In that situation, apart the constraints indicating that the first letter of each row is not zero, only one constraint remains. For DONALD, it is:
100000*D+10000*O+1000*N+100*A+10*L+D+100000*G+10000*R+1000*A
+100*A+10*L+D=100000*R+10000*O+1000*B+100*E+10*R+T

CAIA also solved the problem given in that way, but it's not as good: in the most favorable situation, where it tries to develop a very small tree, the tree developed has 15 leaves instead of only 2 with the carries. When it starts with a combinatorial program, the tree has 5,544,820 leaves instead of 262!

Therefore, it is obvious that the choice of a formulation of a problem can drastically modify the performances of a solver.

AI researchers spend a lot of time finding the best formulation for their system. They can even add constraints that they have themselves found. Particularly, they could give useful constraints found by CAIA for instance, indicating that E has only two possible values, 0 and 9. In the experiments described in Thinking, Frederic Bartlett was more helpful, indicating that the value of D was 5. With this information, CAIA directly finds the solution: as D=5, T=0 therefore, E is not zero, which leaves only 9 as a value for E, and the end is easy.

I could increase the possibilities of CAIA for modifying the initial formulation. For instance, I would give it more rules using modular arithmetic so that it could find the formulation with carries from the one given above. I did not do it because it would be useful for just two families of problems: crypt-additions and crypt-multiplications, while CAIA has more than 200 families of problems. I try to avoid giving  too many ad-hoc rules, which are useful for few families.

Finding a good formulation is not always easy, it requires a large problem solving culture. Bartlett gave the problem in a visual form:
  DONALD
 +GERALD
  ROBERT
This formulation is very different from the preceding ones, but all the successful solvers in this Bartlett's experiment mentioned at least once the carries. Therefore, introducing carries is a natural step.

Evidently, the formulation using carries is better for solving the problem: with the other formulation, there is one constraint with nine unknowns. Eight instantiation are necessary for finding the value of the ninth unknown. With the carries, there are constraints with 3, 4, or 5 unknowns. Knowing the value of two unknowns in a three unknowns equality immediately gives the value of the third one. The combinatorial search is more efficient because one can see sooner the consequences of a choice. With the intelligent method, one creates new constraints with few unknowns.

For this reason, being clever does not reduce very much the size of the tree when one has received a good formulation of a problem: with the carries, 2 leaves instead of 232. On the other hand, the improvement when one uses the intelligent method is spectacular with the formulation without carries: 12 leaves instead of 5,544,820 leaves.

Finding a good formulation is fundamental for quickly solving a problem. AI researchers choose a formulation well suited for the method that they have defined; therefore, they are working for their system, doing the most difficult and important part of the job. The intelligence is in the researcher, and not in the AI system. I did it myself, choosing a good representation for crypt-additions, using the carries. However, as CAIA generates new constraints, it can move far away from the initial formulation.

Problems are often defined in a natural language, or with a drawing as Bennett did for DONALD. Unfortunately, as we will see in future posts, it is difficult to extract the useful information. Moreover, ambiguities are often present, especially when one uses a natural language. Removing them can lead to very different problems: which is the good one?

When DONALD and GERALD meet ROBERT

 A crypt-arithmetic problem is defined by one (or several) equations, where the digits are represented by letters. One has to find the value associated with each letter so that the operation is true; different letters must have different values. Moreover, the first letter of each number must be different from zero. The crypt-additions, where the only operation is the addition, are a special case of these problems. One of the more popular crypt-addition is:

DONALD+GERALD=ROBERT

 Frederic Bartlett was a pioneer of the study of human problem solving. In the 1950s, he asked his subjects to solve this problem; for facilitating their task, the value of letter ‘D’ was given. In spite of this, many of them failed. If the value of ‘D’ is not known, this problem is much more difficult. The unique solution is: 526485+197485=723970.

CAIA has received the description of a general crypt-addition. It is important to know what formulation has been chosen for describing a problem. In that situation, I have chosen to introduce the carries. One must find an injection between the letters and the numbers from 0 to 9 that satisfies the constraints given by each column. For DONALD, that generates the following six main constraints; C0 to C6 are the carries, which can only take the values 0 and 1 for this addition:

C1+D+G=R+10*C0

C2+O+E=O+10*C1

C3+N+R=B+10*C2

C4+2*A=E+10*C3

C5+2*L=R+10*C4

C6+2*D=T+10*C5

and also: C6=C0=0, and D#0, G#0, R#0 (first letters of each line).

 As computers are very fast, I can ask CAIA to write a combinatorial program. In one second, it writes this program, compiles and executes it: it contains 111 C instructions, but it can only solve the DONALD problem. It develops a tree with 262 leaves, and finds the unique solution. This is very satisfactory: the size of the search space is roughly 10 billions. When CAIA writes a combinatorial program, it does it cleverly. It is easy to check this solution, but it is more difficult to be sure that no solution is missing. We have seen that a meta-bug may create bugs in the generated program.

CAIA can also begin with a meta-combinatorial search, executing rules that modify the formulation of the problem. I call this approach ‘intelligent’ because its results look like those found by an intelligent human. In the present case, CAIA discovers that it has to find a bijection rather than an injection: there are 10 letters for 10 digits. This is very useful: for instance, if a digit D can be the value of only one letter L, then one is certain that the value of L is D. CAIA generates also new constraints, such as T is even, but the most useful is to state that E has only two possible values instead of 10: 0 and 9. Unfortunately, this meta-combinatorial search stops because all the rules have been applied.

In that situation, CAIA may decide to write a combinatorial program, leaving for good the meta-combinatorial search. It writes this program, including all the constraints that it has found, with particular attention to including the fact that it has now to find a bijection. It executes it, and it finds that there is only one solution after generating a tree, now with 192 leaves. The new program has almost the same size as the preceding one (118 C instructions), but both programs are completely different.

Another possibility for CAIA is to resume the combinatorial search after instantiating E with each of its two possible values. Then, it shows that E=0 leads to a contradiction, while E=9 leads to the solution. In both cases, it directly finds the result: now, the tree has only two leaves.

One may wonder whether it is worth using the intelligent approach: the computer times are almost the same, the time necessary to generate and execute the combinatorial program is approximately the same as the time for enriching the formulation of the problem. However, the results of the meta-combinatorial search may be checked and explained. Moreover, for other problems, the time required for the combinatorial approach is unacceptable.

The solution found by CAIA was a pleasant surprise. ALICE, probably the first AI system using a meta-combinatorial search, found a solution with 6 leaves. Jean-Louis Laurière improved it to only 4 leaves. He subsequently asked hundreds of students to study this solution, and no one ever suggested that it was possible to generate a smaller tree. For its part, CAIA made unexpected decisions: a crucial step in the solution was the use by CAIA of the new constraint:

10*E+C3+N+D+G=99*C1+B

which was essential for the discovery of the contradiction in the case E=0.

It is often useful to mix both approaches: one begins with the meta-combinatorial search, and one finishes with the combinatorial search. In some situations, the best way is to examine many situations as fast as possible. Therefore, CAIA must become meta-intelligent: It will acknowledge that, sometimes, it is better not to be too intelligent, but rather to stop the meta-combinatorial search. This happens when one can foresee that the time spent in a meta-combinatorial search will not lead to a significant improvement of the formulation. Then, the best move is to write a combinatorial program, but one must generate this program in a very intelligent way.

The meta-combinatorial search, much better than the combinatorial search

 The solution of a problem often depends upon a set of variables, whose all possible values are known; the combinatorial search considers which combinations of these values satisfy all the constraints of the problem. This can also be used when there is a finite number of legal actions, and one considers the situations after all the legal sequences of these actions. This search is widely used in AI, for instance for chess programs. It frequently gives useful results: with their speed, the computers can develop trees much larger than those that we could ever generate.

However, even when this method succeeds, we are not satisfied: the only way to understand a solution found in that way is to know that everything has been considered, and one has found nothing else. We cannot check such solutions. Moreover, it cannot be used when the search space is very large, and always when there is an infinite number of values; this often happens in solving arithmetic problems.

Instead of considering all the possible values for each variable, one can also consider a combination of the rules that can create new constraints, enriching the formulation of the problem. In many cases, the new problem may become very easy to solve.

In order to clarify this idea, let us consider how CAIA solves a particular arithmetic problem: to find all the integers m and n satisfying the equation 4m+3n2=34 .

One cannot use the combinatorial search, there are an infinity of possible values for m and n. Here is CAIA’s solution:

3n2 is the difference of two even numbers, so it is even.

As 3n2 is even, and 3 is odd, n2 is even.

Therefore n est even.

Then n2 is a multiple of 4,

and 3n2 is a multiple of 4.

So, 4m+3n2 is a multiple of 4.

As it must be equal to 34, which is not a multiple of 4, there is a contradiction: no solution.

Naturally, at each step, CAIA must find and apply an element of its set of rules, for instance: in an equation, if a member is a multiple of integer I, then the other member is also a multiple of I. Another useful rule is: the sum of two expressions that are multiple of integer I is also a multiple of I.

A huge advantage of the meta-combinatorial search over the combinatorial one is that one has not to consider all the possible applications of the rules: one has only to find a sequence that leads to the solution. Many other rules were perhaps considered; if they are not necessary for the proof, it is useless to indicate that one attempted to use them. For instance, it happened that CAIA found that, as the square n2 is a positive or null integer, this is also true for 3n2. Therefore, 4m=34-3n2<=34, and m<9. I do not included this result in the preceding proof, the contradiction has been proved without using it.

The combinatorial search could be used in a variant of this problem where the absolute values of both m and n are less than one billion. Then, one can consider all the possible values of n, and check that 34-3n2 is a multiple of 4. A lot of time would be wasted to find that there is no solution. Proving first that n is even and m<9 would reduce the size of the search space. Indeed, it is often interesting to use both kinds of search: the meta-combinatorial search trims down the problem, then the combinatorial search solves when the size of the search space is reasonable. We will show in future posts how CAIA manages to use both methods.

The mathematician frequently uses a meta-combinatorial search. He may spend years finding the proof of a theorem, writing notebook after notebook, when the final solution takes only a few pages.

When our AI systems will have many methods for enriching the formulation of problems, they will efficiently use the meta-combinatorial search. Some day, the mathematicians will be as surprised as the chess players were some years ago: during many years, they were laughing at the poor performances of the programs, and at their ridiculous moves. For the present time, they are reduced to demand that their artificial opponent will be shackled by the interdiction to use all its capacities; this does not prevent them to lose.