55 years of Artificial Intelligence

It was in October 1960 that I started to work on my thesis. A year earlier, I had been appointed deputy head of a computer department, which had existed for almost ten years in a military laboratory. It had an impressive documentation on computers, and it was a very supportive environment: the same year, the head of the department started working on automatic translation from Russian to French. I was thrilled by the first AI achievements, such as Newell and Simon's Logic Theorist, Samuel's machine learning program for the game of checkers, Gelernter's geometry-theorem proving machine, and so on.

For my thesis, I wanted to implement a general theorem prover that received as data any of the many propositional logic axiomatizations. It had to discover interesting theorems in each theory, without any information on the theorems already known in that theory.

The first difficulty to overcome was to find a thesis advisor: at that time, Artificial Intelligence, and even computer science, were not taught at Paris University. Luckily, Professor Jean Ville was interested in computers, although he essentially worked on probability, statistics, and economics. He was kind enough to accept my registration at the University for this thesis.

Looking at the results of the initial version of my program, I was surprised to see that it had discovered proofs different from those given in logic books. These original proofs showed me that it could be interesting to use meta-theorems, that is, new ways of proving theorems. Therefore, I gave my program the ability to prove meta-theorems, and the modified program found more results, as well as proofs that were easier to understand. The results of this program were not bad; for one particular axiomatization, it proved results about which Łukasiewicz said: “One must be very expert in performing such proofs.” Results found for one of these axiomatizations can be found on page 134 of Laurière's book (page 125 in the French version).

I was greatly impressed by these results: since then, I have always tried to build systems that have the ability to work at the meta-level. This is a challenging task, since their results are compared with those of systems where the work at the meta-level has been done by the author, and not by the system itself. For the present time, the performance of systems working at the meta-level is no better than that of other programs, but less human intelligence is involved: they have a larger degree of autonomy. The primacy of humans over animals mainly comes from our capacity to work at a meta-level; consciousness is one example of this superiority. I cannot believe that it is possible to create a superintelligence without this ability.

Moreover, this ability allows us to bootstrap AI: existing AI systems will help us to implement more powerful systems. I believe that AI is the most difficult science, perhaps far too complex for human intelligence. The ideal would be to have an Artificial Artificial Intelligence Researcher; this is CAIA's long-term goal.

For 30 years, I have been developing CAIA. At the moment, it has 13,000 rules that transform themselves into 500,000 lines of C; I have not written a single line of the present programs, and many rules have also been written by CAIA. I continue to replace the expertises that I created myself with meta-expertises that create those expertises. The bootstrap will be completed when CAIA includes a set of meta-expertises that can generate all its expertises and meta-expertises, including themselves.

I am trying to create more and more declarative, and more and more general, knowledge: I prefer to give CAIA the meta-knowledge for creating expertise E rather than writing expertise E myself. This is difficult but, when I succeed, CAIA's version is usually better than my initial version.

There is still a very long way to go before this bootstrap is completed. I do not have 55 more years to complete it, but I hope that other researchers will pursue this tremendously difficult task.

Is it possible to define a problem?

We do not pay enough attention to the definition of the problems given to our solvers. First, by selecting a definition tailored to our system, we can give it unreasonable help. More importantly, ambiguities in the definition may lead to different problems, although we believe that they are identical.

This is often due to the ambiguities of natural language. However, we will see that different problems can come from the same definition because some people add constraints that they feel are self-evident; nevertheless, we have the right to reject these constraints.

The definition of Sudoku does not seem to pose difficulties; it is very straightforward: one must complete a 9×9 grid with numbers from 1 to 9 so that each column, each row, and each of the nine 3×3 subregions contains all the digits from 1 to 9. This definition appears in the magazines that publish these problems. Moreover, a few of them add another constraint: the puzzle has only one solution.

The question is whether one can use this uniqueness constraint for solving Sudoku puzzles. This constraint is respected in all the published puzzles: they always have exactly one solution. However, it can be considered a constraint only by those who create new puzzles. The requirement of a unique solution exists for many problems; for instance, a chess problem is cooked if it has several solutions. Even so, some very difficult problems, such as the magic cube described in a preceding post, have millions of solutions; as each one is very hard to find, the problem remains interesting.

If this constraint must be respected by the author of a Sudoku puzzle, does a human or machine solver have the right to use it? I have never seen a chess problem where the solution uses this uniqueness constraint. Some constraints are necessary for creating a beautiful problem; this does not mean that they can be used for solving it. However, the first question that needs to be asked is: can this constraint be useful for solving Sudoku puzzles?

Of course, knowing the number of solutions of a problem may be very helpful: as soon as all the solutions have been found, one can stop the process. Indeed, no more solutions can be found if the author has not made a mistake. It is very easy to implement this possibility. For instance, a parameter indicates to CAIA the maximum number of solutions that it may find, so that it avoids wasting a lot of time when there are millions of solutions. The default value for this parameter is 50, but one just has to set it to 1: CAIA will stop as soon as it has found one solution.
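To make the idea concrete, here is a generic sketch of such a cutoff (my illustration, not CAIA's code), using the 8-queens problem as a stand-in for any constraint satisfaction problem:

```c
/* A generic sketch (not CAIA's code) of a search that stops after a given
   number of solutions, here on the 8-queens problem as a stand-in for any
   constraint satisfaction problem. */
#include <stdio.h>
#include <stdlib.h>

#define N 8
static int col[N];                 /* col[r] = column of the queen on row r */
static int found;
static int max_solutions = 1;      /* set to 1: stop at the first solution  */

static int safe(int row, int c) {
    for (int r = 0; r < row; r++)
        if (col[r] == c || abs(col[r] - c) == row - r)
            return 0;              /* same column or same diagonal */
    return 1;
}

static void search(int row) {
    if (found >= max_solutions) return;    /* the cutoff described above */
    if (row == N) { found++; return; }
    for (int c = 0; c < N && found < max_solutions; c++)
        if (safe(row, c)) { col[row] = c; search(row + 1); }
}

int main(void) {
    search(0);
    printf("solutions found: %d\n", found);
    return 0;
}
```

With max_solutions set to 1, the search stops as soon as the first solution is found, exactly as described above.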

It is less obvious how to use this restriction even before finding a solution. I will illustrate this point with a problem published in the Figaro Magazine. Far too often, after a newspaper has published a Sudoku problem, it only gives the solution: it does not indicate the steps that were taken to find it. Fortunately, the Figaro Magazine also gives an excellent description of the main steps; therefore, it is possible to see which rules have been used. Usually, only the constraints given with the definition of the problem are used, not the one stating the uniqueness of the solution. However, the solution of a problem stated in the 17 July 2015 issue uses it. The following diagram describes the situation after the values of three squares have been found. The lines are numbered 1 to 9 from the top.

It is easy to see that the possible values for B2 and C2 are 7 and 9. In the same way, 1, 7, and 9 are the possible values for B5 and C5 (5 cannot be in C5 because, in column C, 5 must be in C7 or C9, the only squares of the lower left subregion where 5 can be put). If 1 were neither in B5 nor in C5, each solution with B5=7 and C5=9 would give a symmetrical solution where 7 and 9 are switched in squares B2, C2, B5, and C5: if there are N solutions with B5=7 under this hypothesis, there are also N solutions with B5=9. If K is the number of solutions with 1 in B5 or C5, the total number of solutions is K+2*N. As the uniqueness constraint states that this total is equal to 1, we must have K=1 and N=0: there is no solution without 1 in B5 or C5. This is very useful: it means that D5 is not 1. As, in the center subregion, 1 is either in D5 or F6, we have F6=1, and it is easy to complete the Sudoku with this information.

[Diagram not reproduced: the grid after the values of three squares have been found; columns are labeled A to I from the left.]

CAIA does not make this deduction: it has not received the uniqueness constraint. For this problem, it finds the same possible values for squares B2, C2, B5, and C5; however, it cannot follow the next steps of the Figaro solution. In its own solution, it sees that the only possible values for E1 are 2 and 5; 2 leads to a contradiction, and it then directly finds the solution with E1=5. Incidentally, this proves that the value of N above is actually zero. However, contrary to the solution given in the Figaro, it has to backtrack once.

I understand that many people want to use the uniqueness constraint, although I do not agree with them for several reasons:

*Real problems may have no solution or many solutions, and it is interesting to solve them. Why give a special status to artificial problems?

*Even when there is certainly one solution, why transform the search for a solution into a game of chance? The first to find the solution is the lucky one who happens to try the right values for the unknowns first.

*For other problems that have only one solution, such as chess problems, one never uses this uniqueness for finding the solution.

*In many cases, including the Figaro, the given Sudoku definition does not include this constraint.

*If the problem is cooked, nobody will see it.

*The author of the problem must check that it has just one solution: the uniqueness constraint is certainly necessary for creating a puzzle. Naturally, when checking it, he cannot use the fact that the problem has just one solution! Therefore, one must also have a solver that does not use this constraint. Is it necessary to have two solvers for the same problem?

All this shows that, even when the definition of a problem looks evident at first sight, it may be difficult to agree on this definition: solvers cannot help but interpret the formulation. In that case, the solutions and the solving methods may not be the same as those intended by the initial formulation: they do not solve the same problem. Therefore, if it is possible to define a problem, it is certainly not easy. Unfortunately, AI researchers have to give unambiguous definitions to their problem solvers.

Jeopardy!

For some applications, AI systems are still worse than we are, and I believed that the reason was that they do not have an associative memory as good as ours; its efficient use is essential for our intuition. In a fraction of a second, it gives us an answer, or allows us to remove an ambiguity. We have already considered this problem in a preceding post. AI systems rarely use a large corpus of English texts effectively, and they do not do it quickly.

Therefore, I was impressed by the results achieved by Watson, an AI system developed by an IBM team. It can play Jeopardy!, a television game very popular in the United States. Several competitors must find, as fast as possible, the person or the object corresponding to a clue, which is an English sentence in any domain. Watson must first understand the sentence, then find candidate answers in a huge amount of English text, and finally choose the best match with the clue. All this has to be done in a short time: a few seconds.

Let us consider an example from the match that showed Watson's superiority. For the clue “It's a poor workman who blames these”, Watson was the first to find the right answer: tools.

The difficulty comes from three main factors: the clues come from a broad domain, the answer must be given with high confidence, and it must be found very fast. Watson does not use the Internet, but it has a great deal of English knowledge, including several encyclopedias, among them the full text of Wikipedia; the whole takes four terabytes of storage.

Watson competed against two top champions: Brad Rutter, a 20-time champion, and Ken Jennings, the best player in Jeopardy! history, famous for winning 74 games in a row. Watson won, and won big: $77,147, while the other two won only $21,600 and $24,000.

As Watson includes many modules, I will briefly speak of those that seem the most important from an AI point of view.

In the first step, Watson analyses the clue and extracts keywords and sentence fragments, which are used for finding possible answers across the breadth of its knowledge. This mechanism is a little like our intuition: it uses many heuristics, and its main advantage is that it is very fast. Unfortunately, in humans this often leads to mistakes, because we often merely keep the first response, for lack of time, out of laziness, or because we are unaware of the knowledge leading to the result.

For Watson, this step usually gives several possible candidates. Then, far better than us, it will spend a lot of time (that means a few seconds for a computer!) choosing the most reliable one. It ranks the candidates, and it chooses the first one, provided that it has a sufficient level of confidence. If no candidate is satisfactory enough, it does not answer.

Many methods enable it to measure its confidence in a particular result. If a result appears several times, Watson will be much more confident in it. It also tries to evaluate the reliability of a candidate. If the clue is “Chile shares its longest land border with this country”, it is easy to remove China, which does not share a border with Chile. Two serious candidates are Argentina and Bolivia. The media often speak of the border dispute between Chile and Bolivia; this tends to favor Bolivia. On the other hand, some results give the lengths of the borders. As these lengths are very different, and these results are rather reliable, Argentina will finally be chosen by the geographic module. However, even when a candidate has most of the wanted characteristics, it will not be chosen if one reliable result rules it out.
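A minimal sketch of this rank-then-abstain step (the candidates, confidence scores, and threshold are invented for the illustration; this is not Watson's actual code):

```c
/* A minimal sketch of ranking candidate answers and abstaining when no
   candidate is confident enough. Scores are invented for illustration. */
#include <stdio.h>

struct candidate { const char *answer; double confidence; };

/* Return the most confident candidate, or NULL to abstain when no
   candidate reaches the confidence threshold. */
static const struct candidate *choose(const struct candidate *c, int n,
                                      double threshold) {
    const struct candidate *best = NULL;
    for (int i = 0; i < n; i++)
        if (c[i].confidence >= threshold &&
            (best == NULL || c[i].confidence > best->confidence))
            best = &c[i];
    return best;
}

int main(void) {
    struct candidate c[] = {
        { "Bolivia",   0.31 },   /* favored by the border-dispute evidence */
        { "Argentina", 0.55 },   /* favored by the border-length evidence  */
        { "China",     0.02 },   /* eliminated: no border with Chile       */
    };
    const struct candidate *best = choose(c, 3, 0.50);
    if (best) printf("Answer: %s\n", best->answer);
    else      printf("No answer: confidence too low.\n");
    return 0;
}
```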

Watson can analyze several results and compare their reliability because it is aware of the information used for finding a particular answer. This is a huge advantage for artificial cognition: usually, we do not know why our intuition gives a particular result.

Watson correctly answered several difficult clues, for instance clock for “Even a broken one of these on your wall is right twice a day”, and escalator for “This clause in a union contract says that wages will rise or fall depending on a standard such as cost of living”.

Nevertheless, like human beings, Watson sometimes makes mistakes. The audience had a good laugh when it answered Toronto to the following clue on US cities: “Its largest airport is named for a World War II hero; its second largest for a World War II battle”.

Realizing a system that plays Jeopardy! has shown that a huge memory can be used efficiently. Now, the authors of this system want to adapt their methods to another problem: medical diagnosis. In that domain, it is also important to give the right answer by reasoning over unstructured natural language content; moreover, it will have much more useful consequences.

Cows are thirsty

Several years ago, I was in the subway with one of my children. To keep him entertained, I asked him a question: “What are the cows drinking?” Knowing me, he did not answer immediately: he sniffed out a trap, as it was too easy to answer this question! A passenger, whom I did not know, was surprised by this total ignorance, and he could not refrain from saying: “They are drinking milk, my little child!” This mistake is natural: the question is a classic hoax. It exploits a very useful mechanism that often enables us to give a correct answer quickly: the question was about a beverage and about cows; as cows are associated with milk, and milk is a beverage, we have the answer immediately.

It is better to avoid systems that give incorrect results. However, if the consequence is that the system gives no answer at all, this is not satisfactory either. Many AI researchers have an excellent mathematical training; therefore, they overwhelmingly reject any possibility of error: if one accepts a contradiction, one can prove anything. Luckily, human beings survive their contradictions.

Methods that can lead to erroneous results may be useful when we do not know a perfect method, or when the perfect method is too time-consuming. It may be better to have a result quickly rather than to wait centuries for a perfect result. For instance, current chess programs play very well, better than the world champion, but they certainly do not always play the best move. This was an issue for Edgar Allan Poe when he asserted that Maelzel's Chess Player was not an automaton. He was right: a human chess player was hidden inside the machine. Nevertheless, one of his arguments was incorrect: he believed that it was not a machine because it sometimes lost the game. Sorry, Edgar, a machine may be imperfect. However, we do know a perfect method for playing the best chess move: fully develop the tree of all the possible moves for each player. As the depth of the tree is finite, this can be done in a finite time, and one can find the perfect solution from this tree. Unfortunately, the time necessary for generating it goes beyond anything we can imagine.
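Here is a runnable illustration of this perfect method on a toy game instead of chess: from a pile of counters, each player removes one or two, and whoever takes the last counter wins. The recursion fully develops the game tree, exactly as described above; for chess, the same recursion would apply, but the tree is far too large to generate.

```c
/* Exhaustive development of the game tree for a toy take-away game. */
#include <stdio.h>

/* Return 1 if the player to move wins with perfect play, 0 otherwise. */
static int wins(int n) {
    for (int take = 1; take <= 2 && take <= n; take++)
        if (!wins(n - take))   /* a move leaving the opponent in a losing position */
            return 1;
    return 0;                  /* n == 0, or every move lets the opponent win */
}

int main(void) {
    for (int n = 1; n <= 10; n++)
        printf("pile of %2d: %s player wins\n", n, wins(n) ? "first" : "second");
    return 0;
}
```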

Fast mechanisms are used by our unconscious; we call them intuition. Unfortunately, they sometimes make mistakes. They are also useful for resolving ambiguities in natural language texts. It may be dangerous to use the results found by our intuition: when the consequences could be serious, we must cautiously check that they are correct. To do that, our conscious mechanisms verify the results given by our intuition. However, calling them is not automatic: for each result, we must decide whether we will check it.

Intuition is an essential mechanism for human beings, and especially for animals: it is the only way for them to make decisions that are not genetically programmed. It always gives results quickly; moreover, we can use it in situations where we do not know any other method. Like living beings, artificial beings will have to use similar methods. Nothing prevents us from giving them such capabilities. Unfortunately, this is not so easy, because we do not know these mechanisms, since they are unconscious; we have to discover them and experiment with them in our systems. Another difficulty is that our love of correct methods tends to reject the bundle of heuristics that makes up intuition.

Fortunately, we now have a clearer idea of the mechanisms called ‘intuition’. For Herbert Simon: “The situation has provided a cue; this cue has given the expert access to information stored in memory, and the information provides the answer. Intuition is nothing more and nothing less than recognition.” I cannot see why AI systems could not use this mechanism.

The horror of error is dangerous: some research directions are neglected because they can lead to dubious results. This favors rigorous mathematical methods in AI, even when their application is limited because they are too time-consuming.

If artificial beings never made mistakes, it would mean that they were too restricted. They must not be prevented from using methods that can be very helpful when carefully used. However, we must also give them the possibility of finding out when they are wrong: we have to add modules that verify the results. For instance, CAIA checks that all the solutions it finds satisfy all the constraints. Unfortunately, it may still miss solutions; but we have already seen that, sometimes, great mathematicians also miss some of the solutions.

We will progress when our systems believe that cows drink milk, see that this is wrong, then find the right answer, as my embarrassed fellow traveler finally did.

When a Belgian canon meets a French mathematician

Let us consider magic squares again: I want to show that it is possible to find solutions for very large squares, using methods completely different from the improved combinatorial method used by CAIA in the preceding post.

In 1703, canon Poignard, from Brussels, found an algorithm that generates solutions for magic squares when N is odd. This method was improved by the mathematician Philippe de la Hire, who presented it in a communication to the Académie Royale des Sciences in 1705. The method is well known; it was already described in Diderot's Encyclopédie, in the entry ‘quarrés magiques’.

To describe this method, I will consider N=5. Let us choose a permutation of the numbers from 1 to N, for instance 2,3,5,1,4, and write it on the first line of the square. For the second line, we put this permutation shifted by P elements; P is a number less than N such that both P and P-1 are relatively prime with N. This implies that the method cannot be used for even squares: as either P or P-1 is even, N cannot be even. If one chooses P=3, the second line will be 5,1,4,2,3. The process continues, and we get the first square below (the original images are not reproduced; the squares are reconstructed here):

2 3 5 1 4
5 1 4 2 3
4 2 3 5 1
3 5 1 4 2
1 4 2 3 5

In the same way, we consider a permutation of the integers from 0 to N-1, multiplied by N, for instance 5,20,0,15,10, which is written on the first line of another square. This new square is completed like the preceding one, with a shift Q, which must be different from P. With Q=4, we generate the second square:

 5 20  0 15 10
20  0 15 10  5
 0 15 10  5 20
15 10  5 20  0
10  5 20  0 15

In each of the two squares, every line, every column, and both diagonals have the same sum, and each line and each column contains each of the N numbers exactly once. Naturally, neither square is a bijection, since there are only N different numbers for the N² little squares of each square.

We create the magic square by adding these two squares. All the preceding characteristics are kept, and each value between 1 and N² is now present exactly once:

 7 23  5 16 14
25  1 19 12  8
 4 17 13 10 21
18 15  6 24  2
11  9 22  3 20

An interesting remark: if opposite sides of the square are considered as adjacent, all the broken diagonals also have the same sum. We do not generate merely magic squares, but a kind of ‘hyper-magic’ square that satisfies 4N constraints, including N diagonals to the right and N diagonals to the left. This new problem is much more constrained than the original one, which explains why there is no hyper-magic square for N=3: one cannot find a suitable couple P and Q.

This way of generating magic squares is extraordinary: we do not need to write the constraints; the way the solution is created automatically satisfies them. It is no longer necessary to find the value of SUM, and it is unnecessary to backtrack.

Without using a computer, I found a magic square for N=11 in 16 minutes. Furthermore, I wrote a 25-line C program that generates such squares. In only 333 seconds, it generated (but did not store) a solution for N=100,001: it had to find a value for 10,000,200,001 little squares! It is unthinkable to solve such a problem with the methods used by CAIA and many other constraint satisfaction systems: there are 200,002 constraints, each one with 100,002 unknowns, and the number of possible values for each unknown is greater than ten billion. By studying the formulation of this problem, canon Poignard transformed a constraint satisfaction problem into an algorithm that builds a large permutation from simpler permutations.
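The 25-line program itself is not reproduced in the post; here is a minimal sketch along the same lines, hard-coded to the N=5 example above. The shift direction matches the second line 5,1,4,2,3 given earlier.

```c
/* A sketch of the Poignard / de la Hire construction, hard-coded to the
   N=5 example of the post; it prints the magic square shown above. */
#include <stdio.h>

#define N 5

int main(void) {
    int a[N] = {2, 3, 5, 1, 4};  /* permutation of 1..N                            */
    int b[N] = {1, 4, 0, 3, 2};  /* times N, this is the permutation 5,20,0,15,10  */
    int P = 3, Q = 4;            /* the shifts; P != Q                             */

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            int u = a[(j + i * (N - P)) % N];      /* first square  */
            int v = N * b[(j + i * (N - Q)) % N];  /* second square */
            printf("%4d", u + v);                  /* their sum: the magic square */
        }
        printf("\n");
    }
    return 0;
}
```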

CAIA's problem solver is very far from such performances. This does not mean that the task is impossible for an AI system, but our systems must have a greater capacity for working at the meta-level, studying and transforming the formulation of the problem: they must be able to do what canon Poignard did more than 300 years ago.

CAIA is not a good magician

Too often, AI researchers describe the successes of their systems, but keep a lower profile when they get poor outcomes. I do not want to comply with this bad habit, so I describe here CAIA's behavior when success does not come; in the present case, on the magic square problem. An order-N magic square has N horizontal lines and N vertical lines. Therefore, there are N² squares, and each one contains a number between 1 and N², so that each number appears only once. Moreover, the sums of the verticals, the horizontals, and the two diagonals must all have the same unknown value, SUM.

First, CAIA looks for symmetries, which are permutations of the unknowns that lead to an identical set of constraints. Seven symmetries are evident for all magic squares: three rotations and four reflections, with respect to the horizontal, the vertical, and both diagonals. CAIA knows nothing about geometry, but it can find all the symmetries, both geometric and non-geometric. To do that, it uses its general component for solving constraint satisfaction problems; I only had to specify one particular problem: finding all the permutations of the unknowns that lead to the initial set of constraints.

I hoped that it would find the seven geometrical symmetries. However, CAIA went far beyond my expectations: it found a huge number of symmetries that I never imagined existed. Even for N=4, it finds 31 symmetries! Some are completely unexpected; for instance, one of them transforms the first horizontal line into the second vertical line: both contain the same values, naturally not in the same order.


Finding this is not evident: almost 400 years ago, Bernard Frénicle de Bessy performed an outstanding study of the N=4 magic square. In particular, he defined a standard representation of the solutions, which is still used today. He found 880 basic solutions; multiplied by 8, this gives a total of 7,040 solutions. CAIA found only 220 basic solutions; multiplied by 32, this again gives 7,040. When N grows, the number of symmetries increases at a startling rate: for N=10, CAIA finds 15,359 symmetries! On this point, CAIA's behavior is flawless, although a lot of computer time is needed: 31,193 seconds for all the symmetries when N=10.

In the following step, CAIA tries to find other constraints. For the magic square problem, it adds the constraint that the sum of the unknowns is equal to the sum of the first N² integers, since there is a bijection. After that, it can find the value of SUM. For us humans, this is easy because we can see the square: it is clear that the sum of the N horizontal lines is N*SUM. It is also evident that this sum contains all the squares, so its value is the sum of all the numbers between 1 and N², that is N²*(N²+1)/2. Therefore, SUM=N*(N²+1)/2.

I did not give CAIA this formula; it has to find the value of SUM from the set of constraints. For this task, it is rather good: for N=16, it finds SUM=2056 after many substitutions in the 34 initial constraints, plus the bijection constraint. It would have been better to prove the general formula, but that is difficult for a blind system, which cannot see the square as we do. It does not find other constraints, and I do not believe that constraints with fewer than N unknowns exist for this problem. On the whole, for this task, CAIA has done well, although it could not set out the general formula.

The last part is to find the solutions, and this is the point where CAIA fails: it has been designed to find all the solutions of a problem (this is why finding symmetries is so interesting). However, it makes no sense to find all the solutions when there are billions of them. In its normal mode of operation, it studies the formulation of the problem, and in particular it finds new constraints that are easier to use than the initial ones. In some cases (as for N=3), it directly finds the solutions. When there is no new constraint, it is often better to stop trying to be clever, and instead to cleverly write a combinatorial program that generates all the solutions, as in the sketch below.
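For instance, here is a small stand-in (my illustration, not the program CAIA writes): a combinatorial program that enumerates all N=3 magic squares by backtracking over the permutations of 1 to 9, with SUM fixed to 15.

```c
/* Brute-force enumeration of all N=3 magic squares. */
#include <stdio.h>

static int s[9];          /* the square, row by row */
static int used[10];      /* which values are taken */
static int count;

static void place(int k) {
    if (k == 9) {         /* all cells filled: check columns and diagonals */
        if (s[0]+s[3]+s[6] == 15 && s[1]+s[4]+s[7] == 15 &&
            s[2]+s[5]+s[8] == 15 &&
            s[0]+s[4]+s[8] == 15 && s[2]+s[4]+s[6] == 15)
            count++;
        return;
    }
    for (int v = 1; v <= 9; v++) {
        if (used[v]) continue;
        s[k] = v; used[v] = 1;
        /* cut the search as soon as a completed row has a wrong sum */
        if (k == 2 || k == 5 || k == 8) {
            if (s[k-2] + s[k-1] + s[k] == 15) place(k + 1);
        } else place(k + 1);
        used[v] = 0;
    }
}

int main(void) {
    place(0);
    printf("%d solutions\n", count);  /* prints 8: one square and its 7 symmetries */
    return 0;
}
```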

For a magic square, this works well for small values of N: for N=8, it generates the first 1,000 solutions in 4 seconds. Unfortunately, the search space increases dramatically with N: the number of unknowns grows as N², and the number of possible values for each unknown also grows as N². Very soon, CAIA is completely overwhelmed. The method still works for N=9: 10 seconds for the first 1,000 solutions. Unfortunately, there is a sharp discontinuity at N=10: after 14 hours, it had not yet found a solution.

The reason for this failure is that, given the huge number of solutions, it is unthinkable to find all of them when N is not small; it is sufficient to find one of them. We could think that all we have to do is launch the search for solutions and stop when one is found. In reality, it is often better to use other methods for scanning the search space, pinpointing the areas where the probability of finding a solution is high. For instance, one can find solutions of the set of equations when the unknowns may take continuous values; then, one can look for combinations of integer values close to those solutions.

Other heuristics may also be used, such as making the initial problem more constrained, adding new constraints so that it becomes easier to find solutions for the new problem; if it has solutions, they are also solutions of the initial problem. When there are zillions of solutions, we can afford to add constraints that remove some of them. In a future post, we will see how this can be very efficient for finding solutions of some magic square problems.

To sum up, although a problem may have many solutions, CAIA experiences serious difficulties when all its constraints have many unknowns, each unknown having many possible values; this is normal, as the search space is huge. In this situation, it can nonetheless obtain excellent results when it can generate new constraints with few unknowns; this happened with the magic cube, where it found constraints with only three unknowns. When this is impossible, it must use completely different methods, good for finding a needle in a huge haystack. CAIA improves combinatorial search with a meta-combinatorial search that generates new constraints; as this is sometimes not enough for solving this kind of problem, some modules must be added to CAIA.

The Imitation Game, a film on Alan Turing

The main character of a new film, The Imitation Game, is Alan Turing. The title is misleading: one could think that the film is based on the Turing test, although it is only referred to in passing. In reality, the film is about Turing's part, during World War II, in decrypting Enigma messages. It could be criticized for overstating Turing's importance in breaking this ‘unbreakable’ cipher: his part was essential, but other participants were also essential, and there is little mention of them. Turing is almost absent from Robert Harris' book on the same subject. Be that as it may, seeing this film is enjoyable.

The film is based on one of the three important results achieved by Turing: his role in breaking Enigma. The title reminds us of the second result, the Turing test. Here, I want to speak of the third: the Universal Turing Machine. It is only implicitly mentioned in the film: Turing's commanding officer mentions that the title of one of his papers contains a word impossible to pronounce. Without any doubt, he refers to the word ‘Entscheidungsproblem’, which means “decision problem”. It appears in the title of the famous 1936 paper where Turing describes his machine. The problem goes back to David Hilbert; it is closely related to the tenth of the 23 mathematical problems that he published in 1900.

In this paper, Turing shows that his machine is limited: many problems are undecidable. This means that no general program can always find a solution for these problems. For instance, no program can determine, for any given program, whether it will ever halt, or whether it will ever print the digit ‘0’.

An interesting question is whether this machine is as powerful as our modern computers. I am not interested in its efficiency: the Turing Machine lacks many useful components, but they can be simulated at the cost of a lot of computer time. For this theoretical machine, we are not interested in computing time, only in what it can do, no matter how long it takes. It seems that the Turing Machine is no more powerful than our computers: one can simulate its operations on a computer. However, although the tape used for storing symbols is finite at any moment, Turing assumes that his machine is always supplied with as much tape as it needs. Evidently, this is not true of our computers but, given the size of their present memories, it is not very restrictive.
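As an illustration of this simulation argument, here is a tiny interpreter for one particular Turing Machine (a toy of mine, not a universal machine): it increments a binary number written on the tape. The bounded array stands in for Turing's unlimited tape.

```c
/* A C simulation of a two-state Turing Machine that adds 1 to a binary
   number, starting with the head on its rightmost digit. */
#include <stdio.h>
#include <string.h>

#define TAPE 32

int main(void) {
    char tape[TAPE + 1];
    memset(tape, '_', TAPE);            /* '_' is the blank symbol         */
    tape[TAPE] = '\0';
    memcpy(tape + 10, "1011", 4);       /* the number 11 written in binary */
    int head = 13;                      /* start on the rightmost digit    */

    for (;;) {                          /* the machine's transition table  */
        if (tape[head] == '1') {        /* carry: write 0, move left       */
            tape[head] = '0'; head--;
        } else {                        /* '0' or blank: write 1 and halt  */
            tape[head] = '1'; break;
        }
    }
    printf("%s\n", tape);               /* the tape now holds 1100, i.e. 12 */
    return 0;
}
```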

It is not so easy to answer the opposite question: are our computers more powerful than Turing Machines? As said before, we do not consider efficiency, only whether a computer can perform some task that is impossible for a Turing Machine. I have found two such tasks, the first being the computation of sequences of random numbers.

Random numbers are useful for many applications. They often allow a progressive scan of the search space to be defined effectively and efficiently; for instance, CAIA uses them for creating new problems. In these cases, it is not necessary to have true random numbers: pseudo-random numbers are adequate. They are generated by an algorithm, which produces a sequence of numbers with the wanted dispersion properties.
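As an illustration, here is one of the simplest such algorithms, a linear congruential generator, with the classic constants from Numerical Recipes; given the same seed, it always produces the same, fully predictable sequence, which is exactly why these numbers are only pseudo-random:

```c
/* A linear congruential pseudo-random generator. */
#include <stdio.h>

static unsigned int state = 12345u;          /* the seed */

static unsigned int next_random(void) {
    state = 1664525u * state + 1013904223u;  /* x -> a*x + c (mod 2^32) */
    return state;
}

int main(void) {
    for (int i = 0; i < 5; i++)
        printf("%u\n", next_random());
    return 0;
}
```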

However, for a few applications, in games or cryptography, genuine random numbers are needed. This happens when a system has an opponent who must not be able to predict its future behavior. In this case, pseudo-random numbers may be hazardous: the opponent can foresee all the decisions taken by the machine if he knows the algorithm generating these numbers. Turing Machines can generate pseudo-random numbers, but they are unable to generate true random numbers. Note that human beings are not very good either when they have to generate random numbers.

Around 1960, several researchers suggested adding an instruction that generated a random number each time it was executed. The idea was to use physical phenomena. This was not new: Mozart used dice to produce the random numbers necessary for his minuet composition method. For computers, they wanted to use the thermal noise of a vacuum tube. I do not remember whether this idea was implemented, as pseudo-random numbers were sufficient for most applications; moreover, it would have been very difficult to debug such programs. Nowadays, the RdRand instruction generates random numbers, and it is supported by some CPUs. Therefore, it is possible to have computers with true random generators, and Turing Machines cannot do this.
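For instance, on x86 processors that support the instruction, the hardware generator can be read from C through the RDRAND intrinsic (a minimal sketch; compile with a flag enabling RDRAND, such as gcc -mrdrnd):

```c
/* Reading one true random number from the hardware generator.
   _rdrand64_step returns 0 on failure, so the call is retried. */
#include <stdio.h>
#include <immintrin.h>

int main(void) {
    unsigned long long r;
    while (!_rdrand64_step(&r))   /* retry until the hardware delivers a value */
        ;
    printf("%llu\n", r);
    return 0;
}
```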

The existence of this possibility does not destroy Turing's results: with more unpredictable computers, solving the decision problem only becomes even more difficult.

In summary, our computers can perform a task that Turing Machines cannot. Programs using true random numbers have an interesting property: even someone who has the program and its data cannot simulate its future behavior. This can be very useful in all the situations where an opponent would like to foresee its forthcoming actions.

CAIA surprised me, Part III: The results

 After finding symmetries, CAIA tries to find solutions to the magic cube problem, applying rules that enrich its formulation.

One of these rules states that, when there is a bijection between a set of unknowns and a set of values, the sum of the unknowns equals the sum of the values. Therefore, CAIA adds the constraint:

F(1)+F(2)+F(3)+…+F(26)+F(27) = 1+2+3+…+26+27 = 378

When it substitutes the first three constraints into the preceding one, it sees that 3*VAL=378, so:

VAL=126

Using this value, and after many substitutions in the constraints, it finds several constraints with four unknowns, such as:

F(5)+F(23)=F(13)+F(15)

Later, it discovers a constraint with three unknowns:

F(5)+F(14)+F(23) = 42

Using the symmetries, it immediately adds two other constraints with three unknowns:

F(11)+F(14)+F(17) = 42

F(13)+F(14)+F(15) = 42

All the other symmetries lead to one of the three preceding constraints.

Finally, at some point it has no more rules to apply. CAIA can then choose between two possibilities:

1. It chooses an unknown and examines what happens when it is substituted by one of its possible values. It carries on with this task, instantiating an unknown and then generating new constraints, until it finds a contradiction or a solution. In both cases, it backtracks, instantiating the last unknown with the next possible value. CAIA is not very fast: it finds a solution every three seconds, taking into account that each new solution immediately leads to 47 other solutions, using the symmetries.

2. It writes a combinatorial program that cleverly uses the constraints it has found. This C program has 280 lines. CAIA compiles it and executes it; it then finds 130 solutions every second.

In both cases, it chooses to begin the backtracking with F(14), the small cube at the center of the large one. There are many solutions for each of the 27 possible values of F(14).

Here is one of these solutions:

[Solution grid not reproduced here.]

Its success is mainly due to its analysis of the problem formulation: it finds symmetries, and it collects useful new constraints, particularly the three constraints with three unknowns. I did not know them myself; I had only found constraints with four unknowns. Naturally, once one knows that they exist, it is easy to prove them.
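For instance, writing S for F(13)+F(14)+F(15), the constraint S = 42 can be proven from the constraints listed in Part I (this derivation is mine, not necessarily the path CAIA followed). Adding the two diagonal planes that contain the middle row:

(F(1)+F(2)+F(3) + S + F(25)+F(26)+F(27)) + (F(7)+F(8)+F(9) + S + F(19)+F(20)+F(21)) = 2*VAL

Subtracting the upper and the lower horizontal planes, which both sum to VAL, leaves:

2*S = (F(4)+F(5)+F(6)) + (F(22)+F(23)+F(24))

The vertical plane through the middle row states that (F(4)+F(5)+F(6)) + S + (F(22)+F(23)+F(24)) = VAL; hence 3*S = VAL = 126, and S = 42.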

A human researcher who set out to solve this specific problem could realize a more efficient system. However, the intelligence would be in his head, not in his system.

I wanted to compare CAIA with other problem solvers. Francis Sourd kindly gave this problem to two of the best general solvers for problems defined by constraints. ILOG Solver did not find a solution in a reasonable time, while ILOG CPLEX found the first one after 14 hours. Incidentally, this shows that CPLEX is excellent: the search space is enormous when it is impossible to manipulate the formulation of the problem; in that case, the solver does not know the value of VAL, and all of its constraints have 10 unknowns.

CAIA solved this problem in a way that I did not know. Like Pégoud with his plane, we must let our systems fend for themselves!

I doubt that, at the moment, there exists a general system that could approach CAIA's performance on this problem when it is placed under the same conditions: knowing only that it must find a bijection satisfying the 15 constraints given in Part I. If I am wrong, I would be very interested in knowing more about such a system.

CAIA surprised me, Part II: The symmetries

For the cube problem described in the preceding post, CAIA begins by looking for symmetries in the problem formulation. What is a symmetry? For Richard Feynman: “A thing is symmetrical if there is something we can do to it so that after we have done it, it looks the same as it did before.”

There are many kinds of symmetries. CAIA only tries to find one of them: a permutation of the unknowns that leads to the same set of constraints when each unknown is replaced by the corresponding one.

To do this, I defined a combinatorial problem: finding all the permutations of the unknowns that satisfy the preceding condition. This problem is given in the formalism used for all the combinatorial problems solved by CAIA. The interest of the bootstrap clearly appears here: knowing how to solve combinatorial problems allows us to solve problems that are necessary for solving combinatorial problems. For every new problem, CAIA solves a preliminary problem: finding which symmetries exist. Its data are the initial constraints of the main problem.
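As a toy illustration of this preliminary problem (my sketch, not CAIA's formalism), the following program checks, for a small set of constraints, which permutations of the unknowns map the constraint set onto itself:

```c
/* Over four unknowns with two constraints, x0+x1 = K and x2+x3 = K, count
   the permutations of the unknowns that map the constraint set onto itself.
   Each constraint is stored as an unordered pair of unknown indices. */
#include <stdio.h>

static int cons[2][2] = { {0, 1}, {2, 3} };

static int same_pair(int a0, int a1, const int b[2]) {
    return (a0 == b[0] && a1 == b[1]) || (a0 == b[1] && a1 == b[0]);
}

/* Does permutation p transform every constraint into some constraint? */
static int is_symmetry(const int p[4]) {
    for (int i = 0; i < 2; i++) {
        int u = p[cons[i][0]], v = p[cons[i][1]], ok = 0;
        for (int j = 0; j < 2; j++)
            if (same_pair(u, v, cons[j])) ok = 1;
        if (!ok) return 0;
    }
    return 1;
}

int main(void) {
    int p[4], count = 0;
    for (p[0] = 0; p[0] < 4; p[0]++)         /* enumerate the 24 permutations */
    for (p[1] = 0; p[1] < 4; p[1]++)
    for (p[2] = 0; p[2] < 4; p[2]++)
    for (p[3] = 0; p[3] < 4; p[3]++) {
        if (p[0] == p[1] || p[0] == p[2] || p[0] == p[3] ||
            p[1] == p[2] || p[1] == p[3] || p[2] == p[3]) continue;
        if (is_symmetry(p)) count++;
    }
    printf("%d symmetries out of 24 permutations\n", count);  /* prints 8 */
    return 0;
}
```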

We are not restricted to the case where each constraint C is transformed into itself by the permutation, as when one permutes A and B in the constraint A+B>0. Usually, most constraints are transformed into other ones.

For the cube problem, CAIA finds 47 such permutations: for each solution of the cube that it finds, it can therefore immediately add 47 new solutions.

One of these solutions is the following permutation:

[Permutation table not reproduced here.]

All the classical geometrical symmetries are among these 47 solutions: symmetry with respect to the center, to the mean planes, etc. However, CAIA found them from the constraints alone: it does not know geometry, so the method can also be used for problems with no geometrical interpretation. For instance, if the unknowns A and B appear only in the constraint A+B=1,000,000, exchanging A and B gives the same constraint. Naturally, CAIA can often add new constraints to avoid generating both a solution and its symmetrical ones: in the preceding example, if the values of A and B are different, adding the constraint A<B will remove all the symmetrical solutions.

CAIA needs to find symmetries for three main reasons:

1. It decreases the size of the search space, because constraints are added that prevent finding symmetrical solutions.

2. It increases the speed of the resolution process: for every new solution, one can immediately generate all its symmetrical solutions. For the cube, the system is 48 times faster once it has found the symmetries.

3. It is easier to create constraints: for each new constraint, CAIA adds all the constraints that are obtained by symmetry. Note that this does not generate 47 other constraints for each constraint: as a constraint contains few unknowns, many of the constraints generated in that way are identical.


It is a mistake to believe that one can help the system by adding constraints to the initial formulation: this may hide some symmetries. For instance, one may be proud of having proven the constraint F(17)+F(11)=F(5)+F(23). Unfortunately, adding this constraint to the initial formulation would remove most of the symmetries, including the one given above: the constraint that this symmetry would create, F(11)+F(17)=F(15)+F(13), is not among the initial constraints of the new formulation. As often happens when one has a clever system, it is not good to try to help it: one may complicate its task. Let us trust CAIA to do its job.

Finally, for me, the main interest of finding symmetries is to show concretely that bootstrapping is possible, and can be very useful: realizing a solver for problems defined by a set of constraints improves the solver itself.

In the next post, we will see how CAIA uses these symmetries for solving the magic cube problem.

CAIA surprised me, Part I

An AI researcher can always arrange his system so that it finds a solution that he already knows. To do that, he chooses a formulation well adapted to this solution, he includes all the methods necessary for finding it, and he steers the system away from the roads that do not lead in the right direction. For this reason, it is always interesting to see a system find a solution unknown to its author.

We have already seen how ALICE, developed by Jean-Louis Laurière, immediately solved a problem proposed by Professor Schützenberger. I will now describe how CAIA solved a problem better than I would have done.

Let us consider a kind of magic cube: the goal is to find a bijection F between the 27 small cubes of a 3×3×3 large cube and the integers from 1 to 27. The sums over the planes containing nine cubes must all have the same value, VAL. There are 15 such planes: 3 horizontal, 6 vertical, and 6 diagonal. The small cubes are numbered in the following way; the first array is for the upper face, and the third one for the lower face:

Upper face:     Middle layer:   Lower face:

 1  2  3        10 11 12        19 20 21
 4  5  6        13 14 15        22 23 24
 7  8  9        16 17 18        25 26 27

The three constraints for the horizontal planes are:

F(1)+F(2)+F(3)+F(4)+F(5)+F(6)+F(7)+F(8)+F(9) = VAL
F(10)+F(11)+F(12)+F(13)+F(14)+F(15)+F(16)+F(17)+F(18) = VAL
F(19)+F(20)+F(21)+F(22)+F(23)+F(24)+F(25)+F(26)+F(27) = VAL

The six constraints for the vertical planes:

F(1)+F(2)+F(3)+F(10)+F(11)+F(12)+F(19)+F(20)+F(21) = VAL
F(4)+F(5)+F(6)+F(13)+F(14)+F(15)+F(22)+F(23)+F(24) = VAL
F(7)+F(8)+F(9)+F(16)+F(17)+F(18)+F(25)+F(26)+F(27) = VAL
F(1)+F(4)+F(7)+F(10)+F(13)+F(16)+F(19)+F(22)+F(25) = VAL
F(2)+F(5)+F(8)+F(11)+F(14)+F(17)+F(20)+F(23)+F(26) = VAL
F(3)+F(6)+F(9)+F(12)+F(15)+F(18)+F(21)+F(24)+F(27) = VAL

The last six constraints are for the diagonal planes:

F(1)+F(2)+F(3)+F(13)+F(14)+F(15)+F(25)+F(26)+F(27) = VAL
F(7)+F(8)+F(9)+F(13)+F(14)+F(15)+F(19)+F(20)+F(21) = VAL
F(1)+F(4)+F(7)+F(11)+F(14)+F(17)+F(21)+F(24)+F(27) = VAL
F(3)+F(6)+F(9)+F(11)+F(14)+F(17)+F(19)+F(22)+F(25) = VAL
F(1)+F(5)+F(9)+F(10)+F(14)+F(18)+F(19)+F(23)+F(27) = VAL
F(3)+F(5)+F(7)+F(12)+F(14)+F(16)+F(21)+F(23)+F(25) = VAL

In this old version of CAIA, I had indicated that the value of VAL was between 1 and 300. This was a very small aid; in future versions, saying that VAL is positive will be enough.

I have already drawn attention to the importance of the problem formulation. To sum up, CAIA knows that it must find a bijection of the 27 unknowns onto the set of integers from 1 to 27, and the value of the positive unknown VAL, less than 300, such that the 15 preceding constraints are satisfied.

One could think that this is an easy problem, since it is under-constrained: there are 28 unknowns and only 15 constraints, plus the fact that one has a bijection. It has millions of solutions. However, the search tree is very large, since every F has 27 possible values, and VAL has 300. Moreover, each constraint contains 10 unknowns, and one must instantiate 9 of them to get a value for the tenth: many instantiations have to be made before a contradiction can be seen. From this point of view, the problem is extremely difficult: there are many solutions, but they are hidden among billions of possible instantiations of the unknowns.

We will see in the following posts how CAIA solved this problem, beginning with the discovery of its symmetries.