Superintelligence is the title of a book written by Nick Bostrom. This book is enjoyable to read, the author is well aware of what has been done in AI. For instance, he considers bootstrap as one of the possible ways for creating superintelligence, he mentions the crossover point where the intelligence of the improving system will become greater than human intelligence. Moreover, quoting Yudkowski, he notices that our intelligence is in a very narrow range, in spite of “the human tendency to think of “village idiot” and “Einstein” as the extreme ends of the intelligence scale, instead of nearly indistinguishable points on the scale of minds in general.”

I completely agree with the author on the possibility of a superintelligence. However, I do not see the future as he does; in fact, I do not see the future at all. While reading this book, I had the feeling that the author had never realized any AI system; from his biography, it seems that it is true. He presents the future of AI in a very loose and undefined way: he does not give specific information about how to do it. I am very well placed to know that it is not enough to say that one must bootstrap AI, it needs to be indicated how to do this effectively, and it is not that easy. I have plenty of ideas for advancing CAIA, but I have not a clue about their final consequences.

One cannot fully appreciate what will become a particular AI system, especially if one has never realized such a system. In its final stage, it bears little resemblance with its initial version. We must be humble about predicting future consequences of a good idea.

In order to understand better the difficulty of predicting the future of a field of research, just look at computer science. I began working in this domain in 1958; at that time, people thought that it was excellent for computing trajectories or for issuing payrolls (incidentally, my first program solved an integral equation). In these years, I had never read a paper foreseeing computer omnipresence in the future world, where everybody uses several computers, even in the phones, the cars and the washing machines, and where a thing such as the web could exist. In the same way, we have no idea of AI future application: we are not smart enough, and AI is still in a rudimentary state. This does not allow us to have a reasonable idea of superintelligence.

Many are concerned that machines will size power over man because we assume that they will behave as ourselves. However, our intelligence was built for our hunter-gatherer ancestors by evolution, which promotes procreation and search for food. Sizing power is an important step for these activities, no wonder that it is widely displayed among living beings, including aliens if they exist, who are improved by evolution. The goals of super-intelligent beings are not necessarily the same as ours; therefore, the search for power is not always essential for their activity. Perhaps, there will be artificial super-mathematicians, quite happy to develop complicated theories; to do that, there is no need to rule the world and to enslave us.

In his book, Bostrom considers various methods for not being dominated by artificial beings, such as boxing methods that confine the system, or tripwires detecting signs of dangerous activity. However, if we confine the system, it will never become super-intelligent; furthermore, if it is super-intelligent, it will detect and bypass our tripwires. In the March 2015 issue of the AI Journal, Ernest Davis analyses this book, and suggests that the issue will be settled by “an off switch that it cannot block.” This is a variant of the idea of unplugging the machine from the mains; Science Fiction has long shown that it does not work. Even for AI specialists, it is difficult to imagine the immense superiority of superintelligence!

Let’s go back to the general scale of minds, where animals are well below us, even those rather intelligent such as dogs. Therefore, if we are able to realize a super-intelligent system, the difference between ourselves and this system could be as large as the difference between a dog and ourselves. For understanding our position vis-à-vis super-intelligent systems, we can get inspired by the position of a dog vis-à-vis ourselves. Considering again the efficiency of tripwires, might a dog invent an off switch that we could not bypass? How could a dog imagine many of our activities, such as writing, mathematics, astronomy, AI, and so on? No doubt that we also have no idea about many essential activities, as well as the ancient Greeks had no idea about Computer Science. Long into the future, super-intelligent systems will possibly discover new applications; we will be able to use them because it is much easier to use than to find.

Who knows, perhaps some day our descendants will harmoniously live with super-intelligent systems, and they will be very satisfied with this. When we see how the world is ruled, we can hope better for future generations. This may happen if super-intelligent beings do not want to control the world, but to help us. As we have said before, this is possible if superintelligence is not created by a kind of evolutionary process, based on the pursuit of power and control.

Anyway, we are very far away from this distant future. I am not sure that our intelligence is sufficient for allowing us to create superintelligence; however, if we succeed, the result will certainly be very different from all our predictions. Only when we have made further progress, we will be able to predict the consequences of super-intelligent beings accurately. In the meantime, we are playing at frightening ourselves.

Poor Artificial Intelligence: little known and little liked

 I have the sad honor of working in AI, a domain that is frowned upon by many remarkable people. We are dealing with two main categories of opponents. I have already spoken of the first ones, such as Stephen Hawking, who think that AI is potentially very dangerous: the emergence of super-intelligent artificial beings will lead to the extinction of the human species. On the contrary, for others it is impossible to create artificial beings more intelligent than ourselves. A recent interview shows that Gérard Berry belongs to this last category.

Gérard Berry is a computer scientist who has made outstanding works, deservedly recognized by the scientific community: he is a member of the Académie des Sciences, professor at the Collège de France, and he has received the CNRS gold medal, France’s most prestigious scientific distinction, the second time it honored a computer scientist. For me, AI is not a part of computer science, but both are very close: computer is an essential tool for implementing our ideas. Therefore, when a top-level computer scientist criticizes AI, this must be taken very seriously.

AI occupies only a small part of this interview, but he is not mincing his words in these few sentences. Here are two excerpts:

“I was never disappointed by Artificial Intelligence because I did not even believe for a second in Artificial Intelligence. Never.”

“Fundamentally, man and computer are the most different opposites that exist. Man is slow, not particularly rigorous, and highly intuitive. Computer is super-fast, very rigorous, and an absolute ass-hole.”

Firstly, I will consider three points mentioned in this interview, where I agree with him:

Intuition is certainly an essential characteristic of human intelligence. In a recent blog, I have looked at how AI systems could also have this capacity.

Chess programs are a splendid realization, but their intelligence is mainly the intelligence of their developers. They used the combinatorial aspects of this game, enormous for human beings, but manageable by fast computers. It would have been much more interesting to speak of another outstanding IBM achievement: Watson, an intuitive system, won again the best human players for the game Jeopardy!

For the present time, AI byproducts are the most useful results of our discipline: they allowed to discover important concepts in computer science.

Let us talk about our disagreements. Up to now, nobody has proven that man could have an intellectual activity that no artificial being could ever have. As long as we are in this situation, we must assume that all our activities can be mechanized. Then, we are in a win-win situation. If we show that we are wrong, a significant progress would be made by a reduction ad absurdum argument. On the contrary, if we are right, we will create very helpful artificial beings, which will not be restrained by our intellectual limits.

The main argument that appears in this interview is that computers are absolutely stupid. I am way back in 1960! At that time, many people already wanted to show the impossibility of intelligent artificial beings. Their argument was: computers only do what they are told to do. Naturally, this is true, but this proves nothing: the problem is to write programs, and to gather data, so that the computer will behave intelligently. One can develop programs that do other things than running as fast as possible on their data. Programs can analyze the description of a problem, and then write and execute efficient programs well adapted to a particular problem and its data. Moreover, in a bootstrap, the existing system works with its author for creating an improvement of the system itself. Computer users strongly believe in the usefulness of bootstrap: without computers, it would have been impossible to design the current computers! Hawking’s extraordinary intuition had seen the efficiency of bootstrapping AI; this is why he was afraid of its future.

If one has never proven that man can do something that no computer could ever do, many things can be done by a computer while no human being could ever do them. For instance, our reflexive consciousness is very limited: most of the processes in our brain are unconscious. On the contrary, it is possible to realize a computer system that can observe any of its processes if it wants to; moreover, it can have access to all of its data. As consciousness is an essential part of intelligence, this will have tremendous consequences. Unfortunately, we are not yet able to make full use of this capacity because man’s consciousness is too limited for being a useful model.

AI is probably the only scientific domain with so many staunch opponents, although they do not know it. This is not surprising: man has always rejected what would remove him from the center of the world. We all know the difficulties encountered by those who said that the earth revolved around the sun, or that apes were our cousins. Naturally, the idea that artificial beings could be more intelligent than ourselves is particularly unpleasant for the most intelligent among us. Less intelligent people are used to live with beings more intelligent than themselves, but geniuses are not. Of course, some of them are strongly against AI.

I believe in the possible existence of artificial beings much more intelligent than ourselves. On the other hand, I am not sure that we are intelligent enough to achieve this goal. We need help, and for this reason, since many years, I am trying to bootstrap AI: as of now, my colleague CAIA, an Artificial Artificial Intelligence Researcher, gives me a substantial assistance for its own realization. AI is probably the most difficult task ever undertook by mankind: it is no wonder that progress is very slow.

55 years of Artificial Intelligence

It was in October 1960 that I started to work on my thesis. One year ago, I was appointed deputy chief of a computer department, which existed since almost ten years, in a military laboratory. It had an impressive documentation on computers, and it was a very supportive environment: the same year, the head of the department started working on automatic translation from Russian to French. I was thrilled by the first AI realizations, such as Newell and Simon’s Logic Theorist, Samuel‘s machine learning using the game of checkers, Gelertner‘s geometry-theorem proving machine, and so on.

For my thesis, I wanted to implement a general theorem prover that received as data any of the many propositional logic axiomatizations. It had to discover interesting theorems in each theory, without any information on its existing theorems.

The first difficulty to overcome was to find a thesis director: at that time, Artificial Intelligence, and even computer science, were not taught at Paris University. Luckily, Professor Jean Ville was interested in computers, although he essentially worked on probability, statistics, and economics. He was very kind to accept that I registered at the University for this thesis.

Looking at the results of the initial version of my program, I was surprised to see that it had discovered proofs different from those given in logic books. These original proofs showed me that it could be interesting to use meta-theorems, that is new ways for proving theorems. Therefore, I gave my program the possibility to prove meta-theorems, and the modified program found more results, and also proofs that were easier to understand. The results of this program were not bad; for a particular axiomatization, it proved results for which Lukasiewicz said: “One must be very expert in performing such proofs.” Results found for one of these axiomatizations can be found at page 134 of Laurière’s book (page 125 for the French version).

I was greatly impressed by these results: since then, I have always tried to realize systems that have the ability to work at the meta-level. This is a challenging task, since their results are compared with systems where the work at the meta-level has been made by their author, and not by the system itself. For the present time, the performances of a system working at the meta-level are not better than those found by other programs, but human intelligence is less important, they have a larger degree of autonomy. The primacy of men over animals mainly comes from our capacity to work at a meta-level, consciousness is one example of this superiority. I cannot believe that it is possible to create a superintelligence without this ability.

Moreover, this ability allows to bootstrap AI: existing AI systems will help us to implement more powerful systems. I believe that AI is the most difficult science, perhaps far too complex for human intelligence. The ideal would be to have an Artificial Artificial Intelligence Researcher; this is CAIA’s long-term goal.

Since 30 years, I am developing CAIA. At the moment, it has 13,000 rules that transform themselves into 500,000 lines of C; I have not written a single line of the present programs, and many rules have also be written by CAIA. I continue to replace the expertises created by myself, by meta-expertises that create the preceding expertises. The bootstrap will be completed when CAIA includes a set of meta-expertises that could generate all its expertises and meta-expertises, including themselves.

I am trying to create more and more declarative, and more and more general knowledge: I prefer to give CAIA meta-knowledge for creating expertise E rather than writing myself expertise E. It is difficult but, when I succeed, CAIA’s version is usually better than my initial version.

There is still a very long way before this bootstrap is completed. I have not 55 more years for completing it, but I hope that other researchers will pursue this tremendously difficult task.

Is it possible to define a problem?

 We do not pay enough attention to the definition of the problems given to our solvers. Firstly, by selecting a tailored definition to our system, we can give it an unreasonable help. More importantly, ambiguities in the definition may lead to a different problem although we believe that they are always identical.

This is often due to the ambiguities of natural language. However, we will see that different problems come from the same definition because some people add constraints that they feel evident; nevertheless, we have the right to reject these constraints.

The definition of Sudoku does not seem to pose difficulties, it is very straightforward: one must complete a 9×9 grid with numbers from 1 to 9 so that each column, each row, and each of the nine 3×3 subregions contain all the digits from 1 to 9. This definition appears in the magazines that publish these problems. Moreover, a few add another constraint: there is only one solution for this puzzle.

The question is whether one can use this uniqueness constraint for solving Sudoku puzzles. This constraint is respected in all the published puzzles: they always have exactly one solution. However, this can be considered as a constraint only for those who create new puzzles. The requirement of a unique solution exist for many problems; for instance, a chess problem is cooked if there are several solutions. Even then, very difficult problems, such as the magic cube, described in a preceding post, has millions of solutions; as each one is very hard to find, the problem is interesting.

If this constraint must be respected by the author of a Sudoku puzzle, does a human or machine solver have the right to use it? I have never seen a chess problem where the solution uses this uniqueness constraint. Some constraints are necessary for creating a beautiful problem; this does not mean that they can be used for solving it. However, the first question that needs to be asked is: can this constraint be useful for solving Sudoku puzzles?

Of course, knowing the number of solutions of a problem may be very helpful: as soon as all the solutions have been found, one can stop the process. Indeed, no more solutions can be found if the author has not made a mistake. It is very easy to implement this possibility. For instance, a parameter indicates to CAIA the maximum number of solutions that it can find, so that it avoids wasting a lot of time when there are millions of solutions. The default value for this parameter is 50, but one has just to instantiate it to 1: CAIA will stop as soon as it has found one solution.

It is less obvious to use this restriction even before finding a solution. I will illustrate this point in a problem published in the Figaro Magazine. Far too often, after a newspaper has published a Sudoku problem, it only gives the solution: it does not indicate the steps that were taken for finding this solution. Fortunately, the Figaro Magazine also gives an excellent description of the main steps; therefore, it is possible to see which rules have been used. Usually, there are only the constraints given with the definition of the problem, but not the one stating the uniqueness of the solution. However, a problem stated in the 17 July 2015 issue uses it. The following diagram describes the situation after finding the value of three squares. The lines are numbered 1 to 9 from the top.

It is easy to see that the possible values for B2 and C2 are 7 and 9. In the same way: 1, 7, and 9 are the possible values for B5 and C5. (5 cannot be in C5 because in column C, 5 can must be in C7 or C9, the only squares where 5 can be put in the lower left subregion). If 1 were neither in B5, nor in C5, each solution with B5=7 and C5=9 would give a symmetrical solution where 7 and 9 are switched in squares B2, C2, B5, and C5. If there are N solutions with B5=7 in this hypothesis, there would be also N solutions with B5=9. If K is the number of solutions when 1 is in B5 or C5, the total number of solutions would be K+2*N. As the uniqueness constraint states that it is equal to 1, this means that K=1, and N=0: there is no solution without 1 in B5 or C5. This is very useful, that means that D5 is not 1. As in the center subregion 1 is either in D5 or F6, we have F6=1, and it is easy to complete the Sudoku with this information.

                                      A            B         C          D          E          F         G          H          I

 CAIA does not make this deduction: it has not received the uniqueness constraint. For this problem, it finds the same possible values for squares B2, C2, B5, and C5; unfortunately, it cannot deduce the following steps of the Figaro solution. In its solution, it sees that the only values for E1 are 2 and 5. 2 leads to a contradiction, and it directly finds the solution with E1=5. Incidentally, this proves that the value of N is actually zero. However, contrary to the solution given in the Figaro, it has to backtrack once.

I understand that many people want to use the uniqueness constraint, although I do not agree with them for several reasons:

*Real problems may have none or many solutions, and it is interesting to solve them. Why are they giving a special status to artificial problems?

*Even when there is certainly one solution, why transform the search for a solution into a game of chance. The first to find the solution is the lucky one who considers firstly the good value for the unknowns.

*For other problems that have only one solution, as Chess problems, one never uses this uniqueness for finding this solution.

*In many cases, including the Figaro, the given Sudoku definition does not include this constraint.

*If the problem is cooked, nobody will see it.

*The author of the problem must check that it has just one solution: the uniqueness constraint is certainly necessary for creating a puzzle. Naturally, for checking it, he cannot use the fact that the problem has just one solution! Therefore, one must also have a solver that does not use this constraint. Is it necessary to have two solvers for the same problem?

All this shows that, even when the definition of a problem looks evident at first sight, it may be difficult to agree on this definition: the solvers cannot help but interpret this formulation. In this case, the solutions and the solving methods may not be the same as those given in the initial formulation: they do not solve the same problem. Therefore, if it is possible to define a problem, it is certainly not easy. Unfortunately, AI researchers have to give unambiguous definitions to their problem solvers.


 For the present time, AI systems are worse than ourselves for some applications, and I believed that the reason was that they have not an associative memory as good as ours; its efficient use is essential for our intuition. In a fraction of a second, it gives us an answer, or allows us to remove an ambiguity. We have already considered this problem in the preceding blog. AI systems rarely use a large corpus of English texts effectively, and don’t do it quickly.

Therefore, I was impressed by the results achieved by Watson, an AI system developed by an IBM team. It can play a Jeopardy! competition, a television game very popular in the States. Several competitors must find as fast as possible the person or the object corresponding to a clue, which is an English sentence in any domain. Watson must first understand the sentence, then find candidate answers in a huge amount of English texts, and finally choose the best match with the clue. All this has to be done in a short time, a few seconds.

Let us consider an example from the match that showed Watson’s superiority. For the clue It’s a poor workman who blames these.“, Watson was the first to find the good answer: tools.

The difficulty comes for three main factors: questions come from a broad domain, the answer must be given with high confidence, and it must be done very fast. Watson does not use Internet, but it has a great deal of English knowledge, including several encyclopedias, among them the full text of Wikipedia; the whole takes four terabytes of storage.

Watson competed against two top champions: Brad Rutter, a 20-time champion, and Ken Jennings, the best player in Jeopardy! history, famous for winning 74 games in a row. Watson won, and won big: $77.147, when the other two only won $21.600 and $24.000.

As Watson includes many modules, I will briefly speak of those that seem the most important for an AI point of view.

In the first step, Watson analyses the clue and extracts keywords and sentence fragments used for finding possible answers from the breadth of its knowledge. This mechanism is a little like our intuition: it is using many heuristics. Its main advantage is that it is very fast. Unfortunately, for us, this often leads to mistakes, because we often merely keep the first response, for lack of time, laziness, or unawareness of the knowledge leading to the result.

For Watson, this step usually gives several possible candidates. Then, far better than us, it will spend a lot of time (that means a few seconds for a computer!) for choosing the most reliable one. It ranks the candidates, and it chooses the first one, provided that it has a sufficient level of confidence. If no one is satisfactory enough, it does not answer.

Many methods enable it to measure its confidence in a particular result. If a result appears several times, Watson will be much more confident in this result. It also tries to evaluate the reliability of a candidate. If the clue is Chile shares its longest land border with this country, it is easy to remove China, which does not share borders with Chile. Two serious candidates are Argentina and Bolivia. The media often speak of the border dispute between Chile and Bolivia; this will tend to favor Bolivia. On the other hand, some results give the lengths of the borders. As these lengths are very different, and these results are rather reliable, Argentina will finally be chosen by the geographic module. However, even when a candidate has most of the wanted characteristics, it will not be chosen if one reliable result forbids it.

Watson may analyze several results and compare their reliability because it is aware of the information used for finding a particular answer. This is a huge advantage for artificial cognition, usually we do not know why our intuition gives a particular result.

Watson correctly answered several difficult clues, for instance clock for Even a broken one of these on your wall is right twice a day.“, and escalator for This clause in a union contract says that wages will rise or fall depending on a standard such as cost of living.

Nevertheless, as human beings, Watson sometimes makes mistakes. The audience had a good laugh when it answered Toronto to the following clue on US towns: Its largest airport is named for a World War II hero; its second largest for a World War II battle.

Realizing a system playing Jeopardy! has shown the possibility of using a huge memory efficiently. Now, the authors of this system want to adapt their methods to the resolution of another problem: medical diagnosis. In that domain, it is also important to give the right answer by reasoning over an unstructured natural language content; moreover, it will have much more useful consequences.

Cows are thirsty

 Several years ago, I was in the subway with one of my children. For keeping him entertained, I asked him a question: “what are the cows drinking?” Knowing me, he did not answer immediately: he sniffed out a trap, it was too easy to answer this question! A passenger, whom I did not know, was surprised by this total ignorance, and he could not refrain from saying: “They are drinking milk, my little child!” This mistake is natural: this is a hoax. It is due to a very useful mechanism, that often enables us to give a correct answer quickly. The question was on a beverage and on cows; as cows are associated with milk, and milk is a beverage, we have the answer immediately.

It is better to avoid systems giving incorrect results. However, if the result is that one gives no answer at all, this is not satisfactory. Many AI researchers have an excellent mathematical training. Therefore they overwhelmingly reject all possibilities of error: if one accepts a contradiction, one can prove everything. Luckily, human beings survive their contradictions.

Methods that can lead to erroneous results may be useful when we do not know a perfect method, or when it is too costly over time. It may be better to have a result quickly rather than waiting centuries for a perfect result. For instance, current Chess programs play very well, better than the world champion, but they certainly do not always play the best move. This was an issue for Edgar Poe, when he was asserting that Maelzel’s Chess Player was not an automaton. He was right: a human chess player was hidden into the machine. Nevertheless, one of his arguments was incorrect: he believed that it was not a machine, because sometimes it lost the game. Sorry, Edgar, a machine may be imperfect. However, we know a perfect method for playing the best chess move: one fully develops the tree of all the possible moves for each player. As the depth of the tree is finite, it can be done in a finite time, and one can find the perfect solution from this tree. Unfortunately, the time necessary for generating it goes beyond anything we can imagine.

Fast mechanisms are used by our unconscious; we call them intuition. Unfortunately, sometimes they are making mistakes. They are also useful for resolving ambiguities in natural language texts. It may be dangerous to use the results found by our intuition: we must cautiously check that they are correct when the consequences could be serious. To do that, our conscious mechanisms verify the results given by our intuition. However, calling them is not automatic: for each result, we must decide whether we will check it.

Intuition is an essential mechanism for human beings, and especially for animals: this is the only way enabling them to take decisions that are not genetically programmed. It always gives results quickly; moreover, we can use it in situations where we do not know other methods. As living beings, artificial beings will have to use similar methods. Nothing prevents us to give them such capabilities. Unfortunately, this is not so easy because we do not know them, since they are unconscious; we have to discover and experiment them in our systems. Another difficulty is that our love of correct methods will reject the bunch of heuristics that makes up intuition.

Fortunately, we have now a clearer idea of the mechanisms called ‘intuition’. For Herbert Simon: “The situation has provided a cue; this cue has given the expert access to information stored in memory, and the information provides the answer. Intuition is nothing more and nothing less than recognition.” I cannot see why AI systems could not use this mechanism.

The horror of error is dangerous: some research directions are neglected because they can lead to dubious results. This favors rigorous mathematical methods in AI, even when their application is limited because they are too time-consuming.

If artificial beings were never making mistakes, that would mean that they would be too restricted. They must not be refrained from using methods that can be very helpful when carefully used. However, we must also give them the possibility to find out when they are wrong: we have to add modules verifying the results. For instance, CAIA checks that all the solutions it finds satisfy all the constraints. Unfortunately, it is possible that it misses solutions, but we have already seen that, sometimes, great mathematicians also miss some of the solutions.

We will progress when our systems believe that cows drink milk, see that this is wrong, then find the right answer, as my embarrassed fellow traveler finally did.

When a Belgium canon meets a French mathematician

 Let us consider again magic squares, I want to show that it is possible to find solutions for very large squares, using methods completely different from the improved combinatorial method used by CAIA in the preceding post.

 In 1703, canon Poignard, from Brussels, found an algorithm that generates solutions for magic squares when N is odd. This method was improved by mathematician Philippe de la Hire, who issued in 1705 a communication to the Académie Royale des Sciences. This method is well known; it was already described in Diderot’s Encyclopedie, entry ‘quarrés magiques’.

 For describing this method, I will consider N=5. Let us choose a permutation of the numbers from 1 to N, for instance: 2,3,5,1,4. We write it on the first line of the square. For the second line, we put this permutation shifted by P elements; P is a number less than N such that P and P-1 must be relatively prime numbers with N. This implies that the method cannot be used for even squares: as either P or P-1 is even, N cannot be even. If one chooses P=3, the second line will be 5,1,4,2,3. The process continues, and we get the square just down left.

In the same way, we consider a permutation of the integers from 0 to N-1, multiplied by N, for instance: 5,20,0,15,10 which is written on the first line of another square. This new square is completed as the preceding one, with a shift Q, which must also be different from P. With Q=4, we generate the square on the right above.

 Both squares have exactly once each number on each line, each column, and each diagonal; moreover, the sums on these lines are the same, since we have the same permutation. Naturally, there is no bijection, since there are only N different numbers for the N2 little squares of each square.

We create the magic square by adding these two squares. It has always all the preceding characteristics, and each value between 1 and N2 is present exactly once.

 An interesting remark: if opposite sides of the square are considered as next to one another, all the diagonals have also the common value for their sum. We do not generate only magic squares, but a kind of ‘hyper-magic’ square that has 4N constraints, including N diagonals to the right and N diagonals to the left. The new problem is much more constrained than the original one. This explains why there is no hyper-magic square for N=3, where one cannot find a couple P and Q.

This way of generating magic squares is extraordinary: we do not need to write the constraints, the way the solution is created automatically satisfies them. It is no more necessary to find the value of SUM. It is also unnecessary to backtrack.

 Without using a computer, I found a magic square for N=11 in 16 minutes. Furthermore, I wrote a 25 lines C program that generates such squares. In only 333 seconds, it generated (but not stored) a solution for N=100,001. It had to find a value for 10,000,200,001 little squares! It is unthinkable to solve such a problem with the methods used by CAIA, and many other constraint satisfaction systems: there are 200,002 constraints, each one has 100,002 unknowns, and the number of possible values for each unknown is greater than ten billion. Studying the formulation of this problem, canon Poignard transformed a constraint satisfaction problem into an algorithm building a large permutation from simpler permutations.

 CAIA’s problem solver is very far from such performances. This does not mean that it is an impossible task for an AI system, but that our systems must have a greater capacity for working at the meta level, studying and transforming the formulation of the problem: they must be able to do what canon Poignard did more than 300 years ago.

CAIA is not a good magician

 Too often, IA researchers describe the successes of their system, but they keep a lower profile when they get poor outcomes. I do not want to comply to this bad habit. I describe here CAIA’s behavior when success does not come, in the present case, the magic square problem. An order N Magic Square has N horizontal lines, and N vertical lines. Therefore, there are N2 squares, and each one contains a number between 1 and N2, so that each number appears only once. Moreover, the sum of the verticals, the horizontals, and the two diagonals, must have the same unknown value, SUM.

First, CAIA finds symmetries, which are permutations of the unknowns which lead to an identical set of constraints. Seven symmetries are evident for all magic squares: three rotations and four reflections with respect to horizontal, vertical, and both diagonals. CAIA knows nothing about geometry, but it can find all symmetries, both geometric and non-geometric. To do that, it uses the general component that solves constraint satisfaction problems. I had only to specify one particular problem: finding all the permutations of the unknowns leading to the initial set of constraints.

I hoped that it would find the seven geometrical symmetries. However, CAIA was far beyond any of my expectations: it found a huge number of symmetries that I never imagined existed. Even for N=4, it finds 31 symmetries! Some are completely unexpected, for instance, one of them transforms the first horizontal into the second vertical: both contain the same values, naturally not in the same order.


Finding this is not evident: almost 400 years ago, Bernard Frénicle de Bessy performed an outstanding study of the N=4 magic square. In particular, he defined a standard representation of the solutions, which is always used. However, he has found 880 basic solutions; multiplied by 8, this gives a total of 7040 solutions. CAIA found only 220 basic solutions; multiplied by 32, we find again 7040. When N grows, the number of symmetries increases at a startling rate: for N=10, CAIA finds 15,359 symmetries! On this point, CAIA’s behavior is flawless, although a lot of computer time is needed: 31,193 seconds for all the symmetries when N=10.

In the following step, CAIA tries to find other constraints. For the magic problem, it adds that the sum of the unknowns is equal to the sum of the first N2 integers, since there is a bijection. After that, it can find the value of SUM. For us, humans, this is easy because we are seeing the square: it is clear that the sum of the N horizontal lines is N*SUM. It is also evident that this sum contains all the squares, so its value is the sum of all the numbers between 1 et N2, that is N2*(N2+1)/2. Therefore, we have SUM=N*(N2+1)/2.

I did not give CAIA this formula, it has to find the value of SUM from the set of constraints. For this task, it is rather good: for N=16, it finds SUM=2056, after many substitutions in the 34 initial constraints, plus the bijection constraint. It would have been better to prove the general formula, but it is difficult for a blind system, which cannot see the square as we do. It does not find other constraints, and I do not believe that constraints with less than N unknowns exist for this problem. In total, for this task, CAIA has done well overall, although it could not set out the general formula.

The last part is to find the solutions, and this is the point where CAIA fails: it has been designed for finding all the solutions of a problem. For this reason, it is very interesting to find symmetries. However, it makes no sense to find all the solutions when there are billions of them. In its normal working, it considers the formulation of the problem, and particularly it finds new constraints, easier to use than the initial ones. In some cases, (as for N=3), it directly finds the solution. When there is no new constraint, it is often better to stop trying to be clever, but to write cleverly a combinatorial program that generates all the solutions.

For a magic square, this works well for the small values of N: for N=8, it generates the first 1,000 solutions in 4 seconds. Unfortunately, the search space dramatically increases with N: the number of unknowns grows as N2, and the number of possible values for each unknown also grows as N2. Very soon CAIA is completely overcome. The method still works for N=9: 10 seconds for the first 1,000 solutions. Unfortunately, there is a sharp discontinuity for N=10: after 14 hours, it has not yet found a solution.

The reason for this failure is that it is unthinkable that, given the huge number of solutions, one can ever find all of them when N is not small. It is sufficient to find one of them. We could think that all we have to do is to launch the search for solutions, and stop when one is found. In reality, it is often better to use other methods for scanning the search space, pinpointing the areas where the probability of finding a solution is high. For instance, one can find solutions for the set of equations when the unknowns may have continuous values. Then, one can look for the combinations of integer values close to the values of these solutions.

Another heuristic may be used, such as to make the initial problem more constrained, adding new constraints so that it would be easier to find solutions for this new problem. If it has solutions, they would also be solutions of the initial problem. When there are zillions of solutions, we can add constraints that remove some of them. In a future post, we will see how this could be very efficient for finding solutions for some magic square problems.

To sum up, although a problem may have many solutions, CAIA experiences serious difficulties when all its constrains has many unknowns, each unknown having many possible values; this is normal, the search space is huge. In this situation, it can, nonetheless, have excellent results when it can generate new constraints with few unknowns; this happened with the magic cube, where it found constraints with only three unknowns. If this is impossible, it must use completely different methods, good for finding a needle in a huge haystack. CAIA improves the combinatorial search with a meta-combinatorial search generating new constraints; this is sometimes not enough for solving this kind of problem, some modules must be added to CAIA.

The Imitation Game, a film on Alan Turing

The main actor of a new film, The Imitation Game, is Alan Turing. This title is misleading: we could think that this film was based on the Turing test, although it is only referred in passing. In reality, this film is about Turings part, during ww2, for decrypting Enigma messages. This film could be criticized for increasing the importance of Turing in breaking this unbreakable cipher. His part was essential, but other participants were also essential, and we can find little mention of them. Turing is almost absent in Robert Harris’ book on the same subject. Be that as it may, seeing this film is enjoyable.

This film is based on one of the three important results achieved by Turing: his role in breaking Enigma. The title reminds us of the second result, the Turing test. Here, I want to speak of the third result: the Universal Turing Machine.It is only implicitly mentioned in the film: Turings commanding officer mentions that the title of one of his papers contains a word impossible to pronounce. Without any doubt, he refers to the word Entscheidungsproblem“, which means “decision problem”. It appears in the title of his thesis, where he describes his machine. This word comes from the list of 23 mathematical problems published in German by Hilbert in 1900: it was the tenth one.

In his thesis, Turing states that his machine is limited: many problems are undecidable. This means that no general program can always find a solution for these problems. For instance, it cannot determine for any program whether it will ever halt, or print the number ’0′.

An interesting question is to determine whether this machine is as powerful as our modern computers. I am not interested in its efficiency: Turing Machine lacks many useful components, which can be simulated at the cost of a lot of computer time. For this theoretical machine, we are not interested in computer time, only in what it can do, no matter the elapsed time. It seems that Turing Machine is not more powerful than our computers: one can simulate its operations on a computer. However, although the tape used for storing symbols is finite, Turing assumes that his machine is always supplied with as much tape as it needs. Evidently, this is not true for our computers, but, given the size of their present memories, it is not very restrictive.

It is not so easy to answer the opposite question: are our computers more powerful than Turing Machines? As said before, we do not consider their efficiency, only whether a computer can perform some task impossible for a Turing Machine. I have found two of them, the first task being the computation of sequences of random numbers.

Random numbers are useful for many applications. They often allow to define a progressive scan of the search space effectively and efficiently. For instance, CAIA uses them for creating new problems. In these cases, it is not necessary to have true random numbers, pseudo-random numbers are adequate. They are generated by an algorithm, which gives a sequence of numbers with the wanted dispersion properties.

However, for few applications, in games or cryptography, genuine random numbers are needed. This happens when a system has an opponent who must not be able to predict its future behavior. In this case, pseudo-random numbers may be hazardous: the opponent will foresee all the decisions taken by the machine if he knows the algorithm generating these numbers. Turing Machines can generate pseudo-random numbers, but they are unable to generate true random numbers. Note that human beings are not very good when they have to generate random numbers.

Several researchers, around the year 1960, suggested to add an instruction that generated a random number each time it was executed. The idea was to use physical phenomena. This was not new, Mozart used dices for the random numbers, necessary for his minuet composition method. For the computers, they wanted to use the thermal noise of a vacuum tube. I do not remember if this idea was implemented, as random numbers were sufficient for most applications. Moreover, it would be very difficult to debug such programs. For the present time, RdRand generates random numbers, and it is supported by some CPU. Therefore, it is possible to have computers with true random generators, and Turing Machines cannot do it.

The existence of this possibility does not destroy Turing’s results: with more unpredictable computers, solving the decision problem becomes evermore difficult.

In summary, our computers can perform a task that Turing Machines cannot do. Programs using true random numbers have an interesting property: even if someone has such a program, and its data, he cannot simulate its future behavior. This can be very useful in all the situations where an opponent would like to foresee the forthcoming actions.

CAIA surprised me Part III The results

 After finding symmetries, CAIA tries to find solutions to the magic cube problem, applying rules that enrich its formulation.

One of these rules states that, when there is a bijection between a set of unknowns and a set of values, the sum of the unknowns equals the sum of the values. Therefore, CAIA adds the constraint:


When it substitutes the first three constraints in the preceding one, it sees that 3*VAL=378, so:


Using this value, and after many substitutions in the constraints, it finds several constraints with four unknowns, such as:


Later, it discovers a constraint with three unknowns:

 F(5)+F(14)+F(23) = 42

Using the symmetries, it immediately adds two other constraints with three unknowns:

F(11)+F(14)+F(17) = 42

F(13)+F(14)+F(15) = 42

All the other symmetries lead to one of the three preceding constraints.

Finally, it happens that it has no more rule to apply. CAIA can choose between two possibilities:

1. It chooses an unknown, and it examines what happens when it substitutes it by one of its possible value. It carries on this task, instantiating an unknown, then generating new constraints, until it finds a contradiction or a solution. In both cases, it backtracks, instantiating the last unknown by the following possible value. CAIA is not very fast: it finds a solution every three seconds, taking into account that a new solution immediately leads to 47 other solutions, using the symmetries.

2. It writes a combinatorial program, cleverly using the constraints that it has found. This C program has 280 lines. CAIA compiles it, and executes it. Then, it finds 130 solutions every second.

In both cases, it chooses to begin backtracking with F(14), the small cube in the center of the large one. There are many solutions for each of the 27 possible values for F(14).

Here is one of these solutions:


 Its success is mainly due to its analysis of the problem formulation: it finds symmetries, and collects useful new constraints, particularly the three constraints with three unknowns. I did not know them myself, I had only found constraints with four unknowns. Naturally, when one knows that they exist, it is easy to prove them.

A human researcher, who would specifically solve this problem, could realize a more efficient system. However, the intelligence would be in his head, not in his system.

 I wanted to compare CAIA with other problem solvers. Francis Sourd kindly gave this problem to two of the best general solvers for problems defined by constraints. ILOG Solver did not find a solution in a reasonable time, while ILOG CEPLEX found the first one after 14 hours. Incidentally, it shows that CEPLEX is excellent: the search space is enormous when it is impossible to manipulate the formulation of the problem. Therefore, it does not know the value of VAL, and all of its constraints have 10 unknowns.

CAIA solved this problem in a way that I did not know. As Pégoud with his plane, we must let our systems fend for themselves!

 I doubt that, at the moment, there exists a general system that could approach the performances of CAIA for this problem, when it is placed in the same conditions: knowing only to find a bijection satisfying the 15 constraints given in part I. If I am wrong, I would be very interested in knowing more on this system.