# A mathematician is more than a theorem prover

I have always believed that AI systems will become excellent mathematicians. Until now, AI has focused its efforts on theorem proving, the most visible part of mathematical activity. However, a mathematician does many things besides proving theorems; one of them is making conjectures, which he, she, or it then usually tries to prove. Not always, though: the main activity of EURISKO, developed by Douglas Lenat, was to make conjectures, which it did not prove. Ramanujan, a mathematician of genius, worked in an analogous manner: for many years after his death, distinguished mathematicians were still trying to prove some of his conjectures. Several conjectures are famous, such as Fermat’s last theorem, which was proved more than three centuries after it was stated. Goldbach suggested another well-known conjecture: any even integer greater than 2 can be written as the sum of two primes. Some conjectures, such as the Riemann hypothesis, have played a great role in the history of mathematics: entire theories have been developed in order to prove them.

In the February 2016 issue of Artificial Intelligence, C. Larson and N. Van Cleemput describe a system able to make conjectures in many mathematical domains. This system is not tied to a particular theory. Their work is based on a heuristic found by S. Fajtlowicz, who has experimented with it since 1986. I did not know of it, and this is not surprising: little was published about it, and before this paper it had never been referenced in Artificial Intelligence.

This heuristic may be used in many domains, but not in all: the theory must include objects that have real-number invariants, that is, numbers attached to the objects. For a graph, the invariants may be the number of vertices, the number of edges, the maximum degree of any vertex, etc. For a natural number, there are the number of divisors, the number of distinct primes in its factorization, the number of its representations as a sum of two primes, etc.

The system stores some objects of the theory under study; their number may be very small, for instance fewer than 10. For each object, it can compute the value of every invariant; therefore, it can check whether a new conjecture is true for all the given objects. The user indicates the name of an invariant I for which he would like to conjecture a lower or upper bound in terms of the other known invariants I1, I2, I3, …, In. The system knows a set of functions that can be applied to numbers, such as addition, product, logarithm, etc. It will generate many conjectures such as C1: I>I1+I2, C2: I<=I1*I3, C3: I<(I2-I4)², C4: I>=log(I5+3), and so on. The user may suggest other functions, especially those known to be useful in the theory at hand. In the different domains considered in the paper, the system uses functions such as the Frobenius norm, the Euler phi function, the Mertens function, and the next-prime function. This is the only point at which the system is adapted to a particular domain.
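To make this concrete, here is a minimal Python sketch of the generate-and-filter loop (my own illustration, not the authors' Sage code): the invariants, the candidate bounds, and their names are invented for the example, and the target invariant is the number of representations of an even number as a sum of two primes.

```python
def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def goldbach_count(n):
    """Target invariant I: representations of even n as p + q with primes p <= q."""
    return sum(1 for p in range(2, n // 2 + 1) if is_prime(p) and is_prime(n - p))

def num_divisors(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)

# Candidate lower bounds for I, built from other invariants and functions.
candidates = {
    "C_a: binary_digits - num_divisors":
        lambda n: n.bit_length() - num_divisors(n),
    "C_b: decimal_digits - 1":
        lambda n: len(str(n)) - 1,
    "C_c: num_divisors":          # too strong: will fail the truth test
        num_divisors,
}

# The stored objects may be very few; here, just two even numbers.
objects = [16, 24]

# Truth test: keep a candidate only if it bounds I for every stored object.
accepted = {name for name, f in candidates.items()
            if all(goldbach_count(n) >= f(n) for n in objects)}
print(sorted(accepted))   # C_a and C_b survive; C_c is rejected
```

A real system would generate the candidate expressions automatically from its library of functions instead of listing them by hand, but the filtering step is the same.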

When it has found a new conjecture, the system accepts it if it passes two tests:

Truth test: the conjecture is true for all the stored objects.

Significance test: for at least one of the stored objects, the conjecture gives a stronger result than all the other known conjectures for the same object.

To clarify the interest of the second test, let us assume that we are looking for conjectures about the invariant G(n) for even n, where G(n) is the number of representations of n as a sum of two primes. At the start, only two objects are stored: 16 and 24. G(16)=2 (3+13 and 5+11) while G(24)=3 (5+19, 7+17, and 11+13). Conjecture C1 is: G(n) is greater than or equal to the number of digits in the binary representation of n minus the number of its non-trivial divisors (those other than 1 and n). It is true for both objects: G(16)>=2 (5-3) and G(24)>=-1 (5-6). Now, conjecture C2 is: G(n) is greater than or equal to the number of digits of its decimal representation minus 1; this is also true, since G(16)>=1 and G(24)>=1. However, C1 is stronger for 16, since it indicates that G(16)>=2 instead of the 1 given by C2; conversely, C2 is stronger for 24, where it predicts at least 1 instead of -1. Therefore, so far, one keeps both conjectures.
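Both tests on this example can be verified mechanically. Below is a small Python sketch (my own reconstruction, not the authors' code); note that the arithmetic 5-3 and 5-6 only works out if "divisors" in C1 is read as the divisors other than 1 and n, which is what the code uses.

```python
def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def G(n):
    """Number of representations of even n as a sum of two primes p <= q."""
    return sum(1 for p in range(2, n // 2 + 1) if is_prime(p) and is_prime(n - p))

def C1(n):
    """Lower bound: binary digits of n minus its divisors other than 1 and n."""
    nontrivial = sum(1 for d in range(2, n) if n % d == 0)
    return n.bit_length() - nontrivial

def C2(n):
    """Lower bound: decimal digits of n minus 1."""
    return len(str(n)) - 1

objects = [16, 24]
for n in objects:
    print(n, G(n), C1(n), C2(n))   # 16 2 2 1  /  24 3 -1 1

# Truth test: both conjectured bounds hold for every stored object.
assert all(G(n) >= C1(n) and G(n) >= C2(n) for n in objects)

# Significance test: each conjecture is the stronger one for some object,
# so the system keeps both.
assert C1(16) > C2(16)   # C1 gives G(16) >= 2, C2 only G(16) >= 1
assert C2(24) > C1(24)   # C2 gives G(24) >= 1, C1 only G(24) >= -1
```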

Naturally, finding a conjecture is not enough: one must prove it, and the conjecture must also be useful. For proving Goldbach’s conjecture (CG), that is, G(n)>0 for even n, one must find a new conjecture, provable as a theorem, whose bound on G(n) is always at least 1. Even if C1 is true, it is not sufficient, since it only gives G(24)>=-1. Therefore, it is certainly not a useful conjecture for proving CG, although it could give (if true) an interesting lower bound for some values of n. We are luckier with C2: its bound is positive for both stored objects, and it has also been checked for many even numbers, but it has not yet been proved that the number of representations of n as a sum of two primes is greater than or equal to the number of digits of n minus one. I give here only two very simple conjectures; the system found more of them, and checked them with many more than two stored objects. Perhaps, one day, it will find a conjecture predicting G(n)>0 that can be proved; then CG will also be proved.

The initial step in the search for a useful conjecture can be made with very few objects; then the system checks carefully that there is no trivial exception. For instance, after being discovered with a few objects, conjecture C2 was checked for n up to 1,000,000. After finding a useful and reasonably credible conjecture, a human or artificial mathematician has to prove it. Unfortunately, it may be impossible, even for a very clever artificial mathematician, to prove a conjecture: no proof may exist even though the conjecture is true.

Anyway, I am impressed by the results of this system, particularly by conjecture C2. Firstly, it gives new ideas for proving CG; although C2 is much stronger than CG, it could open new pathways. Moreover, if it were proved, it would be a satisfying (although not very useful) result. For instance, C2 immediately states that:

G(1234567899876543210)>=18

which means that there exist at least 18 representations of this huge number as a sum of two primes. With CG, we would only know that there is at least one representation. If I were a mathematician, I would be very proud if I could prove C2!

We can experiment with this system, available at nvcleemp.github.io/conjecturing. The program is implemented in the Sage open-source mathematical computing environment, which has many built-in invariants for various mathematical objects. Therefore, it is easy to check on many instances whether a promising conjecture is likely to be true.

This work does not solve all the problems linked to conjectures, but it shows that it is possible to add a new brick to what will someday be an artificial mathematician.

# Algorithms to Live By

This book, published in 2016, was written by a science journalist, Brian Christian, and a specialist in cognitive psychology, Tom Griffiths; its subtitle is The Computer Science of Human Decisions. AI researchers have always been very interested in human behavior: when we develop a system that solves problems also solved by human beings, we try to find what methods humans use, and we include them in our programs. Here, the authors have the opposite goal: they show that we humans can learn from the way computer systems solve problems, even for problems of our everyday life. They are not at all focused on AI: for them, any computer system may be interesting.

So far, I had not particularly thought about this issue, but it is indeed an interesting idea. The authors use many real-life examples of everyday situations, such as getting the best apartment, searching for a parking place, finding a restaurant, and so on. Some come from the authors' personal experience.

I enjoyed their practical approach, but some academics may not like it, preferring to be abstracted from the real world. In the October 2016 issue of the AI Journal, Ernest Davis published an interesting review of this book; however, he does not like this way of presenting ideas: «After reading about how Michael Trick used an algorithm to choose a wife, and how Albert McLay organizes breakfast for her children, and […], I started to have a craving for impersonal technical papers written in the passive voice.» For my part, I like this way of choosing situations that are easy to understand, without the need for cryptic formalism.

I cannot consider all the topics examined in this book: when to stop looking, forget about it, when to think less, when to leave it to chance, the minds of others, and so on. The authors show that we could be more efficient if we borrowed some of the methods used in computer systems.

For instance, in the chapter “Sorting: Making Order”, they consider a problem that concerns all sports people: how to rank the best athletes. In tennis, the loser of a match is eliminated from the tournament. This identifies the best player at the end, but produces little information about the strength of the others: the second-best player was perhaps eliminated by the winner in the opening round. On grounds of efficiency, a race is much better than a fight: it gives a measure of performance for all the participants. It is no longer necessary to compete against many individual opponents: at the end of a marathon, every runner knows his position relative to all the others.
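The weakness of single elimination is easy to demonstrate with a tiny simulation (my own toy example, not from the book): if the second-strongest player meets the strongest in the opening round, the runner-up of the tournament is not the second-best player, whereas a race amounts to a full sort that ranks everyone at once.

```python
def knockout(bracket, beats):
    """Play a single-elimination bracket; return (champion, runner_up)."""
    rnd = list(bracket)
    while len(rnd) > 2:
        rnd = [a if beats(a, b) else b for a, b in zip(rnd[::2], rnd[1::2])]
    a, b = rnd
    return (a, b) if beats(a, b) else (b, a)

# Strength is the number itself, and the stronger player always wins.
beats = lambda a, b: a > b

# Players 8 and 7 are the two strongest, but they meet in round one:
champion, runner_up = knockout([8, 7, 1, 2, 3, 4, 5, 6], beats)
print(champion, runner_up)   # 8 6 -- player 7 never reaches the final

# The "race": one measurement per participant ranks everybody at once.
print(sorted([8, 7, 1, 2, 3, 4, 5, 6], reverse=True))
```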

Unfortunately, it is not always possible to eliminate fights. The solution chosen by mammals for determining the alpha male is not very satisfactory, because it often leads to too many fights. For their part, fish use a simpler method: the dominant one is the biggest. With computers, sorting and ordering are done much more quickly than fifty years ago; there are no longer unnecessary operations.

Tennis is not a perfect example: we are not so much interested in knowing who the best players are; we want many beautiful contests. If the ranking method is inefficient, we get to watch more tennis matches! 150 years ago, Lewis Carroll showed that it was necessary to improve this method. Since then, there has been some progress, particularly with the introduction of seeding; however, improvements are still possible. Personally, I think that one could use the difference in playing strength shown during a match: the result was tight, one player was severely beaten, etc. A computer scientist knows that a result can be improved when important information is not being used.

Apart from providing advice on how to organize ourselves better, this book also explains our behavior by reference to how computers work. For instance, it is well known that our memory deteriorates as we grow older. Wagging tongues discreetly hint at Alzheimer's but, in the chapter on caching, the authors explain that we have the same difficulties as computers in handling a lot of information. The more elements there are in a memory, the longer it takes to retrieve what one is looking for: they call this “the tyranny of experience”. A ten-year-old child knows a few dozen friends; twenty years later, we know hundreds of people: from the towns where we spent some time, the companies where we have worked, the friends of our spouse, not to mention Facebook. Finding something can take more time if we stored it in an almost inaccessible area; we may even have decided to forget it in order to make room. When they have accumulated a substantial amount of information, old people and computers have the same problem: how to store it so that what will probably be useful is quickly and easily accessible. Unfortunately, for everything else, one has to wait or to fail.

The authors essentially refer to computer science, although they mention “machine learning”, an AI subdomain, several times. In my view, chapters inspired by AI could be added. Certainly, some AI methods cannot be used by human beings, mainly because our brain does not work fast enough: we cannot consider billions of Chess positions! However, in some domains, AI may be a source of inspiration, especially problem solvers that use heuristics.

Too often, teachers do not explain to their students how they could solve a problem, and many students believe that there exists a special math skill (which they lack) that is a prerequisite for solving mathematical problems. Let us take an example from Gelernter's Geometry Theorem Proving Machine: it must prove that an isosceles triangle (one with two equal sides) has two equal angles. Here is a machine proof:

It is assumed that triangle ABC has two equal sides: AB=AC. Triangles ABC and ACB are congruent because their three sides are pairwise equal: AB=AC, AC=AB, and they share BC. Therefore, the angle ABC of the first is equal to the corresponding angle ACB of the second.

If I had to write a geometry program, I would have introduced a rule such as: if one must prove that two angles are equal, consider two triangles, each containing one of these angles, then prove that these triangles are congruent. This does not always succeed; in the present situation, five triangles containing the angle ACB can be compared to triangle ABC: ACB, BAC, BCA, CAB, and CBA. The last four do not work, but the first one leads to the solution given above. With this rule, the student can understand how the solution was found, and he can use it for other problems.
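This trial-and-error rule is easy to mechanize. The sketch below (my own illustration, not Gelernter's program) encodes the hypothesis AB=AC symbolically and applies the side-side-side test to each of the five candidate correspondences:

```python
from itertools import permutations

# Sides of the isosceles triangle: AB = AC = "s", BC = "t" (with s != t).
side = {frozenset("AB"): "s", frozenset("AC"): "s", frozenset("BC"): "t"}

def sss_match(corr):
    """Does the correspondence ABC -> corr map every side of ABC onto a
    side of equal length (the side-side-side congruence test)?"""
    m = dict(zip("ABC", corr))
    return all(side[frozenset(pair)] == side[frozenset(m[pair[0]] + m[pair[1]])]
               for pair in ("AB", "BC", "CA"))

# Try the five correspondences other than the identity.
working = ["".join(p) for p in permutations("ABC")
           if "".join(p) != "ABC" and sss_match("".join(p))]
print(working)   # only ABC -> ACB works, giving angle ABC = angle ACB
```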

When an AI system makes many attempts before finding the solution, it can indicate why it had to consider them. Even the failed attempts are interesting to show because, in other situations, they may succeed. If students could see how an AI system found a solution, including its unsuccessful attempts, this would certainly allow them to perform more effectively afterwards.

This book is a most enjoyable read, but it could be complemented by chapters adding Artificial Intelligence to Computer Science as a source of inspiration.

# Bootstrapping CAIA Part II: Choosing a new domain

There are many ways to increase the range of problems that CAIA can solve. For instance, one can introduce continuous variables, or define unknowns with an infinite number of possible values. It would also be interesting to consider meta-problems, which can be useful to CAIA while it is solving a problem, for instance for monitoring the search for solutions. Meta-problems are particularly useful in a bootstrap; however, when I made my last choice, I was afraid that it was too early to include them: I did not have enough experience with monitoring the search for a solution. For this reason, I chose to consider problems where the unknowns may have an infinite number of possible values.

Firstly, this provides a large number of problems, mainly in arithmetic. More importantly, it necessitates finding new methods, different from those used for solving constraint satisfaction problems. Indeed, for constraint satisfaction problems, the combinatorial method always gives all the solutions; it may be practically unusable, especially when the problem is intractable, but it is a starting point that can be improved.

Naturally, when an unknown may have an infinite number of possible values, one can no longer use the combinatorial method. Other methods must be used, some of them similar to theorem proving. This does not prevent the system from developing a tree; for instance, one can consider several possibilities for an unknown: x<0, x=0, and x>0, or again x even and x odd. For many problems, looking at all the possible values, even very cleverly, is not the best method; CAIA is obliged to experiment with other approaches in this new domain.

When the system stops, the result for such a problem may be one of the following:

1. A proof that there is no solution. Example: find two integers a and b such that 4a+3b²=34.

2. The discovery of all the solutions. Example: find three integers a, b, and c, each greater than a given integer k, such that the three remainders of the division of a.b by c, of b.c by a, and of c.a by b are all equal to k. For instance, if k=12, one solution is a=293892, b=1884, and c=156; in all, there are 792 solutions.

3. The discovery of one or several families describing an infinite number of solutions, which together include all the possible solutions. Example: find two integers a and b such that 4a+3b²=36 (for any integer value of x, a=9-3x², b=2x is a solution).

4. A proof that there is a solution for every combination of the possible values of the unknowns. Example: find n such that 2006 divides 2005ⁿ-1887ⁿ-1954ⁿ+1836ⁿ; this constraint is true for any positive integer value of n.

5. No solution, or a finite, or an infinite number of solutions have been found, but it has not been proven that there are no others. This happens when the system is stuck at at least one leaf of the tree.
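These sample results can be spot-checked in a few lines of Python (my own checks, not CAIA's proofs; a finite search does not prove non-existence, but the mod-4 argument in the first check does):

```python
# 1. 4a + 3b^2 = 34 has no integer solutions: squares are 0 or 1 mod 4,
#    so 3b^2 is 0 or 3 mod 4, while 34 - 4a is always 2 mod 4.
assert all((34 - 3 * b * b) % 4 != 0 for b in range(-100, 101))

# 2. The quoted solution of the remainder problem for k = 12.
a, b, c, k = 293892, 1884, 156, 12
assert (a * b) % c == k and (b * c) % a == k and (c * a) % b == k

# 3. The family a = 9 - 3x^2, b = 2x solves 4a + 3b^2 = 36 for every x.
assert all(4 * (9 - 3 * x * x) + 3 * (2 * x) ** 2 == 36 for x in range(-50, 51))

# 4. 2006 divides 2005^n - 1887^n - 1954^n + 1836^n for n = 1..50
#    (true for all n: it can be checked modulo the prime factors 2, 17, 59).
assert all((2005**n - 1887**n - 1954**n + 1836**n) % 2006 == 0
           for n in range(1, 51))

print("all checks pass")
```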

For combinatorial search in constraint satisfaction problems, one sometimes does not obtain all the results because the resolution would require too much time, especially when the problem is intractable. Here, there is a new possibility: one does not see what could be done; all the known methods have been tried without success. In the tree, the possible values for a leaf are thus not only solution, contradiction, or not enough time, but also: one does not know what to do.

When a system results from a bootstrap, too often it fails to find some solutions, although it believes that it has completed its task. When CAIA finds a solution, that solution is always correct; however, as the search is not as systematic as when the sets of possible values are finite, it sometimes erroneously concludes that there is a contradiction, or wrongly eliminates a possible value for an unknown. When a system builds its solution-finding methods itself, a misguided modification in the knowledge that creates these methods can improve its performance in some situations, but may introduce particularly vicious mistakes in parts that previously gave excellent results. I have experimented with systems where I gave the knowledge, and with others where I also gave meta-knowledge that creates knowledge. Undeniably, the second approach is tremendously more difficult to develop. One reason is that, if AI systems are brittle, systems that result from a bootstrap are much more brittle. In another post, we have seen the problems coming from meta-bugs.

Naturally, to proceed further, it will be necessary to introduce new kinds of problems, until CAIA can solve any problem. For now, I do not want to consider other families of basic problems, such as games or various mathematical theories. I prefer to add families of meta-problems that will be useful at the meta-level, where one has to build new methods for solving problems, or to monitor the search for solutions efficiently. This is far more important for moving quickly forward in this bootstrap, and particularly for getting a less brittle system: experience has shown that, when CAIA does something for me, it does it better than I would.

# Bootstrapping CAIA Part I: The initial domain

I can’t just say “one must bootstrap AI”; how to do it must also be explained. Naturally, I will use my own experience in CAIA’s development, which started in 1985.

I shall go quickly through the first step, where I defined a language and the knowledge for translating it into C programs. This is well known to those who write a compiler in the language that it compiles. It was interesting to define a new language rather than use an existing one, for two reasons:

1. This language must change over time, so that it becomes more and more declarative. At the beginning, in order to facilitate the compilation, it had many procedural aspects. These are gradually being replaced by more declarative constructs. Declarativity is essential because it is easier for the system to create declarative rather than procedural knowledge; it is also easier for it to study its own knowledge when that knowledge is given in a declarative formalism. Very important elements in this language are sets and bags, which, unlike lists, do not impose an order. Expertises are sets of rules, and rules have sets of clauses.

2. It is important that CAIA and I can thoroughly examine any module of CAIA while it is executed. Unrestricted access by CAIA to its present state and its knowledge is essential if we want to give it a kind of consciousness. It is easier to have something that suits us when the system is specially built for this purpose. Black boxes are great foes of intelligence: they restrict consciousness, since one does not know what happens when one executes them. There remain two main black boxes for CAIA: the operating system and the C compiler.

Using this knowledge, CAIA translates all its knowledge into C, whether it is given in the initial formalism or in the more declarative formalisms that I introduced later. For thirty years, CAIA has not contained a single line of C that I have written. All in all, there are 500,000 lines of C and 13,500 rules. Many rules were not created by me, but by rules that create rules.

AI’s essential goal is to create a general problem-solving system; all human activities are, in fact, problems that we have to solve. The most important problem for AI researchers is to build a system that can solve every problem, including this last one. It is foolish to begin with the most difficult problem; it is better to start with simpler problems, then to extend the domain gradually. This is one of the two main directions of a bootstrap, the other being to improve performance.

It is important to choose the initial domain well. CAIA started by solving problems defined by a set of constraints. Firstly, this domain is interesting because many problems may be defined in that way; it is even more interesting because some of these problems can be used by the solver itself. Long ago, I added two such problems:

1. Finding symmetries from the study of the formulation of a problem can itself be stated as a constraint satisfaction problem. This is useful because it reduces the size of the search space, facilitates the search for new constraints, and enables the discovery of a decomposition of the search space so that the proofs are easier in each part. I will explain this last point in a future post; I completely underestimated it, and it was a pleasant surprise.

2. Experimenting with a system requires many problems. Often, there are not enough problems in the literature, or I would have to define them myself; moreover, there is a lack of very, very difficult problems, far too hard for human beings. However, finding new problems is itself a problem that can be defined with constraints; therefore, many of CAIA's problems have been found by CAIA itself.

In the next post, we will see how a first step was made to extend this initial domain. When CAIA is able to solve any problem, one direction of the bootstrap will be completed; however, it will also be necessary to solve these problems more and more efficiently.

# Everything but the essential

The One Hundred Year Study on Artificial Intelligence is the title of a very interesting report that has just been published by a group of prominent AI researchers. One of their goals is that «the report can help AI researchers, as well as their institutions and funders, to set priorities».

The report considers a large variety of domains; it shows that AI could be very helpful in situations where it is difficult to find adequate staff. This is especially the case for health care, where automated assistance for clinicians, image interpretation, robotics, elder care, etc. are very promising directions of research. Self-driving cars and home robots will change our day-to-day life. With intelligent tutoring systems, students will get help adapted to their needs. This report provides an overview of many activities where AI systems will be able to help us over the next fifteen years.

It seems that the authors began by exploring what future needs will be; then, for each one, they carefully examined which AI techniques could be useful. They did a wonderful job: if we make the required effort, many applications will be in widespread use throughout the world by 2030.

Curiously enough, the authors have completely forgotten a domain with really high needs, one they should be familiar with: helping the development of AI research itself!

It is very difficult to conduct AI research, especially if we want to do more than imitate human behavior with results improved merely by powerful computers. That vision of intelligence is anthropomorphic: other forms of intelligence exist. Unfortunately, building a super-intelligent system is perhaps too complex for human intelligence, with all its limits: we have been shaped by evolution only for solving the problems of our hunter-gatherer ancestors.

Serendipity allows us to produce acceptable results in new domains, such as computer science, mathematics, management, etc. However, these results are necessarily limited: evolution lacked the time to adapt us to their specific requirements. Our geniuses are probably not as good as they think; in the kingdom of the blind, the one-eyed man is king. We are awfully handicapped by our working memory, which holds only about seven elements, and by our reflective consciousness, which ignores almost all the events that occur in our brain. Therefore, we will not get far without AI's help. If we want to keep increasing AI's performance, we must call upon AI's assistance.

This has been completely ignored by the report, where the word “bootstrap” does not even occur. However, this is a technique well adapted to the resolution of very difficult problems; moreover, developing AI is an intelligent activity, and therefore it depends on AI. The authors only say: «No machines with self-sustaining long-term goals and intent have been developed, nor are they likely to be developed in the near future.» This is almost true of the existing state of AI, but it is imprudent to predict what will happen in the next 15 years: progress is very slow at the beginning of a bootstrap; then, suddenly, things move quickly. Naturally, if this kind of research is given no priority, this is a self-fulfilling prophecy: we will be in the same situation in 15 years. Nevertheless, if this kind of research receives even a small part of the funds rightly allocated to the applications described in the report, the results will likely surprise us.

The importance of bootstrapping AI has been seen since AI's beginnings: in 1959, Allen Newell and Herbert Simon considered the possibility that their General Problem Solver could find the differences needed by GPS itself. The bootstrap has two major drawbacks: it is slow, and it is hard to achieve. However, many achievements of our civilization come from a bootstrap: we could not design our present computers if computers did not already exist.

Few AI researchers are presently interested in this approach. I suspect that many AI researchers even think it is impossible, while most of the others believe it is much too difficult to embark on this path, given the current state of development of our science. Stephen Hawking is really a genius: although he is not an AI researcher, he has seen the strength of such an approach. It is unfortunate that he drew the wrong conclusion from it, about the dangers of AI.

What is in this report is excellent; I am only critical of what is missing, which is essential for the future of AI. In the end, the report lacks coherence: the authors rightly say that AI will significantly help the resolution of many important problems in many domains. However, they do not think of using AI for the most important task: to advance AI, which is central to the success of all the other tasks considered in this report!

A turn for what? Well, I have twice seen the same succession of events. Firstly, outstanding specialists of a domain assert, without any justification, that an AI system could never outsmart the best human experts of the domain. For many years, they are right, and they laugh at the poor results obtained by the first programs. Then, suddenly, everything changes: artificial systems outperform the best humans.

This happened for Chess, and recently for Go. I believe that the next domain where this sequence of events will occur is mathematics, a propitious domain for several reasons. Firstly, like games, mathematics is far removed from physical reality; the researcher does not have to confront the many constraints of the real world, as is the case for driving a car. Secondly, games have been created by humans, based on what they can do best: from the start, we humans are in a good position. On the contrary, mathematics was not developed to pose problems suited to us, but to solve useful problems. It is not obvious that mathematics is as well suited to our capacities as games are; therefore, artificial beings are in a much better position than they are for games: they can use all their qualities, in situations where these qualities are essential, and where we are not very good. Our intelligence results from the importance, for our ancestors, of hunting and fighting for survival and reproduction. The fighting aspect is important in games, but not in mathematics. There is no reason why human intelligence would be well adapted to mathematics. It is simply the case that some capacities developed for other goals happen to be somewhat helpful to mathematicians.

At present, no known limit would prevent AI systems from outperforming human mathematicians. The well-known incompleteness theorems restrict what any cognitive system, human or artificial, can do: nobody can ever prove a theorem if no proof of it exists in the theory.

First-class mathematicians have claimed that machines could never be very strong in mathematics, but they never justify this assertion. For them, it is not necessary: it is evident. However, a good mathematician should know that, when there is a mistake in a proof, it is often at the point where one says that something is evident. Probably, they are so confident because they do not see how it could be done. This is normal: they are mathematicians, not AI researchers; this is not their job. In the same way, it was not the job of Chess and Go players to write Chess and Go programs: they could not foresee what happened.

Mathematics is also a very interesting domain for AI because mathematical results are extremely helpful. It is natural to invest in artificial mathematicians: their discoveries will be more useful than those of a Chess program. Naturally, their role will not only be to prove conjectures: they will have to build new mathematical theories. Pioneering works, such as those of Douglas Lenat, have shown that an artificial mathematician can discover concepts it did not know. For instance, for a function, it is interesting to consider the situations where the values of two arguments are the same: from addition, one discovers the concept of double, and from multiplication, the concept of square. As extreme situations are often remarkable, considering them leads to the concept of numbers with few divisors: the primes. To its author's surprise, Lenat's program also studied the numbers with many divisors; Lenat then discovered that Ramanujan had already studied them. Naturally, a lot of work is needed to develop systems with many more capabilities than this old system, but it is certainly not impossible.
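The two discovery heuristics mentioned here are simple enough to sketch in a few lines (a toy illustration in the spirit of AM, not Lenat's actual code): equating the two arguments of an operation yields a new unary concept, and extreme values of an invariant single out interesting objects.

```python
def specialize(op):
    """From a binary operation, form the unary concept f(x) = op(x, x)."""
    return lambda x: op(x, x)

double = specialize(lambda a, b: a + b)   # from addition
square = specialize(lambda a, b: a * b)   # from multiplication
print(double(7), square(7))               # 14 49

def num_divisors(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)

# Extreme cases of "number of divisors" among 2..49: the minimum picks out
# the primes; the maximum picks out a highly composite number, the kind
# Ramanujan had studied.
nums = range(2, 50)
fewest = [n for n in nums if num_divisors(n) == 2]
most = max(nums, key=num_divisors)
print(fewest[:5], most)                   # [2, 3, 5, 7, 11] 48
```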

CAIA has a powerful method for solving problems where it cannot use a combinatorial method because there are too many possibilities, sometimes even an infinite number. With meta-combinatorial search, one does not consider all the possible values of a variable, or all the legal moves; instead, one considers many methods for trying to solve the problem. One can waste a lot of time with methods that fail; however, at the end, one has an excellent solution, since one keeps only the methods necessary for justifying it. With a good set of methods, one may sometimes find surprisingly elegant solutions. As computers are much faster than the human brain, they can perform a meta-combinatorial search wider than any made by human mathematicians.

To date, AI has progressed rather slowly in the universities, which are certainly not the right place for developing very complex systems quickly. Recently, industrial firms have invested heavily in AI realizations; it is no coincidence that IBM succeeded at Chess and Jeopardy!, and Google at Go. They moved up to the top category very fast: before AlphaGo, I did not think that an AI system would beat one of the best professional Go players within ten years.

In view of the importance of mathematics, it will be tempting to create a competent team with substantial resources: it could rapidly obtain results more valuable than one would expect. Human mathematicians would help us understand the results of their artificial colleagues, so that these results could be used efficiently and effectively. Naturally, they will also continue to do mathematics for fun, in the same way as Chess and Go players still play their favorite game.

# A long-awaited success for Go

In recent years, programs have beaten the best human players at most games. However, one game was still dominated by human experts: Go.

Go has a unique place among games: the number of legal moves is around 250, and a game lasts about 150 moves. If one wants to consider all the possible moves, the tree is so large that exhaustive search is infeasible. For such games, one solution is an evaluation function, which predicts the value of a given position. Go is so complex that researchers were never able to find a satisfactory function. Winning against the best Go players was the holy grail of AI.
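A back-of-the-envelope computation shows why exhaustive search is hopeless: the full game tree has roughly (number of legal moves)^(game length) paths. The chess figures below (about 35 moves, 80 plies) are common rough estimates, added here only for comparison.

```python
import math

# log10 of the approximate number of paths in each full game tree
go_paths = 150 * math.log10(250)        # log10(250^150), about 360
chess_paths = 80 * math.log10(35)       # log10(35^80), about 124

print(f"Go tree    ~ 10^{go_paths:.0f} paths")
print(f"Chess tree ~ 10^{chess_paths:.0f} paths")
```

Even the chess tree dwarfs any conceivable computation; the Go tree is larger still by hundreds of orders of magnitude.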

Therefore, for a long time, human Go players made fun of programs, which did not even reach the level of a beginner. In the 1970s, Chess players had the same contempt for Chess programs.

Google DeepMind has just developed an extraordinary system, which won 4-1 against one of the best Go players in the world, Lee Sedol. For this achievement, AlphaGo received an honorary "ninth dan" professional ranking, the highest possible level. The citation was given in recognition of AlphaGo's "sincere efforts" to master Go's Taoist foundations and reach a level "close to the territory of divinity".

AlphaGo is based on general-purpose AI methods, which had already been used in Go programs, but with far less success. A paper published in Nature, "Mastering the Game of Go with Deep Neural Networks and Tree Search", gives a lot of information on this system.

It uses neural networks to learn from a huge set of results: the more results, the better. In that way, it builds several functions: one estimates the value of a position, and another predicts good moves in that position. For one of these functions, it used a database of 30 million moves. It is also able to learn from the moves generated when AlphaGo plays against itself. It is extraordinary that AlphaGo already plays well using only its neural networks, without developing a tree.

However, it plays better when it develops a tree, while still using its neural networks. For considering future possible moves, AlphaGo uses random playouts (Monte Carlo Go), which were introduced in most recent Go programs after Bruno Bouzy showed that they gave a strategic vision to the program. To evaluate a move, one plays it; from the resulting position, one develops thousands of paths. At each step, one move is chosen randomly among the best moves found by the neural network. When this is completed, AlphaGo chooses the move with the highest mean over all the terminal positions of its paths. Naturally, parallelism is very useful for this kind of search: in Seoul, AlphaGo ran on 1,202 CPUs and 176 GPUs. However, although it develops a very large tree, that tree is significantly smaller than the trees developed by Deep Blue against Kasparov. This method was very efficient in the endgame, where AlphaGo played almost perfectly, because the number of legal moves is no longer high. Its opponent had no chance of winning unless he had gained an advantage by the end of the middle game.
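The Monte-Carlo evaluation described above can be sketched on a toy game (this is a hedged illustration of the principle, not AlphaGo's code): play each candidate move, finish the game many times at random, and keep the move with the highest mean result. The toy game here is Nim-like (take 1 to 3 sticks; taking the last stick wins); AlphaGo applies the same principle to Go, with its neural networks biasing the playouts instead of uniform randomness.

```python
import random

def playout(sticks, to_move):
    """Finish the game randomly; return 1 if player 0 wins, else 0."""
    while sticks > 0:
        sticks -= random.randint(1, min(3, sticks))
        to_move = 1 - to_move
    # the player who moved last took the final stick and won
    return 1 if to_move == 1 else 0

def best_move(sticks, n_playouts=2000):
    """Evaluate each move by the mean of its random playouts."""
    scores = {}
    for move in range(1, min(3, sticks) + 1):
        wins = sum(playout(sticks - move, to_move=1)
                   for _ in range(n_playouts))
        scores[move] = wins / n_playouts   # mean of terminal results
    return max(scores, key=scores.get)
```

From 5 sticks, taking 1 leaves the opponent 4 sticks, the worst random-play position, and the playout statistics reliably pick it out.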

The authors of this program have masterfully coordinated all these methods to develop a program with outstanding performance. I believed that some day Go programs would be stronger than the best human players; I did not expect it to happen so fast.

It is interesting to note that an industrial team achieved this success. Google provided the means appropriate to this challenge, and the researchers could dedicate all their energy to this goal. By contrast, a university researcher has many additional time-consuming tasks: teaching, writing papers, administration, seeking funds for his projects, etc. At least 50% of one's time must be devoted to the development of a complex computer system: without this, it is impossible to control anything. It is exceptional for a university researcher to have this possibility. Probably, the future of AI lies not in the universities, but in large industrial companies such as Google, IBM, Facebook, etc.

I am slightly less enthusiastic about one minor point. In a preceding post, I said that AI researchers have two flaws: they are too clever, and they work too much. Therefore, their systems include too much human intelligence, and not enough artificial intelligence: modules designed by humans must be replaced by meta-modules that create these modules. My criticism may be a bit exaggerated, since many important components, such as the functions used by the neural networks, were learned automatically. However, the Nature paper was co-authored by 20 researchers, and that makes for a lot of human work and intelligence.

A negative consequence: AI no longer has a holy grail! We need a new spectacular task, one that is not too easy, but not impossible. Google, which has just achieved the preceding one, could use its search engines to find a new holy grail in its massive amounts of data.

To conclude, one cannot help but admire this result, which happened long before the most optimistic researchers expected it.

# Should we disclose everything? Part I

When a human or an artificial being has solved a problem, must it report everything it has done? Obviously, one must not describe everything; too much information would be worthless. However, is it sufficient to indicate only what is needed to check that a solution is correct? For my part, I don't think so; we will see some aspects of this problem in the following posts. Here, we will only show that it is possible and useful to explain why one has chosen the steps taken when solving a problem, even those that are not necessary to justify the solution.

Usually, people say nothing about this kind of knowledge, often because the solver does not know it: these mechanisms are largely unconscious. A chess player knows which moves he has considered, but he rarely knows why he considered them. This is a pity, because de Groot's work has shown that the strength of the strongest chess players derives from their ability to consider the best moves. In one experiment, world champion Alekhine had to think aloud while finding the best move in a complex position; he considered only four moves, among them the two best ones. Unfortunately, he did not say why he thought of considering them, probably because he did not know: looking at the board automatically gave him the moves to consider. By contrast, a good player considered nine moves for the same position, but the best one was not among them.

Likewise, mathematics teachers rarely indicate why the steps of a proof were chosen, or why ways that seemed promising do not lead toward the solution. This is unfortunate; it causes some misgivings about the method used by mathematicians: it is normal to wander, and the best mathematicians did not directly find the results that made them famous. Some students do not even try to find the solution of a problem because they believe they are too incompetent: indeed, it is impossible to find only the right steps, in the way the teacher presents the proof of a theorem. What is important is to try, to fail, and to try again.

I was most impressed by Martin Gardner's description of the stages that led four mathematics students to the resolution of the squared square problem. The goal was to find a square covered with smaller squares whose sizes are all different. When a solution is proposed, it is easy to check whether it is correct; but this does not indicate how it was found. In fact, the students began by finding a rectangle covered by small squares, all different. They enjoyed this a lot, and built a rectangular box containing all these wooden squares. One of them, who lived with his parents, left it on his desk, and his mother was not allowed to clean this desk. Naturally, she could not resist cleaning it and, in doing so, dropped the puzzle on the ground. Her disobedience being obvious, she tried to put the squares back into the box, and she succeeded, hoping to escape her son's reproaches. However, her son discovered that his mother had found a solution different from the preceding one. He immediately called his friends, and they tried to understand why this puzzle had several solutions. That gave them the idea of an analogy with Kirchhoff's laws, the size of a square corresponding to the value of a resistance in an electrical circuit. They realized that the circuit must have certain characteristics, and that led them to the discovery of the first squared square.
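Checking a proposed tiling really is easy, as noted above. A minimal sketch, using Moroń's well-known 33×32 squared rectangle (nine squares of distinct sizes, found before the students' work); each entry is (x, y, size) with (0, 0) the top-left corner, the layout being the standard one:

```python
SQUARES = [
    (0, 0, 18), (18, 0, 15), (18, 15, 7), (25, 15, 8),
    (0, 18, 14), (14, 18, 4), (14, 22, 10), (24, 22, 1), (24, 23, 9),
]
WIDTH, HEIGHT = 33, 32

def is_perfect_squared_rectangle(squares, width, height):
    sizes = [s for _, _, s in squares]
    if len(set(sizes)) != len(sizes):          # all sizes distinct
        return False
    covered = set()
    for x, y, s in squares:
        if x < 0 or y < 0 or x + s > width or y + s > height:
            return False                       # square sticks out
        cells = {(i, j) for i in range(x, x + s)
                        for j in range(y, y + s)}
        if covered & cells:                    # overlaps another square
            return False
        covered |= cells
    return len(covered) == width * height      # no gaps left

print(is_perfect_squared_rectangle(SQUARES, WIDTH, HEIGHT))  # True
```

Verifying takes a few lines; finding such a tiling took Kirchhoff's laws and a stroke of luck.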

Knowing why some rules have or have not been executed would also be useful for CAIA: it could be used to improve the meta-knowledge used for selecting, prohibiting, or ordering the possible actions. In its explanation of a solution, as CAIA can completely observe any of its actions, it knows the reasons for the elimination of a result, for its choice, and for the priority allocated to each action. Unfortunately, while CAIA can know how its meta-knowledge was used, it is not so easy to improve it: one must define meta-meta-knowledge, which is always very difficult to create and to use.

When an AI system uses an algorithm that tries everything possible, there is no need to explain the choice of its steps. However, CAIA dynamically determines whether it will keep or eliminate a result, and whether it will use it immediately or save it for difficult times. It could produce interesting explanations of its actions, such as: it did not keep this new constraint because it contained too many unknowns, or because it was an inequality and it already had too many of them; or it delayed using this rule because its execution is time-consuming; and so on.
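A hypothetical sketch of such decisions carrying their reasons (this is not CAIA's code; the thresholds and field names are invented for illustration): each keep/eliminate/delay choice returns the reason behind it, so the system can later explain, or learn from, its own behavior.

```python
def triage(constraint, n_inequalities):
    """Return (decision, reason) for a newly derived constraint."""
    if len(constraint["unknowns"]) > 5:
        return "eliminate", "too many unknowns"
    if constraint["kind"] == "inequality" and n_inequalities >= 10:
        return "eliminate", "already too many inequalities"
    if constraint["cost"] > 100:
        return "delay", "execution is time-consuming"
    return "keep", "cheap and informative"

# A cheap equation with two unknowns is kept, with its reason attached.
decision, reason = triage(
    {"unknowns": ["x", "y"], "kind": "equation", "cost": 3},
    n_inequalities=4)
```

The recorded reasons are exactly the raw material that meta-meta-knowledge would need in order to improve the triage itself.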

For an explanation addressed to a human being, it is possible to include some indication of the reasons why CAIA chose to try an action; the difficulty is to model this human so as to give him only what he needs to know. On the other hand, such explanations given by CAIA to itself would allow it to improve its own meta-knowledge. Both changes could be made; unfortunately, as they are difficult to implement, so far they are only on my to-do list.

# Superintelligence

Superintelligence is the title of a book by Nick Bostrom. The book is enjoyable to read, and the author is well aware of what has been done in AI. For instance, he considers bootstrapping as one possible way of creating superintelligence, and he mentions the crossover point where the intelligence of the improving system becomes greater than human intelligence. Moreover, quoting Yudkowsky, he notes that our intelligence occupies a very narrow range, in spite of "the human tendency to think of 'village idiot' and 'Einstein' as the extreme ends of the intelligence scale, instead of nearly indistinguishable points on the scale of minds in general."

I completely agree with the author on the possibility of a superintelligence. However, I do not see the future as he does; in fact, I do not see the future at all. While reading this book, I had the feeling that the author had never built an AI system; from his biography, this seems to be true. He presents the future of AI in a very loose and undefined way: he does not give specific information about how to achieve it. I am very well placed to know that it is not enough to say that one must bootstrap AI; it must be indicated how to do this effectively, and that is not easy. I have plenty of ideas for advancing CAIA, but I do not have a clue about their final consequences.

One cannot fully appreciate what a particular AI system will become, especially if one has never built such a system. In its final stage, it bears little resemblance to its initial version. We must be humble about predicting the future consequences of a good idea.

To better understand the difficulty of predicting the future of a field of research, just look at computer science. I began working in this domain in 1958; at that time, people thought that computers were excellent for computing trajectories or issuing payrolls (incidentally, my first program solved an integral equation). In those years, I never read a paper foreseeing the omnipresence of computers in the future world, where everybody uses several of them, even in phones, cars, and washing machines, and where such a thing as the web could exist. In the same way, we have no idea of AI's future applications: we are not smart enough, and AI is still in a rudimentary state. This does not allow us to form a reasonable idea of superintelligence.

Many are concerned that machines will seize power over man, because we assume that they will behave as we do. However, our intelligence was built for our hunter-gatherer ancestors by evolution, which promotes procreation and the search for food. Seizing power is an important step for these activities; no wonder it is widely displayed among living beings, including aliens if they exist, provided they too are products of evolution. The goals of super-intelligent beings are not necessarily the same as ours; therefore, the search for power is not necessarily essential to their activity. Perhaps there will be artificial super-mathematicians, quite happy to develop complicated theories; to do that, there is no need to rule the world and enslave us.

In his book, Bostrom considers various methods for avoiding domination by artificial beings, such as boxing methods that confine the system, or tripwires detecting signs of dangerous activity. However, if we confine the system, it will never become super-intelligent; and if it is super-intelligent, it will detect and bypass our tripwires. In the March 2015 issue of the AI Journal, Ernest Davis analyzes this book and suggests that the issue will be settled by "an off switch that it cannot block." This is a variant of the idea of unplugging the machine from the mains; Science Fiction has long shown that it does not work. Even for AI specialists, it is difficult to imagine the immense superiority of a superintelligence!

Let us go back to the general scale of minds, where animals are well below us, even rather intelligent ones such as dogs. If we are able to build a super-intelligent system, the difference between ourselves and this system could be as large as the difference between a dog and ourselves. To understand our position vis-à-vis super-intelligent systems, we can consider the position of a dog vis-à-vis ourselves. Considering again the efficiency of tripwires: could a dog invent an off switch that we could not bypass? How could a dog imagine many of our activities, such as writing, mathematics, astronomy, AI, and so on? No doubt we too have no idea of many essential activities, just as the ancient Greeks had no idea of computer science. Long into the future, super-intelligent systems will possibly discover new applications; we will be able to use them, because it is much easier to use than to discover.

Who knows, perhaps some day our descendants will live harmoniously with super-intelligent systems, and they will be very satisfied with this. When we see how the world is ruled, we can hope for better for future generations. This may happen if super-intelligent beings want not to control the world, but to help us. As we said before, this is possible if superintelligence is not created by a kind of evolutionary process based on the pursuit of power and control.

Anyway, we are very far away from this distant future. I am not sure that our intelligence is sufficient to allow us to create superintelligence; however, if we succeed, the result will certainly be very different from all our predictions. Only when we have made further progress will we be able to predict the consequences of super-intelligent beings accurately. In the meantime, we are playing at frightening ourselves.

# Poor Artificial Intelligence: little known and little liked

I have the sad honor of working in AI, a domain frowned upon by many remarkable people. We are dealing with two main categories of opponents. I have already spoken of the first, such as Stephen Hawking, who think that AI is potentially very dangerous: the emergence of super-intelligent artificial beings would lead to the extinction of the human species. For the others, on the contrary, it is impossible to create artificial beings more intelligent than ourselves. A recent interview shows that Gérard Berry belongs to this second category.

Gérard Berry is a computer scientist who has done outstanding work, deservedly recognized by the scientific community: he is a member of the Académie des Sciences, a professor at the Collège de France, and he has received the CNRS gold medal, France's most prestigious scientific distinction, only the second time it has honored a computer scientist. For me, AI is not a part of computer science, but the two are very close: the computer is an essential tool for implementing our ideas. Therefore, when a top-level computer scientist criticizes AI, it must be taken very seriously.

AI occupies only a small part of this interview, but he does not mince his words in these few sentences. Here are two excerpts:

“I was never disappointed by Artificial Intelligence because I did not even believe for a second in Artificial Intelligence. Never.”

“Fundamentally, man and computer are the most different opposites that exist. Man is slow, not particularly rigorous, and highly intuitive. Computer is super-fast, very rigorous, and an absolute ass-hole.”

Firstly, I will consider three points mentioned in this interview where I agree with him:

Intuition is certainly an essential characteristic of human intelligence. In a recent post, I looked at how AI systems could also have this capacity.

Chess programs are a splendid realization, but their intelligence is mainly that of their developers. They exploit the combinatorial aspects of this game, enormous for human beings but manageable by fast computers. It would have been much more interesting to speak of another outstanding IBM achievement: Watson, an intuitive system, which won against the best human players at the game Jeopardy!

For the present time, AI byproducts are the most useful results of our discipline: they have led to the discovery of important concepts in computer science.

Let us talk about our disagreements. Up to now, nobody has proven that man can perform an intellectual activity that no artificial being could ever perform. As long as this is the case, we must assume that all our activities can be mechanized. Then we are in a win-win situation. If it is shown that we are wrong, significant progress will have been made by a reductio ad absurdum argument. If, on the contrary, we are right, we will create very helpful artificial beings, which will not be restrained by our intellectual limits.

The main argument in this interview is that computers are absolutely stupid. This takes me way back to 1960! At that time, many people already wanted to show the impossibility of intelligent artificial beings. Their argument was: computers only do what they are told to do. Naturally, this is true, but it proves nothing: the problem is to write programs, and to gather data, so that the computer behaves intelligently. One can develop programs that do more than run as fast as possible on their data. Programs can analyze the description of a problem, then write and execute efficient programs well adapted to this particular problem and its data. Moreover, in a bootstrap, the existing system works with its author to create an improved version of the system itself. Computer users strongly believe in the usefulness of bootstrapping: without computers, it would have been impossible to design the current computers! Hawking's extraordinary intuition saw the efficiency of bootstrapping AI; this is why he was afraid of its future.
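A minimal illustration of a program that writes and runs a program (a toy sketch of the idea, not anything the interview specifically discusses): from the description of a polynomial, Python source specialized to that polynomial is generated by Horner's rule, compiled, and executed.

```python
def specialize_polynomial(coeffs):
    """Generate and compile an evaluator for sum(c_i * x**i)."""
    body = repr(coeffs[-1])
    for c in reversed(coeffs[:-1]):      # Horner's rule, unrolled
        body = f"({body}) * x + {c!r}"
    source = f"def poly(x):\n    return {body}\n"
    namespace = {}
    exec(source, namespace)              # run the generated program
    return namespace["poly"]

# 2 + 3x + x^2, specialized into straight-line code:
poly = specialize_polynomial([2, 3, 1])
print(poly(10))   # 132
```

The computer here does exactly "what it is told", yet what it is told is to write and run a new program adapted to the data it received.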

While no one has ever proven that man can do something that no computer could ever do, many things can be done by a computer that no human being could ever do. For instance, our reflexive consciousness is very limited: most of the processes in our brain are unconscious. By contrast, it is possible to build a computer system that can observe any of its processes if it wants to; moreover, it can have access to all of its data. As consciousness is an essential part of intelligence, this will have tremendous consequences. Unfortunately, we are not yet able to make full use of this capacity, because man's consciousness is too limited to be a useful model.

AI is probably the only scientific domain with so many staunch opponents who know so little about it. This is not surprising: man has always rejected what would remove him from the center of the world. We all know the difficulties encountered by those who said that the earth revolves around the sun, or that apes are our cousins. Naturally, the idea that artificial beings could be more intelligent than ourselves is particularly unpleasant for the most intelligent among us. Less intelligent people are used to living with beings more intelligent than themselves, but geniuses are not. No surprise, then, that some of them are strongly against AI.

I believe in the possible existence of artificial beings much more intelligent than ourselves. On the other hand, I am not sure that we are intelligent enough to achieve this goal. We need help, and for this reason, for many years, I have been trying to bootstrap AI: as of now, my colleague CAIA, an Artificial Artificial Intelligence Researcher, gives me substantial assistance with its own realization. AI is probably the most difficult task ever undertaken by mankind: it is no wonder that progress is very slow.