In recent years, for most games, programs won against the best human players. However, one game was still dominated by human experts: Go.
Go has a unique place among games: the number of legal moves is around 250, and the game length is near 150. If one wants to consider all the possible moves, the tree is so large that it is infeasible. For such games, a solution is to have an evaluation function, that predicts the value of the corresponding terminal position. Go is so complex that researchers were not able to find a satisfactory function. Winning against the best Go players was the holy grail of AI.
Therefore, for a long time, human Go players made fun of programs, which did not even have the level of a beginner. In the 70s, Chess players had the same contempt for Chess programs.
Google Deep Mind has just developed an extraordinary system, which has won 4-1 against one of the best Go players in the world: Lee Sedol. For this achievement, AlphaGo received an honorary “ninth dan” professional ranking, the highest possible level. This citation was given in recognition of AlphaGo’s “sincere efforts” to master Go’s Taoist foundations and reach a level “close to the territory of divinity”.
AlphaGo is based on general purpose AI methods, which have already been used for Go programs, but with far less success. A paper, published in Nature, gives a lot of information on this system: Mastering the Game of Go with Deep Neural Networks and Tree Search.
It uses neural networks for learning from a huge set of results: the more results, the better. In that way, it builds several functions: one of them estimates the value of a position, and another predicts good moves in this position. For one of these functions, it has used a database with 30 millions moves. It is also able to use the moves generated when AlphaGo plays against itself. It is extraordinary that AlphaGo already plays well using only the neural networks, without developing a tree.
However, it plays better when it develops a tree, while always using its neural networks. For considering the future possible moves, AlphaGo uses random Go, that has been introduced in most-recent Go programs, after Bruno Bouzy shown that it gave a strategic vision to the program. For evaluating a move, one plays it; from this position, one develops thousands of paths. At each step, one move is chosen randomly among the best moves found by the neural network. When this is completed, it chooses the move that has the highest mean of all the terminal positions of its paths. Naturally, parallelism is very useful for this kind of search: At Seoul, AlphaGo had 1202 CPU and 176 GPU. However, if it develops a very large tree, it is significantly smaller than the trees developed by Deep Blue against Kasparov. This method was very efficient in the end game, where it was playing perfectly, because the number of legal moves is not high. Its opponent had no chance of winning the game, if it had not an advantage at the end of the middle game.
The authors of this program have masterfully coordinated all these methods for developing a program with outstanding performances. I believed that some day Go programs would be stronger than the best human players; I did not expect that it would happen so fast.
It is interesting to note that an industrial team achieved this success story. Google provided appropriate means to face this challenge, and the researchers could dedicate all their energies to this goal. On the contrary, a university researcher has many additional time-consuming tasks: teaching, writing papers, administration, seeking funds for his projects, etc. At least 50% of our time must be affected in the development of a complex computer system: without this, it is impossible to control anything. It is exceptional that a university researcher has this possibility. Probably, the future of AI is not in the Universities, but in large industrial companies such as Google, IBM, Facebook, etc.
I am slightly less enthusiastic on a minor point. In a preceding post, I have said that AI researchers have two flaws: they are too clever, and they work too much. Therefore, their systems include too much human intelligence, and not enough artificial intelligence: modules designed by humans must be replaced by meta-modules that create these modules. My criticism may be a bit exaggerated, since many important components such as the functions used by neural networks have been automatically learned. However, the Nature paper was co-authored by 20 researchers, and that makes a lot of human work and intelligence.
A negative consequence: AI has no longer a holy grail! We need a new spectacular task that is not too easy, but not impossible. Google, which just removed the preceding one, could use its search engines to find a new holy grail from its massive amounts of data.
To conclude, one cannot help but admire this result, which happened long before the most optimistic researchers expected it.