They are going to judge us!

Several AI systems now perform well above the best human experts. This makes it possible to build systems that assess the quality of human performance far better than we could ourselves.

In particular, in many games some AI systems are far better than the world champion: a Go program recently won against two of the best Go players. Here, we will discuss an outstanding study of chess by Jean-Marc Alliot; it was published in the first 2017 issue of the Journal of the International Computer Games Association.

For about fifty years, a method has been used to determine the strength of chess players: an integer, the Elo rating, is associated with each player. It is computed from the result of every game the player has played (win, draw, loss) and from the Elo of the opponents. Today, Magnus Carlsen, the world champion, also has the highest Elo: 2857; fewer than 800 players have an Elo greater than 2500, and most of them are International Grandmasters. It is difficult to evaluate the Elo of the best chess engines: human players are not strong enough, so matches between humans and computers have become very rare. Moreover, when a human agrees to play, he often asks to face a crippled engine, for instance one without endgame tablebases, or to receive odds (usually a pawn). As a result, the Elo of chess engines is mainly based on competitions among themselves. At present, the best ones are at almost 3400. The difference with Carlsen is over 500 points; this indicates that the engine would win a game with a probability of about 0.95.
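The 0.95 figure can be checked with the standard logistic Elo formula. The sketch below is only an illustration: the function name is hypothetical, and strictly speaking the formula gives an expected score (a draw counts as half a point) rather than a pure win probability.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model.

    The result lies between 0 and 1; a draw counts as 0.5.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Only the rating difference matters, so the absolute numbers are arbitrary.
print(round(elo_expected_score(500, 0), 2))  # 500-point advantage
print(round(elo_expected_score(300, 0), 2))  # 300-point advantage
```

A 500-point gap gives about 0.95, and a 300-point gap gives about 0.85.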

Lacking anything better, chess players were content to use this rating system, although it can evaluate neither individual moves nor the quality of a game. Now, we can assume that the best chess engines play the best move. If a human player chooses another move, one can evaluate its quality: it is sufficient to compare the value the computer gives to the position after its own move with the value it gives to the position after the human move. The value after the computer move is always greater than or equal to the value after the human move: if it were lower, the computer would not have chosen its move.
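As a minimal sketch of this comparison, assume that both evaluations are already available in pawns, from the point of view of the player who moved. The function name and the clamping at zero are assumptions made for illustration; they are not taken from the paper.

```python
def move_loss(value_after_engine_move: float, value_after_human_move: float) -> float:
    """Quality of a human move, measured as the value lost compared to the engine's move.

    Both values are in pawns, from the mover's point of view.
    A loss of 0.0 means the human move is as good as the engine's choice.
    """
    loss = value_after_engine_move - value_after_human_move
    # Assumption: the engine is treated as infallible, so the loss is never negative.
    return max(0.0, loss)


# Hypothetical evaluations: the engine's move keeps +0.3, the human move leads to -0.1.
print(move_loss(0.3, -0.1))
```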

In fact, the author could not use a system with the full Elo difference that is theoretically possible: he did not have access to a computer with all the processors that were used for the best performances; moreover, to limit computing time, the time allowed per move was reduced. The system used for these experiments was a chess engine ranked in the top three, STOCKFISH, which also has the benefit of being open source. With these restrictions on computing power, the Elo advantage over the world champion drops to 300; the engine would still win, but only with a 0.85 probability.

Knowing the difference in value between the computer move and the human move makes it possible to measure the quality of each human move precisely. This could be used to annotate a game, showing the good moves and the weak ones and, for each weak one, indicating what the best move would have been. However, this was not the goal of the paper, which has other purposes. Its basic element is the construction of a matrix giving the probability that the move played by a particular player will change the value of the position; this probability depends on the value of the position before the move.

For each year of activity, this matrix has been computed for all the players who have been world champion, but not for those who played against a world champion and never won the title: all of Ponomariov's games have been considered, but not all of Kortchnoi's. Naturally, only games played at regular time controls and under normal conditions are considered: blitz, blindfold, simultaneous and odds games are excluded. There is one matrix for playing White and another for playing Black. All in all, the system analyzed 2,000,000 positions.

An element of the matrix is the probability that, when the value of the position is VA before the move, the value of the position after the move is VB. The values are measured in pawns or in centipawns. For instance, we know that in 1971, if Robert Fischer, playing White, was one pawn down in a position, then after his move he would still be one pawn down with a 0.78 probability, 1.4 pawns down with a 0.12 probability, and 1.8 pawns down with a 0.10 probability. The new value can never be better than the old one, since we assumed that the machine is infallible.
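One plausible way to picture such a matrix is as a table of observed frequencies. The sketch below assumes that position values have already been rounded to a fixed grid (in pawns); the function name and this discretization are assumptions, not the paper's exact procedure. The Fischer row uses the three probabilities quoted above.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(observations):
    """Estimate P(value after | value before) from (before, after) pairs in pawns."""
    counts = defaultdict(Counter)
    for before, after in observations:
        counts[before][after] += 1
    matrix = {}
    for before, row in counts.items():
        total = sum(row.values())
        # Each row is normalized so that its probabilities sum to 1.
        matrix[before] = {after: n / total for after, n in row.items()}
    return matrix


# Row quoted in the text: Fischer, 1971, White, one pawn down before the move.
fischer_1971_white_row = {-1.0: 0.78, -1.4: 0.12, -1.8: 0.10}

# Average value of the position after his move, given that he started one pawn down.
expected_value_after = sum(value * p for value, p in fischer_1971_white_row.items())
print(round(expected_value_after, 3))
```

On average, then, this row says that Fischer gives up a little more than a tenth of a pawn per move from such a position.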

Thanks to these analyses, the author describes several interesting experiments; for example, it is possible to find the probability of each result of a game between two players when the matrix is known for both. One assumes that a player has won the game once he is at least two pawns ahead. As one has the Black and White matrices for Spassky-1971, and the Black matrix for Fischer-1971, it is possible to compute the ten-element vector that gives the probabilities of the results of a game between these players in 1971. Here are some of these values: Fischer wins (0.40), Fischer's advantage at the end is 0.6 pawn (0.07), perfect equilibrium (0.14), the final position is -1.4 pawns (0.01), Fischer loses (0.07). I emphasize that, at this step, the computer plays no move: it uses the vectors describing the performance of each player. It only plays moves when computing the matrices.
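To make the idea concrete, here is a toy sketch that propagates a probability distribution over position values, alternating the White and Black matrices and treating a two-pawn advantage as a decided game. Everything in it is an assumption made for illustration: the grid step of 0.4 pawn, the fixed number of plies, the two invented players, and the function names. It is not the paper's actual computation.

```python
def toy_matrix(loss_probs, limit):
    """Hypothetical player with the same loss distribution in every position.

    Values are in grid steps, from the mover's point of view.
    loss_probs maps a loss (in steps, never negative) to its probability.
    """
    matrix = {}
    for value in range(-limit + 1, limit):
        row = {}
        for loss, p in loss_probs.items():
            after = max(value - loss, -limit)  # clamp at the "lost" state
            row[after] = row.get(after, 0.0) + p
        matrix[value] = row
    return matrix


def game_distribution(white, black, limit, plies):
    """Distribution of the final value (White's point of view) after `plies` half-moves."""
    dist = {0: 1.0}  # start from an equal position
    for ply in range(plies):
        mover, sign = (white, 1) if ply % 2 == 0 else (black, -1)
        nxt = {}
        for value, prob in dist.items():
            if abs(value) >= limit:  # game already decided: absorbing state
                nxt[value] = nxt.get(value, 0.0) + prob
                continue
            # Black's matrix is indexed from Black's point of view, hence the sign flip.
            for after, p in mover[sign * value].items():
                key = sign * after
                nxt[key] = nxt.get(key, 0.0) + prob * p
        dist = nxt
    return dist


LIMIT = 5  # 5 steps of 0.4 pawn = 2 pawns
strong = toy_matrix({0: 0.90, 1: 0.10}, LIMIT)           # invented numbers
weak = toy_matrix({0: 0.80, 1: 0.15, 2: 0.05}, LIMIT)    # invented numbers
result = game_distribution(strong, weak, LIMIT, plies=80)
print("White wins:", round(result.get(LIMIT, 0.0), 3))
print("White loses:", round(result.get(-LIMIT, 0.0), 3))
```

As in the text, no chess move is played here: only the matrices describing each player are combined.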

These methods play a different role from the Elo, which evaluates a player over all his games against many opponents, using only their results. Here, one is interested in the moves, not in the result. Moreover, one does not define a vector against any player, but against one particular player in a particular year. With this method, it is possible to find the result of a match between Fischer-1971 and Fischer-1962! The paper gives the results of a virtual competition between the 20 world champions, taking for each one the year in which he was at his best, which is not always a year in which he was world champion. For instance, we learn that Kramnik-1999 had a 0.60 probability of winning against Lasker-1907. In some cases, the result is analogous to the Condorcet paradox: Petrosian-1962 wins against Smyslov-1983, who wins against Khalifman-2010, who wins against Petrosian-1962!
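Such non-transitive triples are easy to detect once the pairwise results are known. The following sketch uses only the three results quoted above; the function name and the data layout (a set of winner-loser pairs) are assumptions for illustration.

```python
from itertools import permutations


def find_three_cycles(beats):
    """Return every triple (a, b, c) such that a beats b, b beats c and c beats a."""
    players = sorted({player for pair in beats for player in pair})
    cycles = []
    for a, b, c in permutations(players, 3):
        # Keep a single rotation of each cycle: the one starting with the smallest name.
        if a < b and a < c and (a, b) in beats and (b, c) in beats and (c, a) in beats:
            cycles.append((a, b, c))
    return cycles


# (winner, loser) pairs quoted in the text.
beats = {
    ("Petrosian-1962", "Smyslov-1983"),
    ("Smyslov-1983", "Khalifman-2010"),
    ("Khalifman-2010", "Petrosian-1962"),
}
print(find_three_cycles(beats))
```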

I cannot summarize a paper that contains so many remarkable results, drawn from the analysis of all the games played by at least one past, present or future world champion. In his conclusion, the author plans to do the same for all the games in Chessbase where both players are above 2500 Elo.

This paper shows that it is extraordinarily helpful to have an AI system that is well above the best human beings: one can appraise human behavior very precisely, and one can compare the performances of people who lived in different eras. The capacities of an individual can therefore be studied accurately and thoroughly, incomparably better than with multiple-choice tests. Who knows, in some distant future, AI students will perhaps compare, in their theses, mathematical geniuses such as Euclid and Poincaré!