New AlphaZero Paper Explores Chess Variants
In a new paper from DeepMind, this time co-written by 14th world chess champion Vladimir Kramnik, the self-learning chess engine AlphaZero is used to explore the design of different variants of the game of chess, with different sets of rules.
The paper is titled "Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess" and was written by DeepMind's Nenad Tomasev, Ulrich Paquet, and Demis Hassabis, together with Kramnik. The Russian grandmaster has been working with DeepMind since last year, when we published his article about No-Castling chess.
On Friday, September 18, Chess.com hosted a round-table discussion with GM Vladimir Kramnik, IM Danny Rensch, and researchers from DeepMind about their latest paper, in which AlphaZero explores chess variants. Here it is for replay:
In this new paper (here in PDF), No-Castling chess is one of nine chess variants that have been looked at. AlphaZero functioned as the tool to simulate decades of human play in a matter of hours, which made it possible to see what games between strong human players in these variants would potentially look like.
Game design, in general, is complicated, and coming up with a new chess variant that actually works is not easy either. The researchers write: "Designing engaging and balanced sets of game rules is non-trivial, due to difficulties in assessing the consequences of individual changes on game dynamics and appeal."
Chess.com's Chief Chess Officer, International Master Danny Rensch, reviewed the paper in detail during the embargo period, when Chess.com had privileged access to the games. He created this quick breakdown (with several more videos to come!) of some of the high-level takeaways from the report, as well as his own entertaining "Top Ten List" ranking of the variants played by AlphaZero:
By using the reinforcement learning system AlphaZero, the researchers wanted to show its potential to be used "as a tool for creative exploration and design of new chess variants."
The nine variants that were tested by AlphaZero
Variant | Primary rule change | Secondary rule change |
No-castling | Castling is disallowed throughout the game | - |
No-castling (10) | Castling is disallowed for the first 10 moves (20 plies, i.e. half-moves) | - |
Pawn one square | Pawns can only move by one square | - |
Stalemate=win | Forcing stalemate is a win rather than a draw | - |
Torpedo | Pawns can move by 1 or 2 squares anywhere on the board. En passant can consequently happen anywhere on the board. | - |
Semi-torpedo | Pawns can move by two squares both from the 2nd and the 3rd rank | - |
Pawn-back | Pawns can move backwards by one square, but only back to the 2nd/7th rank for White/Black | Pawn moves do not count towards the 50-move rule |
Pawn-sideways | Pawns can also move laterally by one square. Captures are unchanged, diagonally upwards | Sideways pawn moves do not count towards the 50-move rule |
Self-capture | It is possible to capture one's own pieces | - |
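To make a few of these rule changes concrete, here is a minimal Python sketch of how the pawn-advance and castling rules from the table might be encoded. The variant labels and the two helper functions are hypothetical names chosen for illustration; they do not come from the paper or from any chess library. The sketch ignores blocking pieces, captures, and the usual castling conditions, and it assumes that variants not listed treat forward pawn moves as in classical chess.

```python
# Illustrative sketch only: the variant labels and both helpers are
# hypothetical, not taken from the paper or from any chess library.

def max_pawn_advance(variant: str, relative_rank: int) -> int:
    """Largest forward step (in squares) for a pawn, ignoring blockers.

    relative_rank is counted from the mover's own side: 2 is the pawn's
    starting rank and 7 is one step away from promotion.
    """
    if not 2 <= relative_rank <= 7:
        raise ValueError("pawns stand on relative ranks 2 to 7")
    if variant == "pawn-one-square":
        double_step = False                     # never more than one square
    elif variant == "torpedo":
        double_step = True                      # double step from any rank
    elif variant == "semi-torpedo":
        double_step = relative_rank in (2, 3)   # 2nd and 3rd rank only
    else:
        double_step = relative_rank == 2        # assumed classical behaviour
    # A pawn cannot step past the last rank.
    return min(2 if double_step else 1, 8 - relative_rank)


def castling_allowed(variant: str, plies_played: int) -> bool:
    """Whether the variant permits castling after plies_played half-moves.

    The normal castling conditions (unmoved king and rook, no checks on
    the king's path) would still have to be checked separately.
    """
    if variant == "no-castling":
        return False
    if variant == "no-castling-10":
        return plies_played >= 20   # first 10 moves = 20 plies
    return True


if __name__ == "__main__":
    for variant in ("classical", "semi-torpedo", "torpedo", "pawn-one-square"):
        steps = [max_pawn_advance(variant, rank) for rank in range(2, 8)]
        print(f"{variant:16s} max advance on ranks 2-7: {steps}")
```

Counting in plies rather than full moves keeps the No-castling (10) check the same for both colours: White's 11th move is played after 20 plies, and Black's 11th after 21.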
For each variant, AlphaZero was trained from scratch and then played a large number of games against itself: 10,000 games at one second per move, and another 1,000 at one minute per move. Based on these games, the paper presents both a quantitative and a qualitative assessment.
Quantitative assessment
For each variant, the expected draw rate and the first-move advantage, expressed as the expected score for White, were determined. As expected, these differed between the time controls, and in every variant the one-minute games produced more draws than the one-second games.
"This seems to suggest that the starting position might be theoretically drawn in these chess variants, like in classical chess, and that some of the variants are simply harder to play, involving more calculation and richer patterns," the researchers write.
Variant | Training | 1 sec | 1 min |
Classical | 54.10% | 51.80% | 50.80% |
No-castling | 55.70% | 53.30% | 51.30% |
No-castling (10) | 52.50% | 51.00% | 50.40% |
Pawn one square | 53.50% | 51.60% | 50.30% |
Stalemate=win | 54.90% | 53.00% | 51.10% |
Torpedo | 57.00% | 56.80% | 54.00% |
Semi-torpedo | 54.70% | 53.60% | 50.90% |
Pawn-back | 53.00% | 51.10% | 50.10% |
Pawn-sideways | 54.80% | 52.80% | 50.50% |
Self-capture | 54.20% | 52.60% | 50.80% |
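As a rough illustration of how to read these numbers, the Python sketch below copies the table into a dictionary and compares White's edge over 50% at the two time controls. The `expected_score` helper assumes the usual scoring convention of 1 point for a win and half a point for a draw; the article does not spell out the formula, and all names in the code are made up for this example.

```python
# White's expected score in percent, copied from the table above:
# variant -> (training, 1 sec per move, 1 min per move).
WHITE_SCORE = {
    "Classical":        (54.10, 51.80, 50.80),
    "No-castling":      (55.70, 53.30, 51.30),
    "No-castling (10)": (52.50, 51.00, 50.40),
    "Pawn one square":  (53.50, 51.60, 50.30),
    "Stalemate=win":    (54.90, 53.00, 51.10),
    "Torpedo":          (57.00, 56.80, 54.00),
    "Semi-torpedo":     (54.70, 53.60, 50.90),
    "Pawn-back":        (53.00, 51.10, 50.10),
    "Pawn-sideways":    (54.80, 52.80, 50.50),
    "Self-capture":     (54.20, 52.60, 50.80),
}


def expected_score(wins: int, draws: int, losses: int) -> float:
    """Expected score for one side, assuming win = 1, draw = 0.5, loss = 0."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def edge_over_even(score_percent: float) -> float:
    """Percentage points by which a score exceeds a perfectly even 50%."""
    return round(score_percent - 50.0, 2)


if __name__ == "__main__":
    # Sort by White's score in the one-minute games, largest edge first.
    ranked = sorted(WHITE_SCORE.items(), key=lambda item: item[1][2], reverse=True)
    for variant, (_, one_sec, one_min) in ranked:
        print(f"{variant:17s} 1 sec: +{edge_over_even(one_sec):.2f}"
              f"   1 min: +{edge_over_even(one_min):.2f}")
```

In every row, White's score sits closer to 50% at one minute per move than at one second per move, with Torpedo keeping the largest edge and Pawn-back the smallest.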
The paper also illustrates how the same opening can lead to vastly different outcomes under different chess variants. This was done by forcing AlphaZero to play the Dutch defense, the Chigorin defense, the Alekhine defense, and the King's Gambit in 1,000 games each, in all variants except for Pawn one square.
For the variants that have additional move options on top of the classical ones (like self-capture), it was analyzed how often these options were utilized by AlphaZero. It turned out that the non-classical moves were used in a large percentage of games, often multiple times per game, in each of the variants. "This suggests that the new options are indeed useful, and contribute to the game," the researchers write.
Another interesting part of the paper is its set of approximate piece values for each of the variants, expressed in pawns. These were computed from a sample of 10,000 fast-play AlphaZero games:
Variant | p | N | B | R | Q |
Classical | 1 | 3.05 | 3.33 | 5.63 | 9.5 |
No-castling | 1 | 2.97 | 3.13 | 5.02 | 9.49 |
No-castling (10) | 1 | 3.14 | 3.40 | 5.37 | 9.85 |
Pawn one square | 1 | 2.95 | 3.14 | 5.36 | 9.62 |
Stalemate=win | 1 | 2.95 | 3.13 | 4.76 | 8.96 |
Self-capture | 1 | 3.10 | 3.22 | 5.34 | 9.42 |
Pawn-back | 1 | 2.65 | 2.85 | 4.67 | 9.39 |
Semi-torpedo | 1 | 2.72 | 2.95 | 4.69 | 8.3 |
Torpedo | 1 | 2.25 | 2.46 | 3.58 | 7.12 |
Pawn-sideways | 1 | 1.8 | 1.98 | 2.99 | 5.92 |
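One way to get a feel for these values is to add them up for a familiar material trade, such as two rooks against a queen. The Python sketch below does that with the numbers from the table. The `material` helper and the piece-letter strings are hypothetical conveniences for this example, and a simple sum of piece values is only a rough guide to how a position would actually be evaluated.

```python
# Approximate piece values in pawns, copied from the table above.
PIECE_VALUES = {
    "Classical":        {"N": 3.05, "B": 3.33, "R": 5.63, "Q": 9.5},
    "No-castling":      {"N": 2.97, "B": 3.13, "R": 5.02, "Q": 9.49},
    "No-castling (10)": {"N": 3.14, "B": 3.40, "R": 5.37, "Q": 9.85},
    "Pawn one square":  {"N": 2.95, "B": 3.14, "R": 5.36, "Q": 9.62},
    "Stalemate=win":    {"N": 2.95, "B": 3.13, "R": 4.76, "Q": 8.96},
    "Self-capture":     {"N": 3.10, "B": 3.22, "R": 5.34, "Q": 9.42},
    "Pawn-back":        {"N": 2.65, "B": 2.85, "R": 4.67, "Q": 9.39},
    "Semi-torpedo":     {"N": 2.72, "B": 2.95, "R": 4.69, "Q": 8.3},
    "Torpedo":          {"N": 2.25, "B": 2.46, "R": 3.58, "Q": 7.12},
    "Pawn-sideways":    {"N": 1.8,  "B": 1.98, "R": 2.99, "Q": 5.92},
}


def material(variant: str, pieces: str) -> float:
    """Sum the table values for a string of piece letters, e.g. "RR" or "Bp".

    "p" stands for a pawn, which is worth 1 in every variant by definition.
    """
    values = {"p": 1.0, **PIECE_VALUES[variant]}
    return round(sum(values[piece] for piece in pieces), 2)


if __name__ == "__main__":
    for variant in ("Classical", "Torpedo", "Pawn-sideways"):
        two_rooks = material(variant, "RR")
        queen = material(variant, "Q")
        print(f"{variant:14s} two rooks {two_rooks:5.2f}  queen {queen:5.2f}"
              f"  difference {two_rooks - queen:+.2f}")
```

Because every value is measured in pawns, the lower figures for Torpedo and Pawn-sideways can equally be read as the pawn itself being worth more in those variants.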
Qualitative assessment
Besides their quantitative analysis, the researchers also wanted to answer more subjective questions about the aesthetic value of the types of positions, moves, and patterns that arise in the different variants. This is where Kramnik enters.
In order to try and evaluate which of the variants could be most interesting to play for humans, the former world champion characterizes the typical patterns, motifs, and even style of AlphaZero's play.
As brought forward in last year's article, Kramnik calls No-castling chess a potentially exciting variant, "given that king safety is often compromised for both players, allowing for simultaneous attacking and counter-attacking and the equality, when reached, tends to be dynamic in nature rather than 'dry.' The multitude of approaches to evacuate the king, and their timing, adds complexity to the opening play."
Kramnik feels that not allowing castling before move 10 isn't different enough from classical chess; AlphaZero tends to castle in most games anyway. He feels the same about Stalemate=win chess, where only certain endgames are evaluated differently.
The most complicated variant, according to Kramnik, is Pawn-sideways chess as it results in "patterns that are at times quite 'alien' when one is used to classical chess. The pawn structures become very fluid and it is impossible to create permanent pawn weaknesses."
Examples
Below is an example game from each of the nine variants with excerpts of Kramnik's comments given in the paper. The last five are given as embedded videos since our game viewer cannot handle the alternative rules! (We're working on that.)
No-castling
"One of the main advantages of no-castling chess is that it eliminates the nowadays overwhelming importance of the opening preparation in professional chess, for years to come, and makes players think creatively from the very beginning of each game," writes Kramnik. "This would inevitably lead to a considerably higher amount of decisive games in chess tournaments until the new theory develops, and more creativity would be required in order to win. These factors could also increase the following of professional chess tournaments among chess enthusiasts."
No-castling (10)
"The main purpose of the partial restriction to castling, as a hypothetical adjustment to the rules of chess, would be to sidestep opening theory," writes Kramnik. "As such, it is aimed at professional chess as an option to potentially consider. The game itself does not change in other meaningful ways, and AlphaZero usually aims at playing slower lines where castling does indeed take place after the first 10 moves."
Pawn one square
"The basic rules and patterns are still mostly the same as in classical chess, but the opening theory changes and becomes completely different," writes Kramnik. "Intuitively it feels that it ought to be more difficult for White to gain a lasting opening advantage and convert it into a win, but since new opening theory would first need to be developed, this would not pertain to human play at first. In most AlphaZero games one can notice the rather typical middlegame positions arise after the opening phase."
Stalemate=win
Two knights vs. a lone king is now a win.
"Looking at the games of AlphaZero, it seems that there are enough defensive resources in most middlegame positions that certain types of inferior endgame positions, now possible under this rule change, could be avoided and defended," writes Kramnik. "A strong player can in principle learn to navigate to these positions to take advantage of them, or find ways to escape them."
Torpedo
"The pawns become quite powerful in Torpedo chess," writes Kramnik. "Passed pawns are in particular a very strong asset and the value of pawns changes based on the circumstances and closer to the endgame. All of the attacking opportunities increase and this strongly favours the side with the initiative, which makes taking initiative a crucial part of the game. Pawns are very fast, so less of a strategical asset and much more tactical instead. The game becomes more tactical and calculative compared to standard chess."
Semi-torpedo
"Semi-torpedo chess seems to be more decisive than classical chess, and less decisive than Torpedo chess," writes Kramnik. "It is an interesting variation, to be potentially considered by those who like the general middlegame flavor of Torpedo chess, but are unwilling to abandon existing endgame theory."
Pawn-back
"The Pawn-back version of chess allows for more fluid and flexible pawn structures and could potentially be interesting for players who like such strategic maneuvering," writes Kramnik. "Given that Pawn-back chess offers additional defensive resources, winning with White seems to be slightly harder, so the variant might also appeal to players who enjoy defending and attackers looking for a challenge."
Pawn-sideways
"This is the most perplexing and 'alien' of all variants of chess that we have considered," writes Kramnik. "Even after having looked at how AlphaZero plays Pawn-sideways chess, the principles of play remain somewhat mysterious – it is not entirely clear what each side should aim for. The patterns are very different and this makes many moves visually appear very strange, as they would be mistakes in Classical chess. (...) This variant of chess is quite different and at times hard to understand, but could be interesting for players who are open to experimenting with few attachments to the original game!"
Self-capture
"I like this variation a lot, I would even go as far as to say that to me this is simply an improved version of regular chess," writes Kramnik. (...) "Regardless of its relatively minor effect on the openings, self-captures add aesthetically beautiful motifs in the middlegames and provide additional options and winning motifs in the endgames. (...) To conclude, I would highly recommend this variation for chess lovers who value beauty in the game on top of everything else."
The 97-page paper includes many more games and explanations from Kramnik that are both instructive and fun. You can download it here in PDF.