Professor Finds Nakamura’s Winning Streaks Statistically Normal, Kramnik Dismisses It

TarjeiJS

Updated: Sep 7, 2024, 10:42 AM | Chess.com News

A new study by one of the world's leading statisticians has found that GM Hikaru Nakamura's impressive streaks against high-level opponents on Chess.com are well within the expected probabilities. GM Vladimir Kramnik strongly criticizes the report, dismissing it as "completely ridiculous" and accusing Chess.com of "obvious manipulation."

[This story was updated on September 7 to include Rosenthal's response to Kramnik's criticism. See more below.]

Jeffrey S. Rosenthal is a Professor of Statistics at the University of Toronto and a globally recognized expert in probability and statistics. He is the author of several acclaimed books, such as Struck by Lightning: The Curious World of Probabilities, and is well-known for his ability to explain complex statistical concepts to the public. He also has a lifelong passion for chess and enjoys the game as a hobby.

Rosenthal's Conclusion: "Not Particularly Surprising"

The Canadian professor has released a new 12-page paper titled Probabilities of Streaks in Online Chess, written in response to a suggestion by Chess.com. In the study, he meticulously analyzed Nakamura's streaks based on the American grandmaster's 57,421 games on the platform from 2014 to 2024. Rosenthal noted that his study should be viewed as an investigation of unlikely streaks, and not of the broader issue of cheating in online chess.

In his detailed analysis of Nakamura's performance, Rosenthal focused on one online blitz streak in which Nakamura scored 45.5/46: "This statistical analysis indicates that Hikaru’s online chess winning streaks are not particularly surprising. His recent controversial streak of length 46 is well within expected levels," Rosenthal concluded in his study.

Nakamura, currently the world number two in classical and blitz, came under scrutiny from Vladimir Kramnik last year, when the former world champion questioned his winning streaks, saying somewhat cryptically: "I believe that everyone would find this interesting." Kramnik called for further investigation of the probability of such a streak, and while he did not explicitly accuse Nakamura of cheating, the global chess community has interpreted his comments as a clear insinuation of foul play.

Kramnik wrote in a blog post on Chess.com:

Having checked Hikaru's statistics carefully, I have found NUMEROUS low probabilities performances both of him and some of his opponents. Some of which have EXTREMELY low mathematical probability, according to mathematicians. Way below one percent, according to the calculations of those professional mathematicians.

Rosenthal: “Things That Seem Rare Are Not Necessarily Rare”

However, Rosenthal's study found that Nakamura’s streaks, while exceptional, fall well within the statistical expectations for a player of his caliber. "Things that seem really rare are not necessarily really rare when you look at them in the right context," Rosenthal explained to Chess.com in an interview.

Jeffrey Rosenthal. Photo: Courtesy of University of Toronto/YouTube

Using the Elo rating system, Rosenthal calculated the likelihood of such streaks occurring naturally. He noted that Nakamura's 45.5/46 streak had a probability of about 1 in 830, which is not very unlikely given how many opportunities he has had over the course of more than 50,000 games.

A sequence of 57,421 games has about 57,421/46 ≈ 1,248 different non-overlapping independent chances to achieve a streak of that length, so finding one with a probability of 1/830 is actually very likely.
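The arithmetic behind that statement can be sketched in a few lines of Python. This is only an illustration, not Rosenthal's actual calculation: it assumes every non-overlapping 46-game window has the same 1/830 chance of producing such a streak and that the windows are independent, which simplifies away the varying opponent ratings. The variable names are hypothetical.

```python
# Back-of-the-envelope check of the "1,248 chances" argument.
# Assumption (not from the paper): each non-overlapping window is
# independent and has the same 1/830 probability of a 45.5/46 streak.
total_games = 57_421
streak_length = 46
p_streak = 1 / 830

# Number of non-overlapping 46-game windows in the full record.
windows = total_games // streak_length

# Probability that at least one window produces such a streak.
p_at_least_one = 1 - (1 - p_streak) ** windows

print(f"windows: {windows}")
print(f"P(at least one streak): {p_at_least_one:.2f}")
```

Under these simplified assumptions the chance of seeing at least one such streak comes out at well over one half, which is the sense in which a 1-in-830 event becomes "very likely" across a long record.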

Analysis Of Nakamura's Most Notable Streaks

The study also analyzed the probability of Nakamura's longer winning streaks, each defined as a sequence of games with no losses and at most one draw. Rosenthal found a total of 226 streaks that lasted more than 30 games, the longest of which spanned 121 games.

Nakamura's opponents in those games were rated below 1600 on average, and Rosenthal noted that a long streak is not necessarily an unlikely one: "His probability of scoring at least 120.5 on those 121 games then works out to 1/8.9, which is not particularly unlikely at all."

Among Nakamura's most notable streaks, only two are considerably less likely than his 45.5/46 score, according to Rosenthal's study: "These very unlikely things, one chance in 9,000 or one chance in 11,000 streaks... something like 43% of the time, there was at least one such streak," Rosenthal said, referring to lines 4 and 6 in the table below.

Hikaru Nakamura's most unlikely streaks, with probability less than one chance in 500. Graphic: Jeffrey Rosenthal/Probabilities of Streaks in Online Chess.

To further validate the streaks, the professor conducted Monte Carlo simulations, a method that uses repeated random sampling to estimate the likelihood of different outcomes. By simulating thousands of random game sequences based on Nakamura's ratings and the ratings of his opponents, Rosenthal found that the probability of achieving a streak like 45.5/46 was actually quite reasonable.

In each simulation, Rosenthal recorded the length and frequency of winning streaks that matched or exceeded those observed in Nakamura’s actual games. The results provided a distribution of streak lengths and frequencies, which Rosenthal then compared to Nakamura's real performance.
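The general idea of such a simulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the code or model used in the paper: it treats the standard Elo expected score as a plain win probability, ignores draws entirely (the study's streak definition allows one), and uses hypothetical function names and ratings.

```python
import random


def elo_expected_score(rating, opponent_rating):
    """Standard Elo expected score for the first player."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def longest_win_streak(results):
    """Length of the longest run of wins in an iterable of booleans."""
    best = current = 0
    for won in results:
        current = current + 1 if won else 0
        best = max(best, current)
    return best


def simulate_longest_streaks(rating, opponent_ratings, n_sims=100, seed=0):
    """Simulate the game sequence n_sims times; return each run's longest streak.

    Assumption: each game is an independent win/loss trial whose win
    probability equals the Elo expected score (draws are ignored).
    """
    rng = random.Random(seed)
    win_probs = [elo_expected_score(rating, r) for r in opponent_ratings]
    return [
        longest_win_streak(rng.random() < p for p in win_probs)
        for _ in range(n_sims)
    ]
```

A call such as `simulate_longest_streaks(3000, [2800] * 500)` returns 100 simulated longest-streak lengths for a made-up 500-game record; comparing an observed streak against that distribution shows how often chance alone would match or exceed it.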

1 / Smallest Streak Probability. Graphic: Jeffrey Rosenthal/Probabilities of Streaks in Online Chess.

Rosenthal noted:

We see from the graph that, while the actual values 11,570.6 and 9,452.1 are larger than many of the simulated maximum 1/probability values, there are also many simulated 1/probability values which are much larger than that. Indeed, the largest simulated 1/probability value is over 284,000, and the mean simulated 1/probability value is over 26,000, and even the median simulated 1/probability value is 10,461.92 which is close to Hikaru’s 11,570.6 value. In fact, in 43 of the 100 simulations (nearly half), the least likely streak is less likely than the observed 1/11570.6 one. This further confirms that Hikaru’s least likely streaks are not surprising over such a long collection of games.

In his conclusion, the professor noted that even though the raw probabilities of Nakamura's least likely streaks are each about 1/10,000, the chance of achieving such streaks over the course of so many games is still shown to be above 10%, and about 43% in the Monte Carlo simulations: in other words, not very unlikely.

Having two such notable streaks is somewhat less likely, but still occurs about 18% of the time, well within usual statistical variability. Overall, the streaks observed in Hikaru’s Chess.com record are fairly typical given the ratings of the players over Hikaru’s long record of games.

Kramnik's Response: "Completely Ridiculous"

Vladimir Kramnik is accusing Chess.com of "obvious manipulation" following a study by Jeffrey Rosenthal, Professor in Statistics. Photo: Peter Doggers/Chess.com.

While the findings suggest Nakamura's streaks are statistically normal, Kramnik has dismissed the study in strong terms. Responding to Chess.com's request for comment on this story, the former world champion sent several emails describing it as "a ridiculous report that just doesn't make any sense", "a joke", and "disgusting".

"With all due respect, this 'research' is just another manipulation. I can stand by my statement and prove it in direct conversation with this gentleman. I think he is misinformed or doesn't know much about chess," Kramnik said of Rosenthal.

He further claimed the professor made at least "five major mistakes" in the study, such as mixing different time controls when analyzing the streaks and using the Elo rating system to calculate probabilities even though Chess.com uses the Glicko system. "It's a basic primitive mistake. It makes the whole research wrong. It's just really ridiculous. Believe me."

Kramnik also accused Chess.com's CEO Erik Allebest of "obvious manipulation", suggesting he intentionally provided the professor with "completely wrong data". He challenged Rosenthal to a live discussion to debate the study's findings and stated that he had already recorded a YouTube video detailing his criticisms in more depth.

[Update September 7:] In an updated version of his study, Rosenthal addressed Kramnik's criticism point by point, offering clarifications and additional analysis. He noted in his conclusion:

To summarise, every statistical analysis requires making some choices regarding definitions, scope, etc. I believe that the choices in this report are all fair and defensible, consistent with the available data, and lead to accurate conclusions.

The professor said he was willing to review additional calculations from Kramnik if they were submitted to him in writing, but defended his analysis as fair, accurate, and consistent with the available data.

So far the suggested alternative choices still lead to very similar results, and I do not see any indications that other reasonable choices would lead to different conclusions.

Conclusions Align With Independent Analysis

The findings in Rosenthal's study are supported by conclusions in two independent studies. Kramnik vs Nakamura or Bayes vs p-value, conducted by Shiva Maharaj, Nick Polson, and Vadim Sokolov, used Bayesian methods to analyze Nakamura’s performance. Their analysis concluded that the probability of Nakamura cheating was extremely low.

Another independent analysis, by software engineer Kiril Bobyrev, also concluded that "winning streaks found in top players’ online blitz games on Chess.com are not statistically improbable".

Last week, IM Kenneth Regan, the American computer science professor known for his statistical work on cheating detection in chess, also commented on Kramnik's many posts on X/Twitter: "They're not normalized. He does not do the statistical techniques that are required to establish a benchmark of reference, whereas I have. I have a predictive analytic model, I set expectations, I know the confidence intervals around them. These are basic statistical vocabularies that have been known since the 1700s but absent from his posts," he said during the broadcast of Clash of Blames.

The University of Toronto published their own video with Rosenthal.

Professor Jeffrey S. Rosenthal noted that he did not receive any compensation from Chess.com for his work.

Tarjei J. Svensen

Tarjei J. Svensen is a Norwegian chess journalist who has worked for some of the country's biggest media outlets and appeared on several national TV broadcasts. Between 2015 and 2019, he ran his chess website mattogpatt.no, covering chess news in Norwegian and partly in English.

In 2020, he was hired by Chess24 to cover chess news, eventually moving to Chess.com as a full-time chess journalist in 2023. He is also known for his extensive coverage of chess news on his X/Twitter account.
