Are the bot characters’ ELO ratings accurate? - Chess Forums - Page 5

777spleF

May 8, 2024

0

#81

idk

owengarriott

May 8, 2024

0

#82

I agree, #82

RyanZ_MD

May 8, 2024

0

#83

My rating is about 1100 or so, but I can beat every bot 1500 or below no problem

RyanZ_MD

May 8, 2024

0

#84

Edgar_Figaro wrote:

It’s more than just time constraint. I can win fairly quickly and consistently against any of the intermediate bots (1000-1400) and I am 1098. I have beaten up to Arthur (1700); however, against any bot 1500+ my win rate is below 50%.

Same here.

p8q

May 13, 2024

0

#85

There are some people here with inflated egos who lied.

For the rest who were honest, it's normal to be confused, because the rating in this site versus bots rating doesn't follow a linear correspondence from 200 to 1500 rating. It's normal for 900-rated players to beat 1400-rated bots, yet 1500-rated players suddenly lose to 1600-rated bots in 50% of their games. I don't know the reason, but I've seen it in person.

From a bot rating of 1600 upward, the correspondence suddenly becomes linear and almost totally accurate, with an error of no more than 50 points.

For those who said the bots' ratings are inflated, just watch Nakamura on YouTube losing 10 consecutive times to the Nakamura bot (2700 rating).

SliverWoIf

May 13, 2024

0

#86

I struggle against people 1000 rapid or higher but against Isabel I won easily

Zlomek_299

May 13, 2024

0

#87

Well... I don't think their ELO ratings are accurate. An 1100 ELO bot can sometimes be easier to beat than, for example, a 900 ELO bot.

p8q

May 17, 2024

0

#88

DaTrueSliverwolf wrote:

I struggle against people 1000 rapid or higher but against Isabel I won easily

How many games did you play vs the Isabel bot, and how many did you win?

It's easy to be lucky and win a couple of times out of 3 games. A serious test means playing 100 games and winning at least 60% before claiming you can actually beat that bot.

I've seen my friend, rated 1400 here on chess.com, getting destroyed by the Isabel bot over and over again. In 50 games he won only 6, and now he doesn't want to play it anymore; he's traumatized.
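To put rough numbers on the sample-size point, here is a minimal sketch. It assumes each game is an independent trial with a fixed win probability and ignores draws, which is a simplification. The function name and the 30% "true" win rate are illustrative assumptions, not figures from the thread.

```python
from math import comb

def prob_at_least(wins: int, games: int, p: float) -> float:
    """Probability of winning at least `wins` out of `games` independent games,
    each won with probability p (binomial model, draws ignored)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )

# Hypothetical player whose true win rate against the bot is only 30%.
TRUE_WIN_RATE = 0.30

# Chance of still winning 2 or more games in a 3-game sample.
lucky_short_run = prob_at_least(2, 3, TRUE_WIN_RATE)

# Chance of reaching 60 or more wins in a 100-game sample.
lucky_long_run = prob_at_least(60, 100, TRUE_WIN_RATE)

print(f"2+ wins out of 3:    {lucky_short_run:.3f}")
print(f"60+ wins out of 100: {lucky_long_run:.2e}")
```

Under these assumptions the short run comes out "lucky" roughly one time in five, while the 100-game threshold is practically unreachable by luck alone.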

p8q

May 17, 2024

0

#89

Zlomek_299 wrote:

Well... I don't think their ELO ratings are accurate. An 1100 ELO bot can sometimes be easier to beat than, for example, a 900 ELO bot.

For years I have tested engines in Arena, having them play each other automatically while I'm not at home. That way I calibrated engines whose ratings I wasn't sure about. Later I always found that the calibration was accurate.

You can do the same at home; all the software is free and easy to use. You will find that engines have accurate ratings, with an error margin of no more than 50 points.
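Calibrating one engine against another comes down to turning a match score into a rating gap. A minimal sketch, assuming the standard Elo logistic curve; the function name, the 64/100 score and the 2200 reference rating are made-up illustration values, and nothing here is part of Arena itself:

```python
import math

def elo_gap_from_score(points: float, games: int) -> float:
    """Rating gap (tested engine minus reference engine) implied by a match
    score under the standard Elo logistic model. Wins count 1, draws 0.5."""
    score = points / games
    if not 0.0 < score < 1.0:
        raise ValueError("a 0% or 100% score implies an unbounded rating gap")
    return 400.0 * math.log10(score / (1.0 - score))

# Hypothetical match: the engine being calibrated scores 64 points in 100
# games against a reference engine assumed to be rated 2200.
REFERENCE_RATING = 2200
gap = elo_gap_from_score(64, 100)
estimated_rating = REFERENCE_RATING + gap

print(f"gap: {gap:+.0f} points, estimated rating: {estimated_rating:.0f}")
```

The more games in the match, the less the estimate moves around, which is why long automated matches are useful for this.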

However, we humans can vary a lot in playing strength (one day we are stressed or worried, another day we are happy and inspired), and it's us who are inaccurate, not the engines.

The reason the 1100 bot could seem easier than the 900 bot for you could be that that particular game had a different character, and that's why you won vs the 1100; then vs the 900 you got distracted, or you weren't familiar with the position in that particular game.

The best way to know is to play 100 games vs the 900 engine and 100 games vs the 1100 engine. You also have to play both under the same conditions and on the same days: each day, 2 games vs the 900 and 2 games vs the 1100. One day play the 900 first, then the 1100; the next day do the reverse. That avoids playing one when you are tired and the other when you are fresh, and mixing the conditions reduces your human psychological and physical effects. You will see in the results that you won more often vs the lower-rated engine.
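The alternating schedule described above can be written out as a small sketch. The bot labels and the function name are placeholders chosen for illustration; 50 days at 2 games per bot per day gives the 100 games each.

```python
def daily_schedule(days: int, games_per_bot: int = 2) -> list[list[str]]:
    """Build a day-by-day playing order that swaps which bot comes first
    on alternating days, so fatigue is spread evenly over both bots."""
    bots = ["900 bot", "1100 bot"]  # placeholder labels
    plan = []
    for day in range(days):
        # Even days: 900 first. Odd days: 1100 first.
        order = bots if day % 2 == 0 else bots[::-1]
        plan.append([bot for bot in order for _ in range(games_per_bot)])
    return plan

plan = daily_schedule(days=50)
games_vs_900 = sum(day.count("900 bot") for day in plan)
games_vs_1100 = sum(day.count("1100 bot") for day in plan)

print(plan[0])
print(plan[1])
print(games_vs_900, games_vs_1100)
```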

If one day in the future you do such a test playing vs engines, I'd appreciate it if you shared your results here.

Yadav_Yogesh

May 17, 2024

0

#90

The 1300 elo bot just doesn't feel like 1300. I have played 1300 elo players and they are much better than the bots. On the other hand, the 1300 bot feels like 800.

p8q

May 17, 2024

0

#91

yogeshdas123450 wrote:

The 1300 elo bot just doesn't feel like 1300. I have played 1300 elo players and they are much better than the bots. On the other hand, the 1300 bot feels like 800.

You are right. As I said in post #85:

"the rating in this site versus bots rating doesn't follow a linear correspondence from 200 to 1500 rating"

Also, engines are calibrated against FIDE ratings, not this site's ratings. I don't know how FIDE and chess.com ratings compare between 200 and 1500. But I know the Isabel bot, rated 1600, consistently beats all my chess.com friends who are rated 1300-1400 on this site. I watched them play at home, so I can confirm it. I don't know how Isabel does against chess.com players rated above 1500; I don't have offline friends with that rating to watch.

p8q

May 17, 2024

0

#92

We also have on record hundreds of games between humans and engines in official FIDE tournaments, such as the Hiarcs and Fritz engines in Mercosur tournaments, etc. Plus thousands of engine vs human games on ICC, where Madchess v1.4 reached a rating of 2100 in classical and 2400 in blitz.

Therefore, you can download Madchess v1.4 for free and use it in Arena to calibrate other engines (such as the chess.com bots, which are Komodo 14.1 at different skill levels). I already did that, and I can confirm everything fits perfectly. But all of that is according to FIDE and ICC ratings, not chess.com ratings.

The CCRL engine list is calibrated with FIDE ELO, and there you can check that Hiarcs gets exactly the same rating as it got in official FIDE tournaments vs humans. In that list you can see Madchess v1.4 is rated 2200, against 2100 in classical on ICC. That means CCRL ratings are 100 points higher than ICC ratings. And you can compare ICC ratings with FIDE.

Babbington

May 17, 2024

0

#93

Is there a reason that the bot Elos are not live (I mean that they don't change as they win or lose against us)?

The puzzle ratings are live I think, so why are the bot ratings just a static thing?

bac4k

May 17, 2024

0

#94

It's a time thing: against bots you aren't rushed, and depending on what setting you have them on, you can take back bad moves.

Ritterschildt

May 17, 2024

0

#95

I played Black against Aron, hoping to see him play his best chess at 700 Glicko. In real life his playing is more similar to a 500, as I experienced in Livechess. Review gave him 550 Glicko.

psychohist

May 17, 2024

0

#96

p8q wrote:

DaTrueSliverwolf wrote:

I struggle against people 1000 rapid or higher but against Isabel I won easily

How many games did you play vs the Isabel bot, and how many did you win?

It's easy to be lucky and win a couple of times out of 3 games. A serious test means playing 100 games and winning at least 60% before claiming you can actually beat that bot.

I've seen my friend, rated 1400 here on chess.com, getting destroyed by the Isabel bot over and over again. In 50 games he won only 6, and now he doesn't want to play it anymore; he's traumatized.

Interesting. I win maybe half the time against Isabel bot, usually when I subjectively feel I'm playing better. I wonder if part of the issue is players that play mostly bots versus players that play mostly live opponents.

I mostly play bots. Today I tried a live game for the first time in years because I was intrigued about what was going on. Subjectively I felt I played well for me, partly because with the opponent spending time on moves, I felt more comfortable spending time on moves.

But the interesting thing was, although I was rated 1221 after the game, the game analysis said I played at a 1350 level (and the opponent at a 1450 level; we drew). So the computer-assigned rating is more in line with the bot ratings than my live rating was. When combined with things like getting to know the bots and the best way to play against each of them, this could account for much of the difference.

So why the difference between my actual rating and the computer estimated rating? Maybe chess.com computers just tend to rate things high. But there's another conjecture: maybe people who mostly play bots tend to get better, but their ratings don't change since playing bots doesn't normally change your rating. So maybe players who mostly play bots tend to be better than their live ratings indicate.

Ritterschildt

May 17, 2024

0

#97

As an amateur, I must learn to stay cool under time pressure. When I play the bots I'm relaxed and confident, but as soon as I play a real opponent (feeling the psychological tension), I start flipping out moves like a ticket vendor at a sold-out rock show.

Playing well under pressure is my current goal!

p8q

May 17, 2024

0

#98

Babbington wrote:

Is there a reason that the bot Elos are not live (I mean that they don't change as they win or lose against us)?

The puzzle ratings are live I think, so why are the bot ratings just a static thing?

It was like that around 5 years ago, but low-rated people got the Nakamura bot from 2750 down to 1800 (don't ask me how, but you can guess). After that experience chess.com decided to keep the bot ratings fixed and to stop bot games from affecting human ratings.

There's an old video on YouTube where you can see Nakamura playing the Nakamura bot, which is rated 1800 in the video, and he lost 10 consecutive times because the bot's real rating was 2750. Nakamura says several times in the video while he plays: "this bot is not 1800".

p8q

May 17, 2024

0

#99

To @psychohist and @Ritterschildt

Time control is key: when playing bots in challenge mode on this site there's no time pressure. The difference between playing engines at blitz and at classical is around 300 points.

@psychohist is rated almost 1300. Isabel is rated 1600. So it's expected that he beats Isabel 50% of the time at a classical TC.
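The arithmetic behind that 50% can be sketched with the standard Elo expected-score formula. This is only an illustration: the 300-point allowance is the estimate given in this post, and the function name is an assumption for the example.

```python
def expected_score(player: float, opponent: float) -> float:
    """Standard Elo expected score (between 0 and 1) for `player`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - player) / 400.0))

PLAYER = 1300             # roughly @psychohist's rating, per the post
ISABEL = 1600             # the bot's listed rating
NO_CLOCK_ALLOWANCE = 300  # the post's estimate for blitz vs classical

# Taking both ratings at face value.
raw = expected_score(PLAYER, ISABEL)

# Crediting the player with the no-time-pressure allowance.
adjusted = expected_score(PLAYER + NO_CLOCK_ALLOWANCE, ISABEL)

print(f"face value: {raw:.2f}, with allowance: {adjusted:.2f}")
```

At face value a 300-point gap predicts a score of about 15%; once the allowance cancels the gap, the expectation is an even 50%.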

Also, @Ritterschildt pointed out another key issue: psychology. Playing vs bots there's no pressure, because no rating points are at stake, no medals, no tournament; you are calm, and that makes you play much better. Therefore, your rating vs bots should be much higher for this reason alone. That plus time control, both factors combined, should make you play hundreds of points stronger vs bots.

About the game analysis, or game review: it's based on the character of the game. If the game wasn't tactically complicated, was dull, etc., it will give you a huge accuracy and rating estimate, even higher than GM level. But that's not because we played at GM strength; it's because we didn't get complicated positions during the game. Also, trading all the pieces at first sight and simplifying fast to the endgame will usually result in high accuracy (low centipawn loss) in the analysis. The estimated rating also depends on the ratings of the players: a game between two 1300 players will give a performance estimate of around 1300 (sometimes 1100, sometimes 1500, but always around 1300 unless the game was a disaster). That same game (exactly the same moves) played GM vs GM will give a performance rating of 2600. So don't trust that estimate. It's garbage.

psychohist

May 17, 2024

0

#100

p8q wrote:

To @psychohist and @Ritterschildt

Time control is key: when playing bots in challenge mode on this site there's no time pressure. The difference between playing engines at blitz and at classical is around 300 points.

@psychohist is rated almost 1300. Isabel is rated 1600. So it's expected that he beats Isabel 50% of the time at a classical TC.

Also, @Ritterschildt pointed out another key issue: psychology. Playing vs bots there's no pressure, because no rating points are at stake, no medals, no tournament; you are calm, and that makes you play much better. Therefore, your rating vs bots should be much higher for this reason alone. That plus time control, both factors combined, should make you play hundreds of points stronger vs bots.

About the game analysis, or game review: it's based on the character of the game. If the game wasn't tactically complicated, was dull, etc., it will give you a huge accuracy and rating estimate, even higher than GM level. But that's not because we played at GM strength; it's because we didn't get complicated positions during the game. Also, trading all the pieces at first sight and simplifying fast to the endgame will usually result in high accuracy (low centipawn loss) in the analysis. The estimated rating also depends on the ratings of the players: a game between two 1300 players will give a performance estimate of around 1300 (sometimes 1100, sometimes 1500, but always around 1300 unless the game was a disaster). That same game (exactly the same moves) played GM vs GM will give a performance rating of 2600. So don't trust that estimate. It's garbage.

While in principle time control should be key, in my case I'm impatient and find it difficult to use all the time even in rapid. If I could be patient enough to take advantage of classical time control, I would probably play better, but I'm not.

The average rating for my live game experiment was 1239. The average rating in the analysis was 1500. That's a significant difference. Either we played better than our numerical rating, or chess.com analysis overestimates ratings.

I agree the character of the game can affect things, because an uncomplicated game can make it easier to find the best move and not make a mistake. I'm still skeptical about reaching grandmaster levels of accuracy by the analysis rating at a sub-1500 level.