Just how strong is ChessUp bot level 12 "Stockfish elo: 1500" in actuality? My investigation

OmniusReaper556 · December 2, 2025, 10:05pm

I thought it’d be interesting to determine how strong bot level 12 (“Stockfish elo: 1500”), the first ChessUp 2 bot that is based on Stockfish, really is. I started by testing it against the chess.com bots and discovered that the ratings for the chess.com bots are much higher than they should be.

ChessUp 2 level 12 managed to mow through the chess.com bots up to the ones rated in the 2500 range. It had some difficulty with the Kosteniuk (“2561”) bot but managed to win a 6-game match against it. After it drew the first game against the Naroditsky (“2650”) bot, I decided to expedite things and moved it up to the Judit Polgar bot (“2735”). The Judit bot made short work of the ChessUp 2 level 12 bot, easily winning the three games they played.

So after that I thought it’d be interesting to use Fritz 20’s rated game feature to see what it thinks the ChessUp 2 level 12 bot’s rating is. Fritz 20’s rated game feature has you playing against a weakened Fritz 20 which can make its moves within 1 second each and lets you choose its strength between 1040 and 2440. The 2440 setting brutally outplayed CU2 L12, so from then on I paired it against lower-rated opponents.

In 20 games the highest rating the CU2 L12 achieved was 1967. With a performance of 10.5 points out of 20 games against opponents with an average rating of 1945, the CU2 L12 ended up with a rating of 1924. As I’m nowhere near that strength I cannot verify it, but it sounds like a reasonable estimate, and if that estimate is correct it’d mean the bot is over 400 elo stronger than what the “1500” label claims.
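As a rough cross-check of that figure, here is a minimal sketch of a performance-rating calculation. It assumes the standard logistic Elo model; how Fritz 20 actually computes its rating is not stated here, so its reported number need not match. The function name `performance_rating` is made up for illustration.

```python
import math


def performance_rating(avg_opponent: float, points: float, games: int) -> float:
    """Rating at which the expected score equals the achieved score.

    Assumes the logistic Elo curve E = 1 / (1 + 10 ** (-diff / 400)).
    This is an illustrative assumption, not Fritz 20's documented method.
    """
    fraction = points / games
    if not 0.0 < fraction < 1.0:
        # A 0% or 100% score has no finite solution under this model.
        raise ValueError("score fraction must be strictly between 0 and 1")
    # Invert the logistic curve to get the rating difference for this score.
    diff = -400.0 * math.log10(1.0 / fraction - 1.0)
    return avg_opponent + diff


# Numbers from the post: 10.5 points from 20 games, average opponent 1945.
perf = performance_rating(1945, 10.5, 20)
print(round(perf))
```

A score just above 50% lands only slightly above the average opponent rating, so under this model the result sits in the same range as the Fritz estimate rather than anywhere near 1500.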

Jeff · December 2, 2025, 10:47pm

Cool - would be curious what level 11 is like when compared to Fritz levels.

Jeff · December 2, 2025, 11:09pm

Then with that info we can make the jump smaller between 11 and 12.

OmniusReaper556 · December 3, 2025, 1:17am

That’ll be on my to-do list, but I think it might be a little bit trickier due to the endgame bug I discovered for bots level 11 and below: they don’t know how to mate with queen vs lone king, which could let the opponent escape with a draw. I discuss the bug in this thread: Major endgame flaw with the hardwired ChessUp bots (levels 1 to 11)?

Ian · December 4, 2025, 9:22am

Completely unscientific and anecdotal, but I have always found Chess.com bots much easier than an equivalent CU2 offline bot. I’m roughly in the 1200 range. I can beat chess.com bots up to around 1500, but on CU2 1100 is a good match.

I definitely feel that Chess.com bots are overrated; they make blunders and mistakes no human would ever make at those levels. CU2 feels more accurate.

OmniusReaper556 · December 7, 2025, 1:57pm

I used the same Fritz 20 rated game feature to test ChessUp level 13 (Stockfish elo: 1600) and ended up with a rating of 2053 after 20 games against an average opponent rating of 2113. While this is very close to the +100 elo you’d expect from going between ChessUp level 12 and 13, they both seem to be much stronger than the “Stockfish elo: 1500 and 1600” descriptions.

However, is it possible that the Fritz 20 rated game feature is overestimating the rating?

I think it is more likely that the estimate from the Fritz 20 rating tool is closer to the truth. Here’s one of the games, in which ChessUp level 13 managed to get 3 brilliancies according to the chess.com Game Review at max settings, and an 84% accuracy score vs black’s 78.8%:

[Event "Rapid 60min"]
[Site "?"]
[Date "2025.12.06"]
[Round "?"]
[White "ChessUp level 13"]
[Black "2200"]
[Result "1-0"]
[ECO "D45"]
[WhiteElo "1750"]
[BlackElo "2200"]
[WhiteFideId "-1"]
[BlackFideId "-1"]
[PlyCount "147"]
[GameId "2252103756472338"]
[TimeControl "3600"]

1. c4 {0} e6 {0} 2. Nc3 {9} d5 {0} 3. e3 {7} Nd7 {0} 4. Nf3 {9} c6 {0} 5. a3 {10} Ngf6 {0} 6. d4 {9} Be7 {0} 7. b3 {7} b6 {0} 8. Bb2 {7} dxc4 {0} 9. bxc4 {15} Bb7 {0} 10. e4 {7} e5 {0} 11. Be2 {7} exd4 {0} 12. Nxd4 {13} O-O {0} 13. Nf5 {10} a6 {0} 14. Qd3 {8} Re8 {0} 15. f4 {8} g6 {0} 16. Nh6+ {8} Kf8 {0} 17. Rd1 {9} Kg7 {0} 18. Qh3 {9} b5 {0} 19. e5 {7} Qb6 {0} 20. Nxf7 {12} Bc8 {0} 21. f5 {7} Kxf7 {0} 22. Rxd7 {12} Bxd7 {0} 23. exf6 {11} Bxf6 {0} 24. Qxh7+ {12} Kf8 {0} 25. fxg6 {10} Re7 {0} 26. g7+ {8} Bxg7 {0} 27. Rf1+ {9} Rf7 {0} 28. Rxf7+ {11} Kxf7 {0} 29. Qxg7+ {14} Kxg7 {0} 30. Na4+ {9} Qd4 {0} 31. Bxd4+ {16} Kg6 {0} 32. Nc5 {12} Be8 {0} 33. Bd3+ {12} Kh6 {0} 34. Kd2 {8} Bg6 {0} 35. Bf1 {10} Rd8 {0} 36. Ke3 {9} Bf7 {0} 37. g3 {8} Bxc4 {0} 38. Bxc4 {12} bxc4 {0} 39. a4 {8} Kh5 {0} 40. h3 {10} Re8+ {0} 41. Ne4 {9} Rf8 {0} 42. Bc3 {8} Rf1 {0} 43. Nf6+ {9} Kg5 {0} 44. a5 {7} Rxf6 {0} 45. Bxf6+ {14} Kxf6 {0} 46. Kd4 {8} Kf5 {0} 47. h4 {7} Kg6 {0} 48. g4 {7} Kf6 {0} 49. Kxc4 {10} Ke5 {0} 50. h5 {9} Ke6 {0} 51. g5 {9} Kf5 {0} 52. h6 {10} Kg6 {0} 53. Kc5 {9} Kxg5 {0} 54. h7 {11} Kf4 {0} 55. h8=Q {10} Ke3 {0} 56. Qc8 {11} Kd3 {0} 57. Qa8 {11} Kc2 {0} 58. Qxa6 {13} Kb2 {0} 59. Qf1 {10} Kb3 {0} 60. Qh1 {8} Kb2 {0} 61. Qf1 {8} Kc2 {0} 62. a6 {8} Kd2 {0} 63. Qh1 {9} Ke3 {0} 64. Qh6+ {12} Ke2 {0} 65. Qf6 {10} Kd1 {0} 66. Qe6 {9} Kc2 {0} 67. Qxc6 {11} Kc3 {0} 68. a7 {11} Kd3 {0} 69. a8=Q {8} Ke3 {0} 70. Qh1 {9} Kd3 {0} 71. Qe1 {8} Kc2 {0} 72. Qf2+ {8} Kb3 {0} 73. Qh2 {8} Kc3 {0} 74. Qf3# {10} 1-0

I must say, though, that a few times during my testing ChessUp level 13 played strangely horrific openings, like in this game where it looks like it was tilted:

[Event "Rapid 60min"]
[Site "?"]
[Date "2025.12.07"]
[Round "?"]
[White "ChessUp level 13"]
[Black "2027"]
[Result "0-1"]
[ECO "D20"]
[WhiteElo "2033"]
[BlackElo "2027"]
[WhiteFideId "-1"]
[BlackFideId "-1"]
[PlyCount "32"]
[GameId "2252431377506322"]
[TimeControl "3600"]

1. d4 {0} d5 {0} 2. c4 {10} dxc4 {0} 3. e4 {16} e5 {0} 4. Nf3 {10} exd4 {0} 5. Bxc4 {12} Bb4+ {0} 6. Nc3 {14} dxc3 {0} 7. Qa4+ {14} Nc6 {0} 8. Bxf7+ {14} Kxf7 {0} 9. O-O {13} Qe7 {0} 10. Bg5 {12} Nf6 {0} 11. Rae1 {11} cxb2 {0} 12. e5 {9} Bxe1 {0} 13. Rxe1 {11} h6 {0} 14. a3 {8} hxg5 {0} 15. Nxg5+ {13} Kg6 {0} 16. e6 {14} Kxg5 {0 Leto resigns} 0-1

But more often than not I feel that ChessUp level 13 played solid chess. According to chess.com’s Game Review it scored an average accuracy of 85.56% as white and 82.02% as black.

Next up for me is ChessUp level 11, although I’m unsure how well that will go because it currently has no endgame knowledge and can allow draws even if it’s up by a queen against a lone king. I think my best option in this case is to put it against an opponent that is rated 200 elo stronger, but finding that opponent might be tricky and I’m not sure it’s worth the effort since they plan on giving it some endgame knowledge.

OmniusReaper556 · December 7, 2025, 4:00pm

After 4 games in my test of the ChessUp level 11 (“1400”) bot, I have decided to terminate the test after this disappointing game, where level 11 was completely winning as black and gifted white a draw by repetition:

[Event "Rapid 60min"]
[Site "?"]
[Date "2025.12.07"]
[Round "?"]
[White "1403"]
[Black "ChessUp Level 11"]
[Result "1/2-1/2"]
[ECO "C00"]
[WhiteElo "1403"]
[BlackElo "1533"]
[WhiteFideId "-1"]
[BlackFideId "-1"]
[PlyCount "73"]
[GameId "2252477916024850"]
[TimeControl "3600"]

1. e4 {0} e6 {31} 2. Nf3 {0} d5 {8} 3. Bb5+ {0} c6 {8} 4. Bf1 {0} dxe4 {12} 5. Ng1 {0} e5 {9} 6. Qh5 {0} Qd4 {12} 7. Nc3 {0} Nf6 {17} 8. Qg5 {0} Be6 {15} 9. Qg3 {0} h6 {17} 10. d3 {0} exd3 {19} 11. Bxd3 {0} Nh5 {110} 12. Qe3 {0} Be7 {12} 13. Rb1 {0} g6 {12} 14. h3 {0} Bd7 {15} 15. Ne4 {0} Qd5 {12} 16. b4 {0} Qxa2 {17} 17. c3 {0} Nf4 {15} 18. Qf3 {0} g5 {10} 19. Ne2 {0} Nxd3+ {22} 20. Qxd3 {0} Be6 {13} 21. Be3 {0} Bc4 {17} 22. Qd1 {0} Bxe2 {17} 23. Nd6+ {0} Bxd6 {11} 24. Qxd6 {0} Qxb1+ {13} 25. Kxe2 {0} Qc2+ {10} 26. Ke1 {0} Qxc3+ {15} 27. Kf1 {0} Qa1+ {9} 28. Ke2 {0} Qb2+ {10} 29. Bd2 {0} Qd4 {26} 30. Qc7 {0} Qc4+ {9} 31. Ke3 {0} Qb3+ {8} 32. Ke2 {0} Qc4+ {9} 33. Kd1 {0} Qb3+ {8} 34. Kc1 {0} Qc4+ {9} 35. Kd1 {0} Qb3+ {8} 36. Kc1 {0} Qc4+ {9} 37. Kd1 {0} 1/2-1/2

When the hardwired bots (levels 1 to 11) get updated I’ll start a new test with level 11.

OmniusReaper556 · December 13, 2025, 5:08pm

So I’ve been testing the ChessUp 2 level 13 bot on Lichess against the community bots, and after 23 games it has a rapid rating of 2258. My observation is that it is very close in strength to the SimpleEval bot, which currently has a 2234 rapid rating and a 2167 blitz rating.

The question is, how does the Lichess rating compare to the estimated rating of 2053 given by the Fritz rating tool? I wish I had done the Lichess testing with a blitz time control, as the ChessUp 2 bots already play fast no matter what the time control is. But even if we were to assume the ChessUp 2 level 13 bot is only around 2160 in blitz (I’m assuming that in blitz it’d still be around the same strength as the SimpleEval bot), that’s still a 100 elo difference between the two estimates.

But now there are two estimates putting the ChessUp 2 level 13 bot at over 2000 elo, which is significantly higher than the bot’s description of “1600” elo. Perhaps the description should be updated or, preferably, the bot should be weakened without sacrificing its endgame knowledge.

OmniusReaper556 · December 21, 2025, 4:10pm

I have since learned that Lichess ratings are inflated by around 200 Elo compared to over-the-board ratings, so a 2200 Lichess Classical player is roughly equivalent to a 2000 FIDE rating or about 2050 USCF.

So ChessUp 2 level 12 would probably be roughly equivalent to a 1900 FIDE or 1950 USCF player, and ChessUp 2 level 13 would be roughly equivalent to a 2000 FIDE or 2050 USCF player.

Jeff · December 21, 2025, 6:48pm

FWIW - Lichess ratings are ~500 high vs FIDE/USCF and chess.com is 200~300 high vs FIDE/USCF in the mid-range. Somewhere on the internet there have been some efforts to map the distributions vs one another.

There is no standard absolute scale for Elo. The scores are completely dependent on the player pool and settings for the math. Each major system is slightly different.

Elo is meant more to give a probability of victory for a given rating difference between 2 players. It is just the underlying math for how ratings are adjusted after a match.

It is still valuable for measuring progress for players within one system. Bot ratings are another topic completely, because people tend to forget they should win 50% and lose 50% of the time against the same Elo. People tend to think “I beat a bot once so I am now higher than that Elo”. They also forget the repeated bot games they start and abandon after a mistake. Platforms also tend to exaggerate bot elos (for the enjoyment of the users). We try to be more accurate and try to adhere to ~chess.com elo.
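To make the "probability for a given rating difference" point concrete, here is a minimal sketch of the expected-score curve. It assumes the common logistic form with a 400-point scale; individual rating systems use their own variants, so treat the outputs as ballpark figures. The 1500 baseline in the loop is only an example value.

```python
def expected_score(own: float, opponent: float) -> float:
    """Expected points per game (win = 1, draw = 0.5) under the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - own) / 400.0))


# Expected score for a player who is `gap` points above a 1500-rated opponent.
for gap in (0, 100, 200, 400):
    print(gap, round(expected_score(1500 + gap, 1500), 2))
```

Equal ratings give 0.5, which is the 50/50 expectation mentioned above, and a 400-point edge gives roughly 10 points out of every 11.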

dna2rna · December 25, 2025, 11:14pm

My little boy is USCF 1200 and he concurs that the stockfish bots are definitely more challenging as a 1300-1500 than the chess.com bots in that range. I’ve watched him play both and he can easily knock out the chess.com bots but with the Chessup stockfish bots, he’s like “wow, it’s developing a trap I’ve never seen” and “yeah this 1500 level is too much for me.”

OmniusReaper556 · December 27, 2025, 5:54pm

Thank you for the testimonial. The past few days I’ve been testing a chess computer, the Millennium Chess Genius Pro 2024, against the Lichess community bots in rapid chess using its 15 second per move setting. After 20 games it is currently rated 2237, which is 21 points lower than ChessUp 2 level 13’s Lichess rapid rating of 2258. In my opinion this is yet more evidence that the ChessUp 2 level 12 and level 13 bots are way too strong at the moment, and I feel that either they should be weakened or at the very least their descriptions should be updated to something more reasonable.

Imagine you were 1500 USCF and you think, “Hey, the ChessUp 2 level 12 bot description says ‘1500 elo’, this should be a good matchup for me.” Then you get blasted away over and over, and unless you’ve heard that the bot is severely underrated and is actually playing closer to the 1900-2000 USCF level, you’d probably feel bad about your performance. And because it is so much stronger, I think it will be difficult to learn anything from it; it’d be much faster and easier to learn against an opponent that is closer in strength to you.

Criptix · December 27, 2025, 10:08pm

The bots are being revised and will become bots with a personality, like those on chess.com. This should be released in the coming months, I think. Then this problem will be solved.

OmniusReaper556 · December 27, 2025, 11:16pm

I love that idea, looking forward to it!
