Generative AI Apparently Also Sucks at Chess

While some continue to say that AI is taking over everything, we continue to see examples of why it’s not. In this case, the game of Chess.

One of the many claims we continue to hear is that AI is this omniscient technology that is simply better at everything than human beings. Claims that it is taking over your job, that it will cause humanity to go extinct, that it will make it impossible to tell what is real and what is not, that AI has already achieved sentience, and so on get thrown around a lot. This is specifically in reference to generative AI such as ChatGPT.

Yet, whenever I confront some of the people who truly believe that AI is going to take over everything, the claims quickly retreat from how it is taking over everything to how it will take over everything. Sure, it hasn't taken over everything yet, but give it time! It'll take over everything before you know it. So, the natural question becomes how long people think it will take. Those claims vary widely. Sometimes it's weeks away, sometimes it's within a few months, and other times it's before the end of the year. The punchline? I've been hearing these claims since at least late 2023. Have you ever seen a story and been reminded of the long-running claims that nuclear fusion is just a few years away? I ask because this is exactly the same vibe I get from the AI story in general.

Whenever I see an AI doomer proclaim that the AI takeover is nigh, I rarely see any pushback from media outlets. One way to push back is to have a general understanding of where AI actually is. My personal favourite go-to is to remind people that generative AI was designed to make things like text sound like they were written by a human being. There is a fundamental difference between an AI performing tasks so that they sound like they were done by a human being and an AI actually performing those tasks properly. This element continues to get lost in these debates and in a lot of people's understanding of AI. The hilarious part is that some people simply forget this and treat AI as if it were an omniscient, foolproof technology that can do anything.

As a result, I get to witness a bottomless well of hilarity as people harness the power of AI and unleash it onto the world to get things done and make tonnes of money in the process. For instance, in the world of law, I got to see a case where lawyers used ChatGPT to write their legal briefs, only for the judge to ask them afterwards why their briefs cited fake cases. I also got to see the downfall of DoNotPay, a legal AI supposedly designed to write legal documents.

From there, I also got to see AI take on journalism, which was equally hilarious. For instance, there was the issue of CNET using AI to write news articles, only to find out that those articles were highly error-prone. Gannett, for its part, tried to hide the fact that it used AI for its journalism, only to see that backfire in even more spectacular fashion.

Then there was the case where a "revolutionary" AI predicted that the next Nintendo console was going to be released in September… of 2024. An epic fail, to be sure.

Also, let's not forget the AI that falsely reported that a District Attorney had been charged with murder.

Some companies decided to walk back the expectations and, instead, chose to use AI to summarize existing articles (don't most news sites, like mine, already offer a summary at the top of their articles?). The hilarity didn't stop just because the expectations were dialed back a bit. For instance, Apple Intelligence falsely summarized that Luigi Mangione had shot himself. Then there was Google's AI Overviews, which recommended that users eat rocks and use glue to keep the cheese from sliding off their pizza.

Then, of course, there was the most recent thing we saw: Devin, an AI meant to replace software engineers. When tested on a range of tasks, it managed a success rate of a whopping… 15%. Ouch indeed.

Yes, watching AI fail at everything really is an endless well of hilarity, and it makes the AI doomers look even dumber. So, what juicy fail do I bring you today? AI failing at the game of Chess. Now, I know what some people out there might be thinking right now: "Wait, I thought computers have already beaten humanity at Chess." You'd be correct. There are computer programs and AI that perform at Chess far better than any human. Examples include the famous Deep Blue, AlphaZero, Stockfish, and many others. That computers can outplay humans at Chess is not in dispute.

So, the natural question becomes, "well, if computers are already better at Chess than humans, why is generative AI bad at Chess?" The simple reality is that programs like Stockfish and AlphaZero were specifically designed to figure out Chess. They are tailor-made for the game. Generative AI, however, isn't designed specifically to play Chess. Now, can you play Chess with generative AI? Sure. After all, it is designed to sound like you are talking to a person. However, as I've already stated, sounding like a person doing something and actually doing that thing accurately are two completely different things. This is exactly how you end up with generative AI doing a bad job of playing Chess.
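The distinction is worth making concrete. A dedicated Chess engine encodes the rules of the game explicitly, so illegal moves are rejected by construction, while a chatbot merely predicts plausible-sounding text with no rule checking at all. As a toy illustration (my own sketch, not code from any real engine), here is how even a minimal rule-based program validates a knight move – exactly the kind of check the chatbots in these videos are missing:

```python
# Toy sketch: rule-based move validation, the way dedicated Chess
# programs work by construction. A text-predicting chatbot has no
# equivalent constraint, which is how illegal moves slip through.

def knight_moves(square):
    """Return every square a knight can reach from `square`
    (algebraic notation like 'g1') on an otherwise empty board."""
    file, rank = ord(square[0]) - ord('a'), int(square[1]) - 1
    offsets = [(1, 2), (2, 1), (2, -1), (1, -2),
               (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
    moves = []
    for df, dr in offsets:
        f, r = file + df, rank + dr
        if 0 <= f < 8 and 0 <= r < 8:  # stay on the board
            moves.append(chr(f + ord('a')) + str(r + 1))
    return sorted(moves)

def is_legal_knight_move(src, dst):
    """A rule-based program simply cannot make a forbidden move."""
    return dst in knight_moves(src)

print(is_legal_knight_move("g1", "f3"))  # True: a normal opening move
print(is_legal_knight_move("g1", "g5"))  # False: knights don't move that way
```

An engine generates only moves that pass checks like this before it ever considers which one is best; a generative AI produces whatever text looks like a Chess move, legal or not.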

So, what exactly went wrong? Well, a YouTuber by the name of GothamChess – a well-known name in the Chess world – decided to start putting the AI chatbots to the test to see how well they'd perform. The results were quite hilarious. In one video, GothamChess decided to test the skills of a recent entry into the world of AI, DeepSeek. It's a 20-minute video that you can see on YouTube or in the embed below:

If you don't have 20 minutes to spare, the gist of it is that DeepSeek cheats… a lot. This comes in the form of obviously illegal moves. Whether it's moving pieces in ways that don't conform to the basic rules of Chess, inventing pieces partway through the game, or more, the AI simply does whatever it wants while tossing the rule book out the window. If that weren't enough, the AI also makes blundering moves, seemingly forgetting the positions of the pieces in the process.

Now, you might be thinking that DeepSeek is new, that there are probably bugs in the system to work out, and so on. You can't expect an AI to be perfect right out of the gate, right? Well, the problem is, DeepSeek is far from the only one that does these things. In another video, GothamChess pitted DeepSeek against ChatGPT, and the results were basically two AI chatbots cheating at Chess.

What's hilarious in all of this is that the bots also offer a lot of exposition explaining why their moves are so brilliant – even, from time to time, falsely declaring that they have won the game. So, if you thought that using ChatGPT – a chatbot that has been around for much longer – to play Chess would get you better results, well, think again.

These aren't the only two videos where GothamChess tested AI chatbots on their Chess prowess, but I think the point has been made here.

I’ve been quite mystified as to how people honestly believe generative AI like ChatGPT is going to take over everything. I mean, I see the wild claims all the time, yet I’m blessed with an endless supply of examples where generative AI fails miserably every single time. What’s more, after more than a year of covering the developments of AI off and on, I’m not seeing it get any better, either. For as long as people continue to believe that AI is somehow infallible, I’ll happily find more and more examples of why it’s not – all the while pointing and laughing at the claims.

Drew Wilson on Mastodon, Twitter and Facebook.
