On Tuesday afternoon, Anthropic launched Claude Plays Pokémon on Twitch, a live stream of Anthropic’s newest AI model, Claude 3.7 Sonnet, playing a game of Pokémon Red. It’s become a fascinating experiment of sorts, showcasing the capabilities of today’s AI tech and people’s reactions to them.
AI researchers have used all sorts of video games, from Street Fighter to Pictionary, to test new models — often more for amusement than utility. But Anthropic said that Pokémon proved to be a useful benchmark for Claude 3.7 Sonnet, which can effectively “think” through the sorts of puzzles the game contains.
Like OpenAI’s o3-mini and DeepSeek’s R1, Claude 3.7 Sonnet can “reason” its way through tough challenges, like playing a video game designed for children. While the model’s non-reasoning predecessor, Claude 3.5 Sonnet, failed the very beginning of Pokémon Red — exiting the player’s home in Pallet Town — Claude 3.7 Sonnet managed to win three gym leader badges.

The newest Claude still runs into trouble, though. Hours into the Twitch stream, the model was deterred by a rock wall, which it couldn’t walk through no matter how hard it tried.
One Twitch user summed up the situation this way: “who would win, a computer AI with thousands of hours put into programming it, or 1 rock wall?”
Eventually, Claude realized that it could navigate around the wall.
On the one hand, it’s frustrating to watch Claude traverse Pokémon Red with the speed of a Slowpoke, reasoning through each and every step with excruciating contemplation. Yet it’s also oddly compelling. The left side of the stream shows Claude’s “thought process,” while the right side shows real-time gameplay.
At one point, Claude attempted to locate Professor Oak inside his laboratory, but got confused, because there were other NPCs in the scene.
“I notice a new character has appeared below me — a character with black hair and what appears to be a white coat at coordinates (2, 10),” Claude wrote. “This might be Professor Oak! Let me go down and talk to him.”
Claude then mistakenly talked to an NPC other than the Professor, one the model had already spoken with several times. Some of the thousand-odd people in the Twitch chat started to get antsy. Others, particularly those who’d been watching the stream for more than a few minutes, were less worried.
“Guys chill,” one person wrote in the chat. “Before we exited and entered Oak’s lab like 10 times before understanding how to move on.”

For longtime Twitch users, the format of Anthropic’s stream might feel nostalgic. Over a decade ago, millions of people tried to play Pokémon Red at once in a first-of-its-kind online social experiment called Twitch Plays Pokémon. Each user could control the player character via Twitch chat, resulting in predictably chaotic gameplay.
Some AI researchers have cited Twitch Plays Pokémon as an inspiration for their work. In October 2023, Seattle-based software engineer Peter Whidden published a YouTube video detailing how he trained a reinforcement learning algorithm to play Pokémon. His AI spent over 50,000 hours playing the game before it learned to successfully navigate it. One challenge was that the AI preferred to admire the pixelated scenery instead of actually playing the game.
AI-powered “reenactments” of Twitch Plays Pokémon like Whidden’s and Anthropic’s are entertaining, but a little bittersweet at the same time. The original stream was such a pivotal moment in Twitch history because it brought people together in an unexpected way. Everyone was on the same team, working toward the goal of getting the player character to stop running in circles and actually progress through the game.
In 2025, it seems we’re no longer teammates, but spectators, watching an AI model try to play a game many of us got the hang of when we were five years old. It’s an AI-motivated microcosm of a larger trend: our experiences online are moving from shared, communal activities to more solitary ones.