Credit: ClaudePlaysPokemon Elevator Shanty by Kurukkoo
Disclaimer: like some previous posts in this series, this was not primarily written by me, but by a friend. I did substantial editing, however.
ClaudePlaysPokemon feat. Opus 4.7 has finally beaten Pokémon Red, fulfilling the challenge set over a year ago when LLMs playing Pokémon went briefly, slightly viral.
Victory Screen!
Let's get the throat-clearing out of the way: this doesn't make 4.7 a clear breakthrough in intelligence over 4.6 or 4.5. It's smarter, yes, as we'll discuss below, but not by something one could honestly call a big leap. Rather, incremental improvements have finally accumulated to the point of victory.
And to give other models their fair shake: after criticism over its elaborate harness,[1] GeminiPlaysPokemon has beaten Pokémon with progressively weaker harnesses, including about two months ago with a harness comparable to the one Claude uses.[2]
As such, this is a bit of a valedictory post, closing off the cycle of Claude playing Pokémon Red, relating anecdotes for the fun of it, and discussing improvements in Opus 4.7, as well as speculating a bit on what this has all meant.
Retrospective Anecdotes on Claude 4.5 and 4.6
Our last post, on Opus [...]
---
Outline:
(01:37) Retrospective Anecdotes on Claude 4.5 and 4.6
[... 10 more sections]
---
First published:
May 16th, 2026
Source:
https://www.lesswrong.com/posts/sehJYg5Yny9fvpbpt/a-year-late-claude-finally-beats-pokemon
---
Narrated by TYPE III AUDIO.
---