I loved this video by Peter Whidden. It’s a primer on reinforcement learning, a branch of machine learning, explained in a simple way using the classic Game Boy game Pokemon Red.
In reinforcement learning, an AI system is left to “learn through trial and error using feedback from its actions”.
In this video, Whidden has the AI play 20,000 games, covering more than five years of simulated game time. The feedback that Whidden provides the AI is a reward for certain actions in the game, such as exploring, catching a Pokemon, or defeating an opponent. Because there are only six possible inputs (Up, Down, Left, Right, Button-A, and Button-B), the AI gradually learns how to adjust its inputs to reach the rewards faster.
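To make the trial-and-error idea concrete, here is a minimal sketch of the same loop on a toy problem. It is not the setup used in the video: the 4x4 grid, the reward values, and the names `step`, `train`, and `greedy_path` are all made up for illustration, and the learning rule is plain tabular Q-learning, chosen only because it is short. The agent has four inputs, gets a small penalty for every move and a reward for reaching the goal, and slowly learns which input is worth pressing in each square.

```python
import random

# Hypothetical toy world: a 4x4 grid instead of a Game Boy screen.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
SIZE = 4
START, GOAL = (0, 0), (SIZE - 1, SIZE - 1)


def step(state, action):
    """Apply one input; return (new_state, reward, done)."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), SIZE - 1)  # walls: stay on the grid
    col = min(max(state[1] + d_col, 0), SIZE - 1)
    new_state = (row, col)
    done = new_state == GOAL
    reward = 1.0 if done else -0.01  # assumed values: goal reward, step penalty
    return new_state, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: estimate how good each input is in each square."""
    rng = random.Random(seed)
    q = {(r, c): {a: 0.0 for a in ACTIONS}
         for r in range(SIZE) for c in range(SIZE)}
    for _episode in range(episodes):
        state = START
        for _move in range(100):  # cap the length of each game
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore: random input
            else:
                action = max(q[state], key=q[state].get)  # exploit: best known
            new_state, reward, done = step(state, action)
            # Feedback: nudge the estimate toward reward + discounted future value.
            future = 0.0 if done else gamma * max(q[new_state].values())
            q[state][action] += alpha * (reward + future - q[state][action])
            state = new_state
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the best-known input from START and record the squares visited."""
    state, path = START, [START]
    for _move in range(max_steps):
        action = max(q[state], key=q[state].get)
        state, _reward, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Running the script prints the squares the trained agent walks through. Lowering `episodes` to a handful is a quick way to see how aimless the agent is before the feedback has had time to sink in.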
What makes this video so excellent is Whidden’s editing. He visualises the journey of the 20,000 simulations in a way that makes it really easy to understand how reinforcement learning works.
Pokemon Red was my first ever video game, and it’s one that holds a lot of nostalgia for me. It was awesome to see it used to explain reinforcement learning in a clear and simple way.