4 Milestones of A.I. Beating Humans at Our Own Games
Don’t be a sore loser, because when you play against A.I., you probably won’t win. The technology has come a long way since machines notched their first big win against humans in chess 26 years ago. Now, A.I. can learn just about any game from scratch, without being taught the rules, and reach world-championship-level skill in a matter of hours.
Not long ago, Google’s DeepMind A.I. lab made the case that there may be no game designed by humans that A.I. cannot dominate. A tailor-made model, AlphaStar, performed better than 99.8 percent of officially ranked human players at the fiercely competitive sci-fi war strategy game StarCraft II. Just three years earlier, professional players and researchers had been skeptical that an A.I. could achieve the strategic planning and other skills necessary to beat humans at one of the most fast-paced, high-stakes, unforgiving games ever invented.
It was only the latest example of a machine demonstrating the advanced cognitive abilities once thought of as exclusive to humans. In countless challenges over the years, A.I. researchers have shown that computers can outwit even the savviest players of the most complicated games. And behind the stories of the games themselves, lies a fascinating tale about how A.I. research has evolved and how the science behind A.I. is going backward in some ways.
A Tangible Win for Symbolic A.I.
These days, deep-learning neural networks, loosely modeled on the human brain, are the secret sauce behind how generative A.I. models process information.
Back in 1997, though, machine learning scientists were focused on developing symbolic A.I., under the hypothesis that a machine could produce intelligent behavior by processing and manipulating symbols representing real-world concepts according to a set of rules.
Symbolic A.I. was the operating theory behind IBM’s Deep Blue, the first computer to beat a world chess champion in official tournament-style play.
Deep Blue is a classic example of an expert system, a computer programmed with enough knowledge and logic to approximate expertise. The system relied on four subsystems: knowledge representation, rule-based decisions, search algorithms, and evaluation functions. Deep Blue possessed a massive database of previous chess matches and positions, translated into machine-readable symbols that it could search and evaluate according to preprogrammed rules and strategies of chess. Deep Blue could evaluate 200 million positions per second and plan out its game as far as 20 moves in advance.
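For the technically curious, the “search algorithm plus evaluation function” recipe can be sketched in a few lines. The Python below is a generic, simplified illustration of minimax search with alpha-beta pruning, the classic approach behind chess engines of Deep Blue’s era; it is not IBM’s code, and the toy game tree at the bottom stands in for real chess positions.

```python
# Minimal sketch of minimax search with alpha-beta pruning -- the classic
# "search plus evaluation" recipe behind chess engines of Deep Blue's era.
# Not IBM's code: the toy tree below stands in for real chess positions.

def minimax(node, alpha, beta, maximizing):
    # Leaves hold scores produced by an evaluation function (material balance,
    # king safety, and so on); internal nodes are lists of possible continuations.
    # A real engine also cuts the search off at a fixed depth and evaluates there.
    if not isinstance(node, list):
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, minimax(child, alpha, beta, False))
            alpha = max(alpha, best)
            if beta <= alpha:   # prune: the opponent would never allow this line
                break
        return best
    else:
        best = float("inf")
        for child in node:
            best = min(best, minimax(child, alpha, beta, True))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

# A tiny hand-built game tree; the numbers are hypothetical evaluation scores.
toy_tree = [[3, 5], [2, [9, 1]], [0, 4]]
print(minimax(toy_tree, float("-inf"), float("inf"), True))  # -> 3
```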
The chess-playing machine was far from perfect, both as a chess player and as a machine. Deep Blue lost its first match against world chess champion Garry Kasparov in 1996, in part because the machine did not have enough processing power. In interviews after his historic loss in the rematch the following year, Kasparov criticized both his own play and the machine’s. Chess professionals pointed out that both players missed obvious opportunities to win or gain an advantage over their opponent. A.I. chess engines can now run on a phone rather than a purpose-built supercomputer and beat Deep Blue with ease.
In the end, Deep Blue fueled hopes of one day achieving artificial general intelligence and also displayed the limits of symbolic A.I. in achieving true comprehension.
Watson Beats Jeopardy!
Ten years after Deep Blue vs. Kasparov, IBM was looking for its next challenge. Algorithms for natural language processing (NLP) had come a long way by 2006, and Jeopardy! star Ken Jennings had become a cultural phenomenon by crushing records during his 74-game winning streak.
IBM decided to hop on the cultural wave, assembling a two-dozen-member team to design a program that could beat the iconic quiz show’s champions. Over four years of trial and error, the team put together a machine that combined multiple A.I. methodologies into one system that could parse natural-language clues, search its database for relevant answers, evaluate those answers with scoring algorithms, and rank them on a confidence scale.
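As a rough illustration of that “generate, score, and rank by confidence” pattern, here is a hedged sketch. It is not IBM’s DeepQA code; the candidate answers, feature values, and weights are invented for the example, and a real system learned its weights from data and used hundreds of scoring signals, not three.

```python
# Toy illustration of ranking candidate answers by a combined confidence score.
# The features and weights are hypothetical, not IBM's actual scoring signals.

def rank_candidates(candidates):
    ranked = []
    for answer, f in candidates.items():
        confidence = (0.5 * f["text_match"]            # how well evidence passages match the clue
                      + 0.3 * f["source_reliability"]  # how trustworthy the supporting sources are
                      + 0.2 * f["type_match"])         # whether the answer is the right kind of thing
        ranked.append((answer, round(confidence, 2)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical candidates for the clue "This 'Father of Geometry' wrote the Elements."
candidates = {
    "Euclid":     {"text_match": 0.9, "source_reliability": 0.8, "type_match": 1.0},
    "Pythagoras": {"text_match": 0.6, "source_reliability": 0.8, "type_match": 1.0},
    "Archimedes": {"text_match": 0.4, "source_reliability": 0.7, "type_match": 1.0},
}
print(rank_candidates(candidates))  # Euclid comes out on top with the highest confidence
```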
In February 2011, the world watched IBM’s Watson trounce its human opponents, Jennings and fellow former champion Brad Rutter, racking up more than $77,000 over three days of competition. Watson’s success in parsing and responding in natural language immediately stoked hopes and fears that viable artificial general intelligence was imminent. However, Watson’s purpose-built ability to answer trivia questions turned out not to be transferable to other applications. IBM tried for years to generate revenue by selling Watson as a medical assistant for cancer diagnosis and treatment planning, but poor data quality and high variability in medical diagnoses severely limited its success.
While Watson represented a big step forward in natural language processing and contributed to the science behind advancements like Apple’s Siri, its logic- and rules-based processing methods fell short of the expectations for AGI.
DeepMind Cracks Go
Four years later, computer scientists at Google’s DeepMind lab proved the potential of deep learning neural networks by designing a system that beat humans at one of the most challenging board games ever invented: Go.
Go is an ancient game played by millions in Asia. To win, a player has to place their stones strategically to surround more territory on the board than their opponent, using a mix of offensive and defensive maneuvers. A Go board is a 19×19 grid, meaning there are 361 possible opening moves, compared with just 20 possible opening moves in chess. The number of legal board configurations in Go is estimated to exceed the number of atoms in the observable universe. So unlike chess, which a computer can win largely through brute-force search, evaluating enormous numbers of positions at every juncture of the game, no realistic amount of processing power can brute-force a game of Go. Go players say they rely on intuition as much as strategy, since placing a stone can have consequences that aren’t apparent until much later in the game.
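To put rough numbers on that gap (standard back-of-the-envelope estimates from the game-A.I. literature, not figures from this article), the size of a game’s search tree grows roughly as the branching factor b (moves available per turn) raised to the typical game length d:

```latex
\text{Chess: } b \approx 35,\; d \approx 80 \;\Rightarrow\; b^{d} \approx 35^{80} \approx 10^{123}
\qquad
\text{Go: } b \approx 250,\; d \approx 150 \;\Rightarrow\; b^{d} \approx 250^{150} \approx 10^{360}
```

For comparison, the observable universe is thought to contain roughly 10^80 atoms, and the number of legal Go positions alone has been calculated at about 2.1 × 10^170.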
For those reasons, many thought it would be a long time before a machine could learn the game well enough to be competitive. In 2016, however, DeepMind’s AlphaGo program beat world champion Lee Sedol four games to one in a five-game match. To Sedol’s credit, he remains the only human ever to defeat AlphaGo in a game across the program’s 74-game career.
DeepMind continued improving on AlphaGo after the Sedol match. Unlike the original AlphaGo, which learned the game by studying countless professional matches, its successor, AlphaGo Zero, developed its strategies purely by playing against itself, without ever training on a game played by humans. A later iteration, AlphaZero, used the same self-play method to teach itself chess, shogi, and Go completely from scratch, and its successor, MuZero, reached the same heights without even being told the rules of the games.
A.I. vs. the eSport Dota 2
The same year that DeepMind’s AlphaGo beat Lee Sedol, OpenAI (pre-ChatGPT) decided to take on an A.I. gaming challenge of their own. They gave themselves exactly one year to train an A.I. that could beat humans at a fast-paced online computer game, as memorialized in the film Artificial Gamer.
The OpenAI team chose the player vs. player battle game, Dota 2, as their testing ground due to the game’s popularity, complexity, and ease of integration with an A.I. agent via the game’s application programming interface (API).
Dota 2 is a live online game in which two teams, each consisting of five players, duke it out to destroy the other team’s base. Each player chooses one of 124 hero characters, each with unique abilities to deal damage or support their teammates, leading to significant variability in how each game plays out. Each team starts at its own base and has only limited visibility of the map throughout the game, meaning in-game decisions must be made with incomplete information.
OpenAI’s first success came in 2017, when its A.I. agent beat a professional Dota 2 player in a live one-on-one match. The agent had been trained through reinforcement learning, playing against itself for two weeks in real time and improving gradually through a system that rewarded it for successful play, such as eliminating its opponent.
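The core reinforcement-learning idea can be shown with a toy example. The Python below is a generic sketch (a simple trial-and-error learner, not OpenAI’s training code), with made-up action names and reward probabilities; the point is just that actions which earn reward become the actions the agent prefers.

```python
import random

# Toy reinforcement learning: try actions, observe rewards, and shift behavior
# toward whatever earned reward. Generic sketch only -- not OpenAI's code; the
# action names and hidden reward probabilities are hypothetical.

ACTIONS = ["push_lane", "farm_gold", "attack_enemy"]
TRUE_REWARD_PROB = {"push_lane": 0.3, "farm_gold": 0.5, "attack_enemy": 0.7}  # unknown to the agent

def train(steps=5000, epsilon=0.1, lr=0.1):
    value = {a: 0.0 for a in ACTIONS}  # the agent's running estimate of each action's payoff
    for _ in range(steps):
        # Explore a random action occasionally; otherwise exploit the current best estimate.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(value, key=value.get)
        reward = 1.0 if random.random() < TRUE_REWARD_PROB[action] else 0.0
        value[action] += lr * (reward - value[action])  # nudge the estimate toward the observation
    return value

print(train())  # "attack_enemy" ends up with the highest estimate, so the agent favors it
```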
By June 2018, OpenAI had a team of five bots that could work together to play a full-fledged game against human teams. The “OpenAI Five” (as the bots were called) lost both of their matches against professional teams at the game’s international championship event that year. In 2019, however, the OpenAI Five came back and defeated OG, the team that had won that 2018 world championship.
The OpenAI Five’s success at Dota 2 was a watershed moment for artificial intelligence and the power of reinforcement learning. OpenAI used the same technology to train a robotic hand called Dactyl, which in 2019 learned the dexterity to solve a Rubik’s Cube single-handedly.
OpenAI has said that its goal with both projects is to explore how A.I. can become a more general-purpose tool capable of handling complicated tasks, such as surgery.
Best of Both Worlds: Neuro-Symbolic A.I.
Despite the shortcomings of AlphaGo’s predecessors, Deep Blue and Watson, A.I. technology today still relies on some of the same principles that informed their design.
AlphaGo’s ability to learn would not have been possible without neural networks, but it pairs them with a more classical, symbolic-style technique called Monte Carlo Tree Search to assess the outcomes of candidate moves and pick the best one. That combination of neural networks, symbolic reasoning, and natural language processing formed the foundation for recent breakthroughs in large language models.
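A hedged sketch of that blend: in AlphaGo-style systems, each candidate move is scored by combining the average value that tree search has found so far with the neural network’s prior belief in the move, scaled down as the move gets explored more. The Python below illustrates that PUCT-style selection rule in simplified form; it is not DeepMind’s implementation, and the move statistics are made up.

```python
import math

# Simplified PUCT-style move selection: blend the value found by search with
# the neural network's prior, favoring under-explored moves. Illustrative only;
# not DeepMind's code, and the statistics below are invented.

def select_move(stats, c_puct=1.5):
    total_visits = sum(s["visits"] for s in stats.values())
    best_move, best_score = None, float("-inf")
    for move, s in stats.items():
        avg_value = s["value_sum"] / s["visits"] if s["visits"] else 0.0   # what search has learned
        exploration = c_puct * s["prior"] * math.sqrt(total_visits) / (1 + s["visits"])
        score = avg_value + exploration
        if score > best_score:
            best_move, best_score = move, score
    return best_move

# Hypothetical search statistics for three candidate points on a Go board.
stats = {
    "D4":  {"prior": 0.40, "visits": 120, "value_sum": 66.0},
    "Q16": {"prior": 0.35, "visits": 90,  "value_sum": 51.0},
    "K10": {"prior": 0.25, "visits": 30,  "value_sum": 12.0},
}
print(select_move(stats))  # -> "Q16": strong value so far and still relatively under-explored
```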
As these models continue to evolve beyond the challenges of traditional gaming, OpenAI CEO Sam Altman has predicted that the next big breakthrough for A.I. will be winning humans’ hearts and minds through persuasion and manipulation.