The Practical Value of Game AI

What do we gain by automating a pastime?

Thomas Nield
Towards Data Science
Jul 10, 2019 · 10 min read


It is a bad idea to intuit how broadly intelligent a machine must be, or have the capacity to be, based solely on a single task. The checkers-playing machines of the 1950s amazed researchers and many considered these a huge leap towards human-level reasoning, yet we now appreciate that achieving human or superhuman performance in this game is far easier than achieving human-level general intelligence. In fact, even the best humans can easily be defeated by a search algorithm with simple heuristics. Human or superhuman performance in one task is not necessarily a stepping-stone towards near-human performance across most tasks.

— Luke Hewitt

What is it about board games and video games that attracts so much artificial intelligence research? It started with the checkers-playing algorithms of the 1950s, when researchers were amazed at the “thinking” these programs exhibited. Chess followed, remaining a focal point of AI research all the way into the 2000s. Fast-forward to 2015, when a viral video of a neural network playing Super Mario boosted mainstream interest in video game AI, carrying it beyond the video game developers’ niche and into everyday data science banter.

Just this week, Facebook added to the growing list of poker-playing AI algorithms.

When DeepMind entered the picture, things got more interesting. After the company was acquired by Google in 2014, the media brought increasing attention to it and its artificial intelligence applications. First there was the famous AlphaGo algorithm, which beat world champions at the ancient board game Go in 2016 (there is also a documentary on Netflix). Then AlphaZero revived interest in chess algorithm research, pairing a deep-learning approach with Monte Carlo tree search instead of the alpha-beta pruning used by IBM’s Deep Blue in the 1990s and by Stockfish in the 2000s.

DeepMind then turned its attention away from board games and onto video games, taking on StarCraft II (much as OpenAI took on Dota 2, the successor to Defense of the Ancients). At this point, it became obvious a pattern was emerging.

DeepMind is spending an awful lot of time and resources automating a recreational activity, and history tells us this rarely translates into practical real-world value. If it did, common sense suggests hardcore video gamers would be in high demand for their strategic brilliance. If an algorithm is valuable because it wins at StarCraft, should a human video gamer not be just as valuable to strategic functions in corporations and the military?

“I’m not sure the ideas in AlphaZero generalize readily. Games are a very unusual thing.” — Pedro Domingos

I know what some of you readers are thinking. A gamer is a human, and if an algorithm can play a game as well as (or better than) a human, then it has replicated human intelligence. This is a fallacious way to think: a calculator can likely solve 24653546734 + 5345434534 quicker than you, but that does not mean it has replicated or outperformed human intelligence. Just because an algorithm has been optimized for one task (e.g. playing StarCraft) does not mean it can be optimized for any task. Without explicit heuristics and hard-coding, algorithms fail the moment they venture outside a single narrowly-defined task.

Another counterargument is that the objective is not to solve the game as efficiently as possible, but to have the machine “learn” how to solve the game without explicit guidance and heuristics. I understand that objective, but I think it is undermined by the fact that the machine is trained to do only one task, and in a brute-force way (more on that later).

Gaming seems to be the main emphasis and focus at DeepMind. If you look at a public list of their projects, the great majority are game-related. Why is that? And what is the point of running massive computations over thousands of years’ worth of gameplay… only to beat a hardcore gamer who can master the same game in a matter of weeks, on far less data?

What is more, heuristics can create a decent AI much more cheaply. We all know the objective is to have a machine “learn” a task without being explicitly coded for it, but is it not ironic that we train and train and train just so it can learn one task before ever executing it, resulting in a slow and inefficient implementation? Old-school heuristics, meanwhile, skip the learning stage and get to work immediately and effectively.

“Most real-world strategic interactions involve hidden information. I feel like that’s been neglected by the majority of the AI community.” — Noam Brown, AI Research Scientist at Facebook

This gaming fixation within artificial intelligence research is hard to ignore, and I think it is worth exploring why it exists. Games hold three main advantages for AI research, which we will cover:

  1. Games are completely self-contained problems where all possible events, variables, and outcomes are known.
  2. Data can be generated in games through randomized gameplay.
  3. Games can have deterministic outcomes thanks to predictable and controlled environments.

When Games Capture Real-World Problems

I have to be fair. DeepMind has done notable work on protein folding with AlphaFold, and recently received some recognition for those contributions. Other projects have found industry applications too. So DeepMind has done more than expensively replace gamers.

I will also add that when you look beyond deep learning and consider other “AI” algorithms, there are definite solution overlaps between games and practical problems. This is especially the case in operations research. For example, a tree search/linear programming algorithm that solves a Sudoku can also be formulated to solve physical constraint problems like scheduling. I talk about this in a separate article called Sudokus and Schedules, and also cover it in an accompanying video.
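
To make that overlap concrete, here is a minimal sketch of a Sudoku solver as a backtracking tree search. This is a generic illustration rather than the exact formulation from the article above, and it assumes the grid is a 9×9 list of lists with 0 marking empty cells:

```python
def candidates(grid, r, c):
    """Return the digits that can legally go in cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)  # top-left corner of the 3x3 box
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]

def solve(grid):
    """Depth-first search: branch on an empty cell, prune dead ends,
    backtrack. Fills the grid in place; returns True if solvable."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in candidates(grid, r, c):
                    grid[r][c] = d
                    if solve(grid):
                        return True
                    grid[r][c] = 0  # undo the guess and backtrack
                return False  # no legal digit fits: prune this branch
    return True  # no empty cells remain: solved
```

Swap the digits for shift assignments and the row/column/box rules for staffing constraints, and the very same branch-and-prune search tackles a scheduling problem.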

This same tree search approach can be adapted into an alpha-beta pruning algorithm to win chess and other adversarial turn-based games. This was in fact the approach used in IBM’s Deep Blue in the 1990s, as well as in Stockfish in the 2000s.
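
As a minimal, runnable sketch of the idea, here is minimax with alpha-beta pruning on tic-tac-toe, standing in for chess (the board is a 9-element list holding "X", "O", or None):

```python
# Every winning line on a 3x3 board, as index triples.
LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def alphabeta(board, player, alpha, beta):
    """Best achievable score from X's perspective: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if None not in board:
        return 0  # board full: draw
    best = -2 if player == "X" else 2  # sentinel outside the [-1, 1] range
    for i in range(9):
        if board[i] is None:
            board[i] = player
            score = alphabeta(board, "O" if player == "X" else "X", alpha, beta)
            board[i] = None
            if player == "X":
                best = max(best, score)
                alpha = max(alpha, best)
            else:
                best = min(best, score)
                beta = min(beta, best)
            if alpha >= beta:
                break  # prune: the opponent will never allow this line
    return best

# From an empty board, perfect play is a draw:
print(alphabeta([None] * 9, "X", -2, 2))  # -> 0
```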

You can create game-like Monte Carlo simulations and call those “AI” as well. For those of you unfamiliar, Monte Carlo algorithms utilize randomness to achieve an objective. For example, if you take some simple random distributions describing how long it takes to serve a customer (a normal distribution) and how frequently customers walk in (a Poisson distribution), you can build a customer queue simulation like the sketch below:
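
Here is a minimal sketch of such a simulation. The arrival rate and service-time parameters are invented purely for illustration, and the Poisson arrival process is modeled through its equivalent, exponentially-distributed gaps between customers:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

n_customers = 1000
# Poisson arrivals == exponential gaps between customers (mean 2 min here).
arrival_gaps = rng.exponential(scale=2.0, size=n_customers)
# Service times: normal with mean 1.5 min, clipped to stay positive.
service_times = rng.normal(loc=1.5, scale=0.5, size=n_customers).clip(min=0.1)

arrivals = np.cumsum(arrival_gaps)   # absolute arrival times
wait_times = np.zeros(n_customers)
server_free_at = 0.0

for i in range(n_customers):
    start = max(arrivals[i], server_free_at)  # wait if the server is busy
    wait_times[i] = start - arrivals[i]
    server_free_at = start + service_times[i]

print(f"average wait: {wait_times.mean():.2f} min, "
      f"max wait: {wait_times.max():.2f} min")
```

Rerun it with a slower server or a busier door and you immediately see how the queue behaves under different staffing assumptions, which is exactly the kind of practical question this technique answers.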

So there are places where board games and video games overlap with practical real-world problems. And sure, you can use neural networks to attack all of these problems, but in practice why would you want to, when incumbent algorithms do a much better job at much less expense?

There is a point where it feels like we are building AI to beat games for their own sake, which is fine; that is the prerogative of research. However, it is perplexing when the creators of these algorithms allege there is untapped potential for them to solve real-world problems at an extraordinary AGI scale, while staying stuck in a loop of finding the next game to automate rather than tackling an industrial problem.

When Games Do Not Capture the Real World

Back in the 1990s there was a lot of attention on IBM’s Deep Blue, a chess-playing machine that used alpha-beta pruning (a form of tree search). Unfortunately, this chess algorithm never found a significant use case in the real world, despite the hype and the anthropomorphizing language of human players and the media. In reality, alpha-beta pruning was nothing more than a well-engineered search algorithm, good only for chess and other turn-based games.

AlphaZero likewise made a lot of headlines in late 2018, with reactions remarkably similar to those Deep Blue received in 1996. One article in particular stood out.

Note carefully the choice of words in that article, which anthropomorphizes the algorithm with terms like “human-like,” “creativity,” and “intuition.” Can we be real here? This is simply a better chess algorithm, one that fits a model to randomized self-play data rather than relying on tree search alone, and the humanizing words make it sound like a person rather than a calculator.

I thought it was strange that the article glossed over the massive Monte Carlo generation of data used for training, in which the algorithm plays countless random games against itself and a regression is then performed on that data to estimate the optimal move on a given turn. Yet the same article criticized incumbent algorithms like Stockfish for “calculating millions of possible outcomes as it plays” and being computationally expensive. Is this not the pot calling the kettle black? Both Stockfish and AlphaZero require heavy computation and generate enormous numbers of outcomes, and an argument can be made that AlphaZero requires much more.
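
To see what “generating data through random self-play” means at its simplest, here is a toy sketch that scores each legal tic-tac-toe move by averaging the outcomes of random games played out from the resulting position. AlphaZero’s actual pipeline is far more sophisticated (a neural network guiding Monte Carlo tree search rather than raw rollouts and a simple average), but the raw material is still self-play:

```python
import random

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def rollout(board, player):
    """Finish the game with uniformly random moves; return the result
    from X's perspective: +1 win, 0 draw, -1 loss."""
    while winner(board) is None and None in board:
        board[random.choice([i for i in range(9) if board[i] is None])] = player
        player = "O" if player == "X" else "X"
    w = winner(board)
    return 0 if w is None else (1 if w == "X" else -1)

def best_move(board, n=2000):
    """Pick X's move whose random rollouts score best on average."""
    def score(i):
        # Copy the board, place X at i, then let O move first in the rollout.
        samples = [rollout(board[:i] + ["X"] + board[i+1:], "O")
                   for _ in range(n)]
        return sum(samples) / n
    return max((i for i in range(9) if board[i] is None), key=score)

# On an empty board, rollout statistics typically favor the center (index 4):
print(best_move([None] * 9))
```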

I will admit, the computation methods and their stages of training are different. But I think the article is highly misleading for critiquing incumbent algorithms’ heavy computation when AlphaZero demands the same. AlphaZero, like all of DeepMind’s gaming-related AI projects, generated its data by playing random games against itself, which is rarely possible in the real world. This is why so many data scientists blame underperforming deep learning models on “not having enough data.” When you have to rely on an enormous amount of data like that, an argument can be made that we should be focusing on using less data, not more.

And to what end? To create a better chess algorithm with a massive data-generation and training overhead? That is fine; it really is an accomplishment for chess research and knowledge. But let’s not kid ourselves and start saying Skynet is now possible, contingent on a faucet of unlimited labeled data to train with.

AlphaZero, like all of DeepMind’s gaming-related AI projects, generated data by playing random games with itself, which you cannot do in the real world.

Why Game AI Fails in the Real World

Common sense points to three reasons why game AI struggles to find utility in the real world:

  1. Games are completely self-contained problems where all possible events, variables, and outcomes are known. In the real world, uncertainty and unknowns are everywhere and ambiguity is the norm.
  2. Data can be generated in games through randomized gameplay, but this cannot be done for most real-world problems. You can generate data with simulations (like the customer queue example above), but the data is only as good as the simulation, and a simulation accurate enough to produce useful data likely holds the predictive value you were after in the first place.
  3. Games can have deterministic outcomes and expose all necessary information (other than what the adversarial player will do next), whereas real-world problems can be highly nondeterministic and offer only partial information.

It is for these reasons that games like Go, chess, StarCraft, and Dota 2 are easy to build AI for, yet that AI is difficult to put to use in the real world. On top of that, games have room for error, and poor moves can easily go unnoticed. In real-world applications there is far less tolerance for error, unless the application is non-critical, like pushing ads or social media posts. And again, the real world is often going to prefer heuristics over experimental deep learning that has struggled to be logistically practical.

It is important not to fall victim to the fallacy of composition, where we are quick to generalize from a small success and mistakenly prescribe that solution for the bigger problem. The data-centered approach is hitting limitations, and perhaps we should be finding AI models that use less data rather than demanding more. Joseph Sirosh, corporate VP for AI and research at Microsoft, puts it best:

“If you’re in an environment where there is unlimited data available to learn, then you can be incredibly great at it, and there are many, many ways you can be great at it. The smarts about AI comes when you have limited data. Human beings like you and me, we actually learn with very limited data, we learn new skills with one-shot guidance. That’s really where AI needs to get to. That’s the challenge. We are working towards enabling true AI.”

From another perspective, one really should consider the P versus NP problem. I am surprised that contemporary AI literature seems to eschew this topic, because it really is the key to unlocking truly effective AI. I highly recommend watching this video; it is well worth the 10 minutes.

Although it has been neither proven nor disproven, more scientists are coming to believe that P does not equal NP. This is extremely inconvenient for AI research, because it means computational complexity will always limit what we can do. I sometimes wonder if today’s data-driven AI models are a frustrated attempt to move away from heuristics and work around the P versus NP problem. The irony is that optimizing loss in machine learning sits squarely in that same problem space, and that is one of the major reasons machine learning is so hard.

Despite all of these limitations, if DeepMind still insists on pushing deep learning, it could at least start applying it to other domains. I would love to see DeepMind tackle the Traveling Salesman Problem and other industrial problems with deep learning (as done in this paper), rather than staying stuck in the domain of video games and safe problems. AI research on games is cool and educational, but it would be nice to see some variety, mixing fun problems with the difficult real-world problems industries face every day. There should be more things like protein folding, and less video gaming.

Then again, real-world problems are likely not as sexy. Can you really use the Traveling Salesman Problem as a publicity stunt? Or is it cooler to have an algorithm win an adversarial match against a world champion of [put game here]? I guarantee you, the latter is more likely to make headlines and bring in the VC funding.


Author of Essential Math for Data Science (O’Reilly) and Getting Started with SQL (O’Reilly)