The world’s leading publication for data science, AI, and ML professionals.

Zero-sum Games and Mixed Strategies

Modelling strategic interactions demands we account for uncertainty

I’ve repeated many cliche’s about why I believe that game theory is essential to data science in my earlier articles. I’m just going to repeat my favorite: "game theory is like probability with incentives".

A few of my earlier stories on Game Theory can be found here if you’re interested:

  • Game Theory: The Basics

Game Theory

  • An introduction to the Nash Equilibrium

An Introduction to the Nash Equilibrium

  • Applications of the Nash Equilibrium

Applications of the Nash Equilibrium

  • Dominated Strategies in Game Theory Explained

Dominated Strategy in Game Theory Explained

What is a Mixed Strategy?

Let’s return to the reason that the Nash Equilibrium is such a useful solution concept: it will always give us an answer in every finite game. This is Nash’s existence theorem:

"Any game with a finite number of players and a finite number of actions has a mixed-strategy Nash equilibrium."

But what does "mixed" mean in this context? Basically, it means that in many scenarios, we need to introduce probability to our models of strategic interactions for those models to make sense.

The classic example used to illustrate the point is "matching pennies", which in turn is often recast in terms of a football game. Personally I prefer the football version, but here’s the link to an explanation about matching pennies if you’re interested:

Matching pennies – Wikipedia

Regardless of your preferred metaphor, both are zero-sum games because each players gain/loss of utility is exactly balanced by the other player’s change in utility. The sum of the change in utility for both players equals zero.

The Penalty Kick Game:

Photo by Sven Kucinic on Unsplash
Photo by Sven Kucinic on Unsplash

Penalty kicks occur in soccer (football) games as a punishment for a particular kind of foul, or as a way to resolve draws in games that must have a winner. During the penalty kick, the goal is defended only by the goal keeper. The kicker takes the shot from a designated point 12 yards from the goal line.

Payoffs are such that the kicker would like to make the shot, and the keeper would like to save it.

Crucially, the ball is shot so fast in a penalty kick that, if the keeper waits to see where it is going before beginning to move to make the save, they will never succeed. Therefore, the keeper must react to where they think the ball is going to go, making this effectively a simultaneous move game.

The kicker may shoot the ball in virtually infinite ways, and the keeper may jump to save it in virtually infinite ways. To model this situation as a game, we will restrict the action set to just two options: left or right. We will define "right" as "right from the kickers point of view" and left as "left from the kickers point of view". Under this framework, if the actions of the players match, then the keeper jumped in the correct direction and made the save. If the actions mismatch, the keeper jumped in the wrong direction and the kicker scored.

The payoff matrix:

The following is a payoff matrix consistent with this story. If we use our usual technique to look for a Nash equilibrium, we find the following:

image by author
image by author

It appears that no Strategy profile has the properties we need to call it a Nash equilibrium, because for each possible outcome, one player has incentive to unilaterally deviate.

And yet this is a finite game.

Doesn’t Nash’s existence theorem state that that we must be able to find an equilibrium here?

Mixed strategies:

The problem lies in the way I’ve defined strategies in my articles so far. I have assumed that players must choose an action "for sure", which is known as a pure strategy. Pure strategies are a convenient assumption for early introduction to game theory, but its an assumption that isn’t consistent with reality.

As we know, many things in life and data science are better described in probabilistic terms, and Game Theory is no exception. We need to begin to introduce probability to our Game Theory toolbox.

Players do not have to necessarily play pure strategies, and in fact they usually don’t. Players can randomize between their pure strategies with some probability distribution. Doing so is called playing a mixed strategy.

In the penalty kick game, what would a mixed strategy for the keeper look like? It would be a probability distribution that would specify with what probability the keeper plays "right" with probability p, which would then imply that the other action available ("left") is taken with probability 1-p. Similarly, for the kicker, a mixed strategy would be a probability (lets call it q) of playing "right" and a probability 1-q of playing "left".

Mixed strategies in the penalty kick game:

Redefining strategies in this broader way allows us to find the Nash equilibrium in the penalty kick game, and many other games.

Lets make the assumption that the probability distribution that best suits this scenario is a Bernoulli distribution. Under this assumption, the unique Nash equilibrium of the penalty kick game is (0.5R + 0.5L, 0.5R +0.5L), or in plain English, each player plays right and left with equal probability.

The intuition goes something like this:

  • if the keeper is more likely to jump right or more likely to jump left, then the kickers best response is to mismatch whatever the keeper is more likely to do: this way, they’ll score more than half the time.
  • Similarly, if the kicker is more likely to kick right or more likely to kick left, then the keepers best response is to match whatever the kicker is more likely to do: this way, they’ll make a save more than half the time.
  • But if your opponent is equally likely to choose right and left, the interaction will go your way exactly half the time no matter what you do. Since you are indifferent you may as well mix.

The following graph of the best response functions further illustrates this idea.

image by author
image by author

The effort monitoring game:

Photo by Wes Hicks on Unsplash
Photo by Wes Hicks on Unsplash

This is a strategic interaction between a worker and their employer.

  • The worker chooses to work hard or to shirk.
  • The employer simultaneously chooses whether to check on the worker or not.
  • The worker’s preferences are such that shirking is more pleasant than working, but not if they get caught doing so.
  • The employer’s preferences are such that they only want to check on the worker (and subsequently punish them) if they think the worker is shirking.
  • if instead, the employer thinks the worker is working, they prefer not to waste their time and antagonize their employee, and would, therefore, prefer not to check.

Following is a payoff matrix consistent with this story:

image by author
image by author

This game has no Nash equilibrium in pure strategies because, like the penalty kick game, both players would like their strategy to be "unexpected" by their counterpart. The worker only wants to work if they think they employer will check… but the employer only wants to check if they think the worker will shirk!

Let’s look for a Nash equilibrium in mixed strategies. If we call p the employers probability of checking, the worker is indifferent between working and not working if

If p > 1/3, the workers expected payoff of working is greater than their expected payoff of shirking, and if p <1/3, the workers expected payoff of shirking is greater than their expected payoff of working. When p=1/3, the worker is indifferent. Any strategy – including any mixed strategy – is the best response.

If we call q the workers probability of working, the worker is indifferent between shirking and working when:

  • If q>5/7, the employers expected payoff of not checking is greater than their expected payoff of checking,
  • if q>5/7, the employers expected payoff of checking is greater than their expected payoff of not doing so.
  • When q>5/7, the employers expected payoff of checking is greater than their expected payoff of not doing so.
  • When q=5/7, the employer is indifferent. Any strategy – including any mixed strategy – is the best response.

Following on from the above, the only set of mutual best responses is

Effort monitoring game: real-world implications

Take the original effort monitoring game and decrease the workers payoff in the outcome in which they shirk and are caught, then recalculate the mixed strategy equilibrium. You should notice that the workers mix doesn’t change but the employer now checks on them less and less frequently. This tells us that in games with this structure, a high probability of being caught and a high punishment when caught are substitutes.

In many real world situations, particularly in a societal context rather than a workplace, the traditional view is that increasing the probability of being caught can be more costly than increasing the punishment of being caught. This is touted as one of the reasons for rising increases in incarceration rates in some nations.

Increasing the former means increasing the manpower devoted to monitoring and policing (which can be described as a cost function), while increasing punishment has a different (traditionally lower) cost function. As a consequence, it is the conventional view that one can often achieve the same compliance rate by decreasing monitoring but increasing punishment.

This view is changing as the cost of monitoring decreases, while the cost of punishment in the form of incarceration increases. The introduction of machine learning, computer vision and wide-spread public surveillance are some of the technologies introduced to "predictive policing" that flip the cost benefit balance.

The Frenemy Game:

My daughter tells me that "Frenemies" are a thing. I confess that I don’t really understand, but whatever. As a game, it gives us 7 key insights into mixed Nash equilibria.

This story is a strategic interaction between you and your "frenemy". You are choosing between going to a party or going to the pub (nb: in Australia, the pub is like the local bar).

  • All else equal, you’d prefer the party,
  • But you’d also like to meet your frenemy, so if they go to the pub, you end up being indifferent between the party and the pub.
  • Your frenemy has the choice between party, pub or staying at home.
  • Staying at home is the worst possible option.
  • Your frenemy has no particular preference between party and pub – but your frenemy would like to avoid you, so prefers to go to the locale they believe you are not going to.
  • Following is a payoff matrix consistent with this story:
image by author
image by author

Using our usual techniques, we can find one Nash equilibrium in pure strategies in this game (Party, Pub). But this solution is not satisfying, as it fails to capture the uncertainty created by uncertainty. Specifically, when the optimal decision for each player is conditional on the decision of another player and vice versa, an element of uncertainty is introduced into the strategic interaction that demands we introduce probability to our game environment. That in turn demands mixed strategies.

Are there also any Nash equilibria in mixed strategies? Lets suppose you play a mix: "Party" with probability and "Pub" with probability p. If you use the technique we’ve followed so far, you’d next figure out what value of makes your frenemy indifferent between their three pure strategies.

If you did this correctly, you got a number for p that is not between 0 and 1 and therefore doesn’t make sense as a probability.

Whats going on?

The mathematics is showing you that there is no mix that you can play that will make your frenemy indifferent between all three of their pure strategies.

Why not?

Because "Home" is strictly dominated by "Party" & "Pub".

Just like players do not play strictly dominated strategies in a pure Nash equilibrium, they do not play strictly dominated dominated strategies in a mixed strategy Nash equilibrium.

That means we can remove the strategy "Home" from consideration when searching for ANY Nash equilibrium

Finding the set of Nash Equilibria:

Your frenemy potentially be made indifferent between their remaining strategies, "Party" and "Pub", and could, therefore, be induced to choose a mixed strategy themselves.

But suppose your frenemy mixes, i.e. chooses both party & pub with positive probability. What is your best response? (nb: the Greek Pi symbol is often used in economics to denote Economic Profit or payoff. Pi has this meaning in this article)

If we call your frenemy’s probability p of playing party:

  • your expected payoff of Party is:
  • your expected payoff of Pub is:

Therefore, your expected payoff of party is greater than your expected payoff of "pub" as long as p>0.

In other words, if your frenemy has any chance of going to the Party, you will prefer to go to the Party. This is because "Party" weakly dominates "Pub" for you. You’re only indifferent between these two strategies when your frenemy goes to the Pub for sure. Otherwise you’d prefer "Party".

So as long as your frenemy plays "Party" with p>0, you should go to the party. But then, what would your frenemy – who is trying to avoid you – do?

They would go to the pub!

So we have a contradiction. If your frenemy goes to the party with any positive probability, then you will go to the party for sure, which would make them prefer to go to the pub.

From this we can draw the following conclusion: your frenemy plays "Pub" for sure (with probability 1) in all Nash equilibria.

Seems like a conundrum doesn’t it?

Does this mean there are no equilibria in mixed strategies in this game?

Not necessarily!

It is possible to have a Nash equilibrium in which some players play mixed strategies, while others play pure strategies.

  • Your frenemy will play a pure strategy in all Nash equilibria.
  • That pure strategy (pub) makes you indifferent, which means that any strategy you choose is best response.
  • Since in a Nash equilibrium, we need a set of mutual best response with "Pub".

If we call your probability p of choosing party:

That is, as long as you are not more likely to play "pub" than "party", your frenemy’s best response is "pub".

In summary, the set of Nash equilibria of this game can be described as:

  • Your frenemy plays "Pub", if you play "Party" with any probability greater than or equal to ½, ie p>½.
  • Recall the wording "All else equal, you’d prefer the party". Therefore, it seems reasonable to assume that p>½.
  • Notice that this implies that there are infinite Nash equilibria in this game.
  • Notice also that the first Nash equilibrium we found – the one in pure strategies – belongs to this set.

In Summary – Mixed Strategies:

The frenemy game has begun to introduce probability and uncertainty to the strategic environment. We will build on this in future articles.

It has also illustrated 7 important facts about mixed strategy equilibria:

  • Nash equilibria in mixed strategies are still Nash equilibria – they must satisfy the same requirements as Nash equilibria in pure strategies.
  • Each player’s strategy is a best response to all other players strategies.
  • There is no incentive to deviate for any player.
  • A mixed strategy doesn’t have to assign positive probability to all pure strategies
  • A strictly dominated strategy will never be played in any Nash equilibrium, including one withmixed strategies.
  • Not all Players necessarily mix in a mixed strategy Nash equilibrium – some could be playing pure strategies while others are mixing.
  • It can be said that all Nash equilibria are Nash equilibria in mixed strategies – pure strategy Nash equilibria are just a special case in which all players assign probability 1 to one strategy and 0 to all others

Related Articles