
Continuing from Part 1 of this series, today we will talk about repeated games and how repetition can change the outcome of a game, in particular the Prisoner’s Dilemma discussed in Part 1.
This will be the agenda for today:
- Refresher on the Prisoner’s Dilemma game and its strategies
- Prisoner’s Dilemma: when the game is played a finite number of times
- Prisoner’s Dilemma: when the game is played an infinite number of times
- Payoff matrix in the two cases
- Game visualization using sparklines
Let us take the Prisoner’s Dilemma game. As a refresher, Table 1 shows the pay-off matrix. We are using the default pay-off matrix from the Axelrod package:

In Part 1, each prisoner had two strategies (Confess, Deny), and the Nash equilibrium of the game was (Confess, Confess). This is also the dominant strategy equilibrium.
Note that ‘Confess’ is the defecting strategy, while ‘Deny’ means cooperating with the other prisoner.
In the previous case, the players met only once and the game was played a single time.
Now, when the game is played multiple times, a range of other strategic possibilities opens up for each player. Which strategies are available depends on whether the game is played a finite or an infinite number of times. Let us consider both cases one by one.
Case 1: Game is played for a finite number of times (say 10)
Consider the last round, i.e. round 10. This is the last time the game will be played, so there is no point in cooperating, and each player will defect, i.e. Confess.
Now consider round 9. We just saw that both players will defect in round 10, so why should they cooperate in round 9? If, say, Player A cooperates, then Player B can exploit Player A’s good nature by defecting and get off scot-free. Each player reasons this way, so each defects, i.e. Confesses. The same logic applies to every earlier round.
Hence, if the game is played a fixed, finite number of times, each player will defect in every round. Players cooperate only because they expect this to induce cooperation in the future; if there is no chance of future play, there is no reason to cooperate now.
The outcome is therefore (Confess, Confess) in every round, just as when the game was played only once.
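The backward-induction argument above can be checked with a small sketch in plain Python. It assumes the Axelrod default payoffs (R=3, P=1, S=0, T=5) quoted later in this post; the names `PAYOFF`, `best_reply`, `dominant_move` and `backward_induction` are made up for this illustration and are not part of any library.

```python
# One-round payoff to "me" for (my_move, other_move), assuming the Axelrod
# defaults R=3, P=1, S=0, T=5. "C" = Deny (cooperate), "D" = Confess (defect).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def best_reply(other_move):
    """Return the move with the highest one-round payoff against other_move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, other_move)])


def dominant_move():
    """Return the move that is a best reply to everything, or None."""
    replies = {best_reply(other_move) for other_move in "CD"}
    return replies.pop() if len(replies) == 1 else None


def backward_induction(rounds):
    """Work backwards from the last round of a finitely repeated game.

    Play in later rounds is already settled and does not depend on what
    happens now, so every round reduces to the one-shot game.
    """
    plan = []
    for _ in range(rounds, 0, -1):  # last round first
        move = dominant_move()
        plan.insert(0, (move, move))
    return plan


print(backward_induction(10))
```

Because defecting is the best reply to both possible moves of the other player, the plan contains (D, D) for all 10 rounds.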
Let us illustrate this in Python now. We will use the Axelrod package for repeated games.
Note that in the Axelrod package the action ‘C’ stands for cooperation and ‘D’ for defection.
import axelrod as axl
# Both players always defect, as the reasoning above predicts
players = (axl.Defector(), axl.Defector())
# Play a match of 10 rounds
match1 = axl.Match(players, turns=10)
match1.play()
Output will be as follows:
[(D, D),
(D, D),
(D, D),
(D, D),
(D, D),
(D, D),
(D, D),
(D, D),
(D, D),
(D, D)]
As you can see, with both players using the Defector strategy, they defect in every round of the finite game.
Case 2: Game is played for an infinite number of times
The strategies change when the game is played an infinite number of times. If Player A refuses to cooperate in one round, Player B can refuse to cooperate in the next. This threat of retaliation pushes the players towards cooperation, i.e. towards playing the Pareto efficient strategy Deny.
Robert Axelrod demonstrated this in a series of experiments in which different strategies were pitted against each other in a tournament.
The winning strategy turned out to be "Tit-for-Tat", which earned the highest overall payoff in the tournament.
In the first round, Player A cooperates and plays "Deny". In every round after that, Player A cooperates if Player B cooperated in the previous round and defects if Player B defected in the previous round. In other words, under Tit-for-Tat Player A does whatever Player B did in the previous round.
This strategy works because it punishes defection immediately and rewards cooperation immediately. Over time this can lead to the Pareto efficient outcome (Deny, Deny).
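To make the rule and the threat behind it concrete, here is a minimal sketch in plain Python. It assumes the Axelrod default payoffs (R=3, P=1, S=0, T=5) quoted later in this post, and the function names `tit_for_tat` and `total_against_tit_for_tat` are hypothetical helpers written for this post, not Axelrod API calls.

```python
# One-round payoff to "me" for (my_move, other_move), assuming the Axelrod
# defaults R=3, P=1, S=0, T=5. "C" = Deny (cooperate), "D" = Confess (defect).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_history):
    """Cooperate in the first round, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def total_against_tit_for_tat(my_moves):
    """Total payoff of a fixed sequence of moves played against Tit-for-Tat."""
    history, total = [], 0
    for my_move in my_moves:
        tft_move = tit_for_tat(history)  # reacts to my earlier moves only
        total += PAYOFF[(my_move, tft_move)]
        history.append(my_move)
    return total


rounds = 20
always_cooperate = total_against_tit_for_tat("C" * rounds)
always_defect = total_against_tit_for_tat("D" * rounds)
print(always_cooperate, always_defect)  # 60 24
```

Over 20 rounds, steady cooperation earns 60, while steady defection earns only 24: one temptation payoff of 5 followed by 19 punishment payoffs of 1. That gap is the "threat of non-cooperation" at work.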
Let us illustrate this in Python now:
For the purpose of illustration, let us play the game 20 times. Player A plays the ‘Tit-for-Tat’ strategy and Player B plays a random strategy.
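# Player A copies Player B's previous move; Player B picks C or D at random
players = (axl.TitForTat(), axl.Random())
match2 = axl.Match(players, turns=20)
match2.play()
Let us now analyze the output given below (Player B moves at random, so your own run will look different):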
[(C, D),
(D, D),
(D, D),
(D, D),
(D, C),
(C, C),
(C, D),
(D, D),
(D, C),
(C, C),
(C, D),
(D, C),
(C, D),
(D, C),
(C, C),
(C, D),
(D, C),
(C, C),
(C, C),
(C, C)]
As you can see, in the first round Player A cooperates and plays ‘Deny’, but in round 2 Player A defects and plays ‘Confess’ because Player B defected in round 1. In the subsequent rounds, Player A rewards Player B by playing ‘Deny’ if B denied in the previous round, and punishes Player B by playing ‘Confess’ if B confessed in the previous round.
In this run, both players cooperate from the 18th round onward and play (Deny, Deny), the Pareto efficient outcome. Keep in mind that Player B is choosing at random here, so this streak is down to chance; Tit-for-Tat will keep cooperating only for as long as Player B does.
Payoffs and Game Visualization
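Now we will look at the pay-offs of the game and also visualize the game using sparklines.
The default payoffs in the Axelrod library are as follows (rows are Player A’s action, columns are Player B’s action; each cell lists Player A’s payoff first):
        C         D
C    (R, R)    (S, T)
D    (T, S)    (P, P)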
where
- R: the Reward payoff (default value in the library: 3)
- P: the Punishment payoff (default value in the library: 1)
- S: the Sucker/Loss payoff (default value in the library: 0)
- T: the Temptation payoff (default value in the library: 5)
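This translates to the following table (each cell lists Player A’s payoff, then Player B’s):
                         Player B: C (Deny)    Player B: D (Confess)
Player A: C (Deny)       (3, 3)                (0, 5)
Player A: D (Confess)    (5, 0)                (1, 1)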
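The four default values listed above can also be written as a small lookup in plain Python. This is only a sketch: `PAYOFFS` and `score_round` are names invented for this post, not part of the Axelrod library.

```python
R, P, S, T = 3, 1, 0, 5  # Reward, Punishment, Sucker, Temptation

# (Player A's move, Player B's move) -> (Player A's payoff, Player B's payoff)
PAYOFFS = {
    ("C", "C"): (R, R),  # both deny
    ("C", "D"): (S, T),  # A denies, B confesses
    ("D", "C"): (T, S),  # A confesses, B denies
    ("D", "D"): (P, P),  # both confess
}


def score_round(move_a, move_b):
    """Return the pair of payoffs for one round."""
    return PAYOFFS[(move_a, move_b)]


# The ordering that makes defection tempting but mutually harmful
assert T > R > P > S

print(score_round("C", "D"))  # (0, 5)
print(score_round("D", "D"))  # (1, 1)
```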
Case 1 Payoff:
Let us calculate the payoff for Case 1:
# Show the game (and its payoff values) used by the match
match1.game
# Per-round scores of the match
match1.scores()
The output will be:
[(1, 1),
(1, 1),
(1, 1),
(1, 1),
(1, 1),
(1, 1),
(1, 1),
(1, 1),
(1, 1),
(1, 1)]
Since both players always defect, the payoff in every round is (1, 1), i.e. 1 to each player.
The result of the match can also be viewed as sparklines, where cooperation is shown as a solid block and defection as a space.
The first row is for Player A and the second row is for Player B.
print(match1.sparklines())
In this case the output is blank (only spaces) because the players always defect. Try it out in Python.
Case 2 Payoff:
Similarly the payoff for Case 2 is given by:
match2.scores()
Output displayed will be:
[(0, 5),
(1, 1),
(1, 1),
(1, 1),
(5, 0),
(3, 3),
(0, 5),
(1, 1),
(5, 0),
(3, 3),
(0, 5),
(5, 0),
(0, 5),
(5, 0),
(3, 3),
(0, 5),
(5, 0),
(3, 3),
(3, 3),
(3, 3)]
You can see the payoff settling at (3, 3) in the last three rounds, where both players cooperate.
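To see how the whole match went rather than just the final rounds, the per-round scores listed above can be added up. The sketch below is plain Python and simply copies the numbers from the output; the variable names are illustrative.

```python
# Per-round scores copied from the match2 output above: (Player A, Player B)
scores = [
    (0, 5), (1, 1), (1, 1), (1, 1), (5, 0),
    (3, 3), (0, 5), (1, 1), (5, 0), (3, 3),
    (0, 5), (5, 0), (0, 5), (5, 0), (3, 3),
    (0, 5), (5, 0), (3, 3), (3, 3), (3, 3),
]

total_a = sum(a for a, _ in scores)
total_b = sum(b for _, b in scores)
print(total_a, total_b)  # 47 47

# Average payoff per round, to compare with 3 for steady mutual cooperation
print(total_a / len(scores), total_b / len(scores))
```

In this run both players finish on 47 points, which is less than the 60 each would have earned from 20 rounds of mutual cooperation.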
Now we can visualize this output using Sparklines as earlier.
print(match2.sparklines())
█    ██  ██ █ ██ ███
    ██  ██ █ ██ ████
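If you want to see how such a sparkline is built, here is a minimal sketch in plain Python. The `sparkline` function and its parameter names are written for this post (it is not the library's implementation), and the move strings are copied from the match2 output above.

```python
def sparkline(moves, c_symbol="█", d_symbol=" "):
    """Render a sequence of 'C'/'D' moves as one sparkline row."""
    return "".join(c_symbol if move == "C" else d_symbol for move in moves)


# Moves copied from the match2 output above, one character per round
player_a = "CDDDDCCDDCCDCDCCDCCC"
player_b = "DDDDCCDDCCDCDCCDCCCC"

print(sparkline(player_a))
print(sparkline(player_b))

# A visible symbol for defection can make the rows easier to line up
print(sparkline(player_a, d_symbol="."))
```

Each row is 20 characters wide, one per round, so a block in the same column of both rows means both players cooperated in that round.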
Summary:
Today we discussed how the outcome of the Prisoner’s Dilemma changes when the game is repeated, and how the outcome differs between a finite and an infinite number of rounds.
The full code can be accessed here on GitHub.
I can be reached on Medium, LinkedIn or Twitter.
P.S.: Look at the output below and let me know in the comments what kind of strategy was followed here.
███████████████
█ █ █ █ █ █ █ █