
Like many early 2000s kids, I enjoyed playing Pokemon growing up. Catching those animal-like creatures and training them to battle other Pokemon formed some of my fondest childhood memories. With Brilliant Diamond and Shining Pearl, the remakes of the classic Pokemon Diamond and Pearl video games, coming out next month, I thought it would be fun to use my data science knowledge to determine the best team of Pokemon to use in these games.
Primer
First, a bit of background for those not familiar with the Pokemon franchise. Pokemon are animal-like creatures that can be captured and trained to battle against other Pokemon. At the start of each game, you’re given a choice between three initial Pokemon, known as starter Pokemon. In the Diamond and Pearl games, these are Turtwig, a grass-type turtle Pokemon; Chimchar, a fire-type chimp Pokemon; and Piplup, a water-type penguin Pokemon. They and many other Pokemon can undergo a metamorphosis called evolution up to two times, allowing them to get stronger and in some cases gain an additional typing.

The objective of most games is to train a team of up to six Pokemon and become the strongest trainer in the region. To achieve this goal, you must battle a series of eight bosses called gym leaders, who each specialize in a specific element, or type, of Pokemon. For example, Pikachu, the Pokemon mascot, is an electric-type Pokemon. This means it has a natural advantage against water-type Pokemon, but a weakness to ground types. To gain the upper hand, you want to use Pokemon that are strong against the type you’re facing (think of it as an advanced version of rock, paper, scissors). After you defeat all eight gym leaders, you can face off against the Elite Four at the Pokemon League. These are considered some of the strongest trainers in the land and, like gym leaders, each specializes in a specific type of Pokemon. Following the Elite Four is the final boss fight against the Pokemon Champion, who in Diamond and Pearl is Cynthia. She uses a myriad of different Pokemon types and is regarded as one of the hardest boss fights in the entire series.

There are 18 types and you can use up to 6 Pokemon at a time. Given that, what is the best team of Pokemon you can assemble to take on the 12 trainers you face before Cynthia?
Dataset
I will be using two datasets, both from Kaggle. One is a listing of every Pokemon as of 2013 (which is completely fine for our purposes as the original Diamond and Pearl games were released in 2006), and another is a matrix of the different type matchups. We will read in our data using pandas in Python:
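A minimal sketch of this step follows. It assumes the Pokemon listing uses column names such as #, Name, Type 1, Type 2, Total, Generation, and Legendary, and that the type chart stores the attacking type in its first column. These layouts, the variable name type_chart, and the tiny in-memory samples are assumptions used only so the example runs without the Kaggle downloads. In practice you would pass the paths of the downloaded CSV files to pd.read_csv instead.

```python
import io

import pandas as pd

# In-memory stand-ins for the two Kaggle CSV files (assumed layout, abbreviated columns).
pokemon_csv = io.StringIO(
    "#,Name,Type 1,Type 2,Total,Generation,Legendary\n"
    "1,Bulbasaur,Grass,Poison,318,1,False\n"
    "25,Pikachu,Electric,,320,1,False\n"
    "150,Mewtwo,Psychic,,680,1,True\n"
)
chart_csv = io.StringIO(
    "Attacking,Water,Ground\n"
    "Electric,2,0\n"
    "Grass,2,2\n"
)

# Read the Pokemon listing; an empty secondary type becomes NaN.
poke_df = pd.read_csv(pokemon_csv)

# Read the type chart, using the first column (attacking type) as the row index.
type_chart = pd.read_csv(chart_csv, index_col=0)
```

Using the attacking type as the index lets later steps look up a matchup by label, for example type_chart.loc["Electric", "Water"].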
Let’s take a look at the first few entries:
poke_df.head()

Each entry gives a Pokemon’s name, its number in the national index of Pokemon (known as the Pokedex in the games), its primary type (type 1) and any secondary typing (type 2, where NaN is used if there’s no secondary type), its total stats (Total) and the individual breakdown of each stat (these are metrics that indicate how strong the Pokemon is), the generation it was introduced in, and whether it’s a legendary Pokemon.

For those not familiar with the nomenclature, Generation refers to the set of games in which a Pokemon made its debut. For example, Pikachu first appeared in the original Pokemon games: Pokemon Red, Blue, and Yellow. Hence, it is considered a generation 1 Pokemon. Pokemon Diamond and Pearl (and an updated version, Platinum) represent the fourth set of games released in the franchise, so any Pokemon introduced therein is considered a generation 4 Pokemon. The current cycle of games is generation 8, so the upcoming remakes of Diamond and Pearl will be a part of that generation (for the duration of this article, I will use the abbreviation DPPt when referring to the original games, and BDSP for the remakes).

The other column, Legendary, marks rare and very powerful Pokemon with in-game mythology that are typically caught towards the end of a game or after the main storyline has been completed. These Pokemon generally have higher stats, conferring an unfair advantage on any trainer who wields them. For the sake of balance, we will exclude these Pokemon from our analysis. You may have also noticed that the Pokemon Venusaur has a secondary entry, Mega Venusaur. This is part of a game mechanic called Mega Evolution that was introduced in the generation 6 games. It’s not present in BDSP, so those entries will also be ignored.
We want to keep only the Pokemon that exist in the DPPt Pokedex (and that I assume will still be present in the remakes) during the main storyline. This includes the 107 Pokemon introduced in generation 4, as well as 110 other Pokemon from the first three generations, for a total of 217.
We’ll first reformat some of the column names to make them easier to index, such as renaming the # column to the common abbreviation for number, No.
poke_df = poke_df.rename(columns={"#": "No"})
poke_df.head()

Next, we will store the indices of Pokemon introduced in generation 4, which are numbered 387 through 493, in a list:
# Note: Sinnoh is the fictional region these games take place in
sinnoh = list(range(387,494))
We will similarly store the indices of Pokemon from previous generations that are available in the game. Unfortunately, there was no easy way to extract the indices of those Pokemon from the dataset or any other datasets I looked through, so I manually recorded the individual indices from a reputable database of Pokemon in the Sinnoh Pokedex:
sinnoh_expat = [63, 64, 65, 129, 130, 315, 41, 42, 169, 74, 75, 76,
95, 208, 66, 67, 68, 54, 55, 265, 266, 267, 268, 269,
214, 190, 92, 93, 94, 200, 198, 118, 119, 339, 340,
358, 307, 308, 77, 78, 185, 122, 113, 242, 173, 35, 36,
172, 25, 26, 163, 164, 143, 201, 194, 195, 278, 279, 203,
298, 183, 184, 223, 224, 72, 73, 349, 350, 226, 215, 207,
299, 280, 281, 282, 108, 133, 134, 135, 136, 196, 197,
333, 334, 175, 176, 228, 229, 81, 82, 114, 193, 357, 111,
112, 355, 356, 137, 233, 123, 212, 239, 125, 240, 126,
220, 221, 361, 362, 359]
Quickly confirming the expected number of Pokemon with len(sinnoh_expat), which comes out to 110, we concatenate these lists and filter our Pokedex accordingly:
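One way this step could look is sketched below. The toy poke_df and the shortened lists are stand-ins so the example runs on its own, and the name sinnoh_ids is an assumption. Only sinnoh, sinnoh_expat, poke_df, and sinnoh_df come from the article.

```python
import pandas as pd

# Toy stand-in for the full listing; two rows share No 492 to mimic alternate forms.
poke_df = pd.DataFrame({
    "No": [25, 150, 387, 492, 492],
    "Name": ["Pikachu", "Mewtwo", "Turtwig", "Shaymin Land Forme", "Shaymin Sky Forme"],
})
sinnoh = [387, 492]    # shortened stand-in for list(range(387, 494))
sinnoh_expat = [25]    # shortened stand-in for the 110 earlier-generation numbers

# Concatenate both lists, then keep only the rows whose number appears in the result.
sinnoh_ids = sinnoh + sinnoh_expat
sinnoh_df = poke_df[poke_df["No"].isin(sinnoh_ids)]
print(sinnoh_df)
```

Because isin keeps every matching row, alternate forms that share a Pokedex number survive this filter and still need to be dealt with afterwards.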

We see that we have some duplicates present (e.g., entry № 492, Shaymin), which we can remove using the drop_duplicates() method (this also removes any additional forms such as Mega Evolutions):
sinnoh_dex = sinnoh_df.drop_duplicates(subset=['No'])
sinnoh_dex

Awesome! We’ve got all 217 Pokemon from the Sinnoh Pokedex. Our data is almost ready for analysis. Now we just need to remove all the legendary Pokemon. Thankfully, since we have a boolean variable for that category, it’s a very simple filtering procedure.
sinnoh_dex = sinnoh_dex[sinnoh_dex.Legendary == False]
sinnoh_dex

Now that our data is ready for analysis, we need to prepare our type chart matrix. This is the other dataset I downloaded that shows the match-up of different types:

To walk you through this chart, the rows represent the type of the Pokemon we’re using, and the columns represent the type on the opposing team. A 2 indicates that a Pokemon of the type in the row has an advantage against an opposing Pokemon of the type in the column. For example, suppose Pikachu, an electric type, launches a thunderbolt against a Pokemon. If it attacks a water-type Pokemon with that move, it’s super effective, so we quantify that effect with a 2. By contrast, if it attacks a dragon-type Pokemon, the move is not very effective, so the baseline effect is halved and we have a 0.5 in that entry of the matrix. Moves that do neutral damage have a 1 (e.g., Pikachu attacking a fire-type Pokemon). One thing to point out for electric types (which applies to a few others as well) is the 0 under the ground-type column. This means that electric-type moves have no effect whatsoever against ground-type Pokemon; the latter are immune to them.
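To make the chart values concrete, here is a small sketch that encodes only the four electric-type matchups described above and translates each multiplier into its in-game wording. The helper describe and the capitalized labels are illustrative assumptions, not part of the dataset.

```python
import pandas as pd

# The Electric row of the chart, restricted to the four matchups discussed in the text.
type_chart = pd.DataFrame(
    [[2.0, 0.5, 1.0, 0.0]],
    index=["Electric"],
    columns=["Water", "Dragon", "Fire", "Ground"],
)

def describe(multiplier):
    """Translate a chart entry into the in-game wording."""
    if multiplier == 0:
        return "no effect"
    if multiplier < 1:
        return "not very effective"
    if multiplier > 1:
        return "super effective"
    return "neutral"

for defender in type_chart.columns:
    value = type_chart.loc["Electric", defender]
    print(f"Electric vs {defender}: {value} ({describe(value)})")
```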
Regarding the gym leaders and Elite Four, there are 12 types we want to select against: rock, grass, fighting, water, ghost, steel, ice, electric, bug, ground, fire, and psychic, so we will filter the columns of our type chart accordingly:
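A possible version of this filter is sketched below. It assumes the chart’s column labels are capitalized type names and that the chart is held in a DataFrame called type_chart. The names gym_types and gym_chart are assumptions too. The toy chart is filled with placeholder 1s apart from two entries mentioned earlier, so it should not be read as real matchup data.

```python
import pandas as pd

# The 12 types used by the gym leaders and the Elite Four.
gym_types = ["Rock", "Grass", "Fighting", "Water", "Ghost", "Steel",
             "Ice", "Electric", "Bug", "Ground", "Fire", "Psychic"]

# Toy chart with two attacking types and two extra columns that should be dropped.
# All values are placeholder 1s except the two Electric entries set below.
all_columns = gym_types + ["Dragon", "Normal"]
type_chart = pd.DataFrame(1.0, index=["Electric", "Water"], columns=all_columns)
type_chart.loc["Electric", "Water"] = 2.0
type_chart.loc["Electric", "Ground"] = 0.0

# Keep only the columns for the opposing types we care about.
gym_chart = type_chart[gym_types]
print(gym_chart.shape)
```

Selecting with a list of labels also fixes the column order, which matters later when the columns are matched against the index j of the opposing types.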

Revisiting our objective, we want a team of 6 Pokemon whose types are strong against the gym leaders and the Elite Four. We will also impose the constraint that we must have exactly one starter Pokemon on our team (the ones you don’t choose are unobtainable in the main storyline). Finding the best team under all these constraints is an optimization problem: we have an objective function that we want to optimize, subject to restrictions on the decision variables. In this case, we want to maximize the expression

$$\sum_{i} T_i P_i$$

under the following constraints:

$$
\begin{aligned}
\sum_{i} P_i &= 6 \\
\sum_{i \in S} P_i &= 1 \\
\sum_{i} A_{ij} P_i &\geq 1 \quad \text{for each opposing type } j \\
P_i &\in \{0, 1\}
\end{aligned}
$$

where T_i is the total stats (column Total in the first dataset) of Pokemon i, P_i is a binary decision variable indicating whether that Pokemon is on my team, S is the set of starter Pokemon (including their evolutions), and A is a binary matrix indicating whether Pokemon i has an advantage against an opposing Pokemon of type j (1 if yes, 0 otherwise).
To solve this optimization problem, we will use the pulp package, which is an intuitive library for constructing and solving these mathematical models using linear programming. It can be installed via pip install pulp or conda install -c conda-forge pulp.
First, we will create our vectors and matrices to store the above variables.
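The sketch below shows one way to build these objects from the two tables. It rests on an assumption that is not stated in the text: a Pokemon counts as having an advantage against an opposing type if at least one of its own types is super effective (multiplier of 2) against it. The toy sinnoh_dex and gym_chart, and the names names, T, S, and starter_names, are stand-ins for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the filtered Pokedex and the filtered type chart.
sinnoh_dex = pd.DataFrame({
    "Name": ["Piplup", "Pikachu", "Gible"],
    "Type 1": ["Water", "Electric", "Dragon"],
    "Type 2": [np.nan, np.nan, "Ground"],
    "Total": [314, 320, 300],
})
gym_chart = pd.DataFrame(
    [[0.5, 2.0, 1.0],   # Water attacking
     [2.0, 1.0, 0.5],   # Electric attacking
     [1.0, 1.0, 1.0],   # Dragon attacking
     [1.0, 2.0, 2.0]],  # Ground attacking
    index=["Water", "Electric", "Dragon", "Ground"],
    columns=["Water", "Fire", "Electric"],
)

names = sinnoh_dex["Name"].tolist()

# T: total stats of each Pokemon.
T = sinnoh_dex["Total"].to_numpy()

# S: positions of the starter Pokemon and their evolutions.
starter_names = {"Turtwig", "Grotle", "Torterra", "Chimchar", "Monferno",
                 "Infernape", "Piplup", "Prinplup", "Empoleon"}
S = [i for i, name in enumerate(names) if name in starter_names]

# A[i, j] = 1 if any type of Pokemon i is super effective against opposing type j
# (this reading of "advantage" is an assumption).
A = np.zeros((len(names), gym_chart.shape[1]), dtype=int)
for i, (type1, type2) in enumerate(zip(sinnoh_dex["Type 1"], sinnoh_dex["Type 2"])):
    own_types = [t for t in (type1, type2) if pd.notna(t)]
    A[i] = (gym_chart.loc[own_types] >= 2).any(axis=0).to_numpy().astype(int)

print(A)
```

Here Gible picks up its advantages against Fire and Electric from its secondary Ground type, which is why both types are checked.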
Now, we can create our model.
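As a rough illustration of how such a model can be written with pulp, the sketch below uses a toy pool of five Pokemon, three opposing types, and a team size of 3 instead of 6. The names model, P, and TEAM_SIZE, as well as the toy advantage matrix, are assumptions for this example and not the article’s actual code.

```python
import pulp

# Toy data: total stats, starter positions, and a hand-written advantage matrix
# against the opposing types Rock, Water, and Steel (in that column order).
names = ["Empoleon", "Infernape", "Garchomp", "Electivire", "Bidoof"]
T = [530, 534, 600, 540, 250]
S = [0, 1]  # Empoleon and Infernape are starter evolutions
A = [
    [1, 0, 0],  # Empoleon
    [1, 0, 1],  # Infernape
    [1, 0, 1],  # Garchomp
    [0, 1, 0],  # Electivire
    [0, 0, 0],  # Bidoof
]
TEAM_SIZE = 3  # the full problem uses 6

model = pulp.LpProblem("best_team", pulp.LpMaximize)

# One binary decision variable per Pokemon: 1 if it is on the team.
P = [pulp.LpVariable(f"P_{i}", cat="Binary") for i in range(len(names))]

# Objective: maximize the summed total stats of the chosen team.
model += pulp.lpSum(T[i] * P[i] for i in range(len(names)))

# Constraints: fixed team size, exactly one starter, every opposing type covered.
model += pulp.lpSum(P) == TEAM_SIZE
model += pulp.lpSum(P[i] for i in S) == 1
for j in range(len(A[0])):
    model += pulp.lpSum(A[i][j] * P[i] for i in range(len(names))) >= 1

model.solve(pulp.PULP_CBC_CMD(msg=False))
status = pulp.LpStatus[model.status]
best_total = pulp.value(model.objective)
print(status, best_total)
```

In this toy instance Electivire is forced onto the team because it is the only Pokemon covering Water, which shows how the coverage constraints can override raw stat totals.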
We can now pull out the resulting six Pokemon:
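Reading the team off a solved pulp model can be sketched as follows. The toy model (three Pokemon, team of two, no type constraints) and the variable names are assumptions that keep the example self-contained.

```python
import pulp

names = ["Infernape", "Garchomp", "Bidoof"]
T = [534, 600, 250]

# A deliberately tiny model: pick the two Pokemon with the highest total stats.
model = pulp.LpProblem("toy_team", pulp.LpMaximize)
P = [pulp.LpVariable(f"P_{i}", cat="Binary") for i in range(len(names))]
model += pulp.lpSum(t * p for t, p in zip(T, P))
model += pulp.lpSum(P) == 2
model.solve(pulp.PULP_CBC_CMD(msg=False))

# Solver values come back as floats, so compare against 0.5 instead of testing == 1.
team = [name for name, p in zip(names, P)
        if p.value() is not None and p.value() > 0.5]
for name in team:
    print(name)
```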
Infernape
Garchomp
Electivire
Dusknoir
Cresselia
Manaphy
So our optimal team is Infernape (fire/fighting), Garchomp (dragon/ground), Electivire (electric), Dusknoir (ghost), Cresselia (psychic), and Manaphy (water), a team with diverse typings. However, the keen Pokemon enthusiast may notice that Cresselia is actually a legendary Pokemon, and Manaphy is a mythical Pokemon. The latter category is essentially the same as a legendary in terms of stats and in-game lore, but mythical Pokemon have historically been more difficult to obtain (they are typically offered through exclusive events). Both made it past our earlier filter on the Legendary column, so evidently the dataset does not flag them as legendary. Needless to say, with their higher stats, they give us an unfair advantage, but we can resolve this by adding another constraint that restricts their usage:

$$P_C + P_M = 0$$

where C and M refer to Cresselia and Manaphy, respectively. We can then modify our code to include this restriction as follows:
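A sketch of such a restriction is shown below, assuming it takes the form of forcing the decision variables of the banned Pokemon to sum to zero. The function best_team, its banned parameter, and the toy pool with a team size of 3 are assumptions for illustration.

```python
import pulp

names = ["Infernape", "Cresselia", "Manaphy", "Gyarados", "Umbreon"]
T = [534, 600, 600, 540, 525]

def best_team(banned=(), team_size=3):
    """Pick the team with the highest total stats, excluding any banned Pokemon."""
    model = pulp.LpProblem("team_without_banned", pulp.LpMaximize)
    P = [pulp.LpVariable(f"P_{i}", cat="Binary") for i in range(len(names))]
    model += pulp.lpSum(t * p for t, p in zip(T, P))
    model += pulp.lpSum(P) == team_size
    if banned:
        # The variables of all banned Pokemon must sum to zero, so none can be picked.
        model += pulp.lpSum(P[names.index(name)] for name in banned) == 0
    model.solve(pulp.PULP_CBC_CMD(msg=False))
    return [n for n, p in zip(names, P) if p.value() is not None and p.value() > 0.5]

print(best_team())                                  # Cresselia and Manaphy are picked
print(best_team(banned=("Cresselia", "Manaphy")))   # both are now excluded
```

Because the variables are binary, a single equality on their sum is enough to exclude every banned Pokemon at once.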
Gyarados
Umbreon
Infernape
Garchomp
Electivire
Togekiss
Now we have a more balanced team in terms of both stats and types with Gyarados (water/flying), Umbreon (dark), Infernape (fire/fighting), Garchomp (dragon/ground), Electivire (electric), and Togekiss (fairy/flying).

One Pokemon common to both teams is Infernape, the evolved form of Chimchar, one of the starter Pokemon you can choose from. Suppose, however, that you have an affinity for the penguin Pokemon, Piplup, and want to build a team around its evolved form, Empoleon. How would you adjust the algorithm?
We can restrict the set of starter Pokemon to just include the Piplup family like so:

$$\sum_{i \in \text{Penguin}} P_i = 1$$

where Penguin is the set of Piplup and its evolutionary family (Prinplup and Empoleon). We can code this as follows:
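One way to express this in pulp is sketched below. It requires exactly one team member from the chosen starter family and, as an added assumption based on the other starters being unobtainable, none from the other two families. The function best_team, the families dictionary, and the toy pool with a team size of 3 are hypothetical.

```python
import pulp

names = ["Infernape", "Empoleon", "Torterra", "Garchomp", "Snorlax"]
T = [534, 530, 525, 600, 540]
families = {
    "Turtwig": {"Turtwig", "Grotle", "Torterra"},
    "Chimchar": {"Chimchar", "Monferno", "Infernape"},
    "Piplup": {"Piplup", "Prinplup", "Empoleon"},
}

def best_team(starter, team_size=3):
    """Best team by total stats, built around the given starter family."""
    model = pulp.LpProblem("starter_team", pulp.LpMaximize)
    P = [pulp.LpVariable(f"P_{i}", cat="Binary") for i in range(len(names))]
    model += pulp.lpSum(t * p for t, p in zip(T, P))
    model += pulp.lpSum(P) == team_size
    for family, members in families.items():
        family_vars = [P[i] for i, name in enumerate(names) if name in members]
        # Exactly one member of the chosen family, none from the other families.
        model += pulp.lpSum(family_vars) == (1 if family == starter else 0)
    model.solve(pulp.PULP_CBC_CMD(msg=False))
    return [n for n, p in zip(names, P) if p.value() is not None and p.value() > 0.5]

print(best_team("Piplup"))
print(best_team("Turtwig"))
```

Passing a different family name is all it takes to rebuild the team around another starter.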
Snorlax
Empoleon
Garchomp
Electivire
Togekiss
Dusknoir

We can similarly adjust this for a team around Turtwig, the grass starter:
Snorlax
Torterra
Garchomp
Magmortar
Togekiss
Dusknoir

Conclusion
In this article, I showed you how you can use linear programming to determine the optimal team to use in Pokemon Brilliant Diamond and Shining Pearl (assuming there are no deviations from the original Pokedex). The model can be customized further: for example, you can require additional Pokemon you may want to use, like Pikachu, or optimize for certain stats such as attack and speed instead of total stats. You can also adapt this to another Pokemon game, such as the most recent main series games, Pokemon Sword and Shield. Happy coding!
References:
[1] https://www.serebii.net/pokedex-dp/
[2] https://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_Sinnoh_Pok%C3%A9dex_number
[3] https://hookedondata.org/pokemon-type-combinations/