
A Genetic Algorithm to build the best NFL Roster

In this story I explain how Genetic Algorithms work and show my implementation to build the best NFL roster!

It is no secret that analytics is playing an increasingly large role in sports as time goes by. I, as both a data scientist and football head coach, am deeply interested in the intersection between these areas. In this study I present a Genetic Algorithm developed from scratch in Python to answer the following question: Respecting the salary cap, what is the best possible 53-man NFL roster that can be built? Even if you’re not really into sports or don’t understand the NFL deeply enough to interpret the results, the introduction section should still serve as a nice opportunity to learn a little more about Genetic Algorithms.

The entire solution is available on GitHub: https://github.com/lukmoda/genetic-algorithm-NFL

Introduction

First of all, let’s talk a little bit about Genetic Algorithms in a general way.

Genetic Algorithms

Genetic Algorithms are a subclass of a larger family of algorithms known as Evolutionary Algorithms. GAs are inspired by Charles Darwin’s concepts of evolution to solve problems of optimization and search in a finite set. Although GAs are not guaranteed to reach the global maximum/minimum, they often provide high-quality solutions to problems that have a very large number of different combinations, like the classical Vehicle Routing Problem (VRP), where one has to find the optimal route out of hundreds, if not thousands, of possibilities. GAs also often have restrictions that limit which solutions are valid; say, for instance, you have to load a truck with orders from clients. Each product has a different value and occupies a different volume. In this case, the GA will try to find the combination of products that maximizes the value you carry inside the truck, without surpassing the volume of the truck.

Basically, Genetic Algorithms utilize evolution concepts such as selection, crossover and mutation to improve randomly generated solutions at each generation, until convergence (or another stopping criterion) is reached. Let’s take a look at GA’s concepts then 😉

Individual, Population, Gene and Chromosome

The Individual is the solution itself: it encodes all the parameters of one candidate answer to the problem. The first step in every GA is to initialize the Population, randomly generating a predefined number of Individuals.

Each Individual is represented by a string of values (usually "0"s and "1"s) that encodes the combination chosen from all the available space: this is a Chromosome (in Python, a Chromosome would be an array of 0s and 1s). The smallest part of the chromosome is a Gene (a single 0 or 1 value, or a single element at a specific index of the array).

A Population of four Individuals (A1, A2, A3 and A4), with its Chromosomes (the arrays) composed of Genes (the values). Image taken from: source
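To make the vocabulary concrete, here is a minimal sketch of a Population of four Individuals, each with a Chromosome of six Genes. The helper names (`random_individual`, `init_population`) and the sizes are illustrative assumptions, not taken from the repository.

```python
import random


def random_individual(n_genes, rng):
    """Return one Chromosome: a list of n_genes random Genes (0 or 1)."""
    return [rng.randint(0, 1) for _ in range(n_genes)]


def init_population(pop_size, n_genes, seed=None):
    """Randomly generate pop_size Individuals (hypothetical helper)."""
    rng = random.Random(seed)
    return [random_individual(n_genes, rng) for _ in range(pop_size)]


population = init_population(pop_size=4, n_genes=6, seed=42)
for name, chromosome in zip(["A1", "A2", "A3", "A4"], population):
    print(name, chromosome)
```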

Fitness and Parent Selection

Since GAs are Optimization techniques, we need a function, metric or something that we want to optimize. This is often called the Fitness Function. In our previous example, our Fitness would be to maximize product value without exceeding the truck space. If the search space is not extremely large and it is well-defined, GAs tend to find near-optimal solutions. Basically, what changes from one algorithm to the other are the inputs (the parameters needed to build Individuals and Chromosomes) and the fitness.

Genetic Algorithms work by combining Individuals with high fitness ("survival of the fittest"), making them produce offspring. In that way, across Generations, the algorithm tends to favor good genes (and chromosomes) and converge to a near-optimal solution.

However, as in life and society, diversity is key. One cannot simply always take the best individuals and combine them, since this produces very similar solutions within very few generations and a loss of diversity (this process is known as Premature Convergence).
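As an illustration, a Fitness Function for the truck example could look like the sketch below. The product values, volumes and capacity are made-up numbers, and returning a very low fitness for overloaded trucks is one common way (assumed here) to handle the restriction.

```python
def truck_fitness(chromosome, values, volumes, capacity):
    """Total value of the loaded products, or 1 if the truck is overloaded."""
    total_volume = sum(vol for gene, vol in zip(chromosome, volumes) if gene)
    if total_volume > capacity:
        return 1  # invalid load: very low fitness instead of discarding it
    return sum(val for gene, val in zip(chromosome, values) if gene)


# Made-up data: four products, truck capacity of 40 volume units.
values = [60, 100, 120, 30]
volumes = [10, 20, 30, 5]

fits = truck_fitness([1, 1, 0, 1], values, volumes, capacity=40)
overloaded = truck_fitness([1, 1, 1, 0], values, volumes, capacity=40)
print(fits)        # volume 35 fits, so fitness is 60 + 100 + 30 = 190
print(overloaded)  # volume 60 exceeds 40, so fitness is 1
```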

With that in mind, the best alternative is to select the parents according to their fitness. That’s called Fitness Proportionate Selection. Individuals with higher fitness have a higher probability of being chosen to be parents – then, fitter Individuals have a better chance to mate and propagate their features to the next generations, but diversity is still preserved. There are a lot of approaches to Parent Selection (including Random Selection), but the most popular is the Roulette Wheel Selection.

In this method, a pie is split into n slices, where n is the number of Individuals in the population. The size of each slice is proportional to the fitness value – then, fitter Individuals have a bigger portion of the pie. The sum S of all fitness values is calculated, then a random number r between 0 and S is generated. Starting from the top of the population, keep summing the fitness values until the partial sum is greater than r. The individual for which this happens is chosen as the parent. Hence, the probability that an Individual is chosen as parent is directly proportional to its fitness. The below diagram (taken from https://www.tutorialspoint.com/genetic_algorithms/genetic_algorithms_parent_selection.htm, where a very detailed explanation of GAs can be found) illustrates the method:

The Roulette Wheel Selection. Image taken from: source
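A minimal sketch of Roulette Wheel Selection: draw a random number between 0 and the sum of all fitness values, then walk through the population until the running sum exceeds it. The function name and the toy population are assumptions for illustration; the sketch also assumes non-negative fitness values with a positive sum.

```python
import random


def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one parent with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if running > pick:
            return individual
    return population[-1]  # guard for the rare case pick == total


rng = random.Random(0)
population = ["A", "B"]
fitnesses = [1, 3]
picks = [roulette_wheel_select(population, fitnesses, rng) for _ in range(4000)]
share_b = picks.count("B") / len(picks)
print(share_b)  # should be close to 3 / (1 + 3) = 0.75
```

In practice the standard library call `random.choices(population, weights=fitnesses)` performs the same kind of weighted draw.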

Crossover and Mutation

Okay, so we have selected the parents. But how exactly do they mate, producing new Individuals for the next generation?

Crossover is just a fancy name for reproduction. As in biology, the parents’ genes are combined to form a new chromosome. As in Parent Selection, there are different ways to perform Crossover in GAs, with One Point Crossover being the most popular.

In this method, a random cutoff point is selected. Then the genes to the right of this point are swapped between Parent A and Parent B, so each child keeps the left part of one parent and receives the right part of the other.

One Point Crossover. Note that the cutoff point doesn't need to split the chromosome in half. Instead, it is a randomly generated value. By combining genes from Parents, Children have different chromosomes. Image taken from: source
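A minimal sketch of One Point Crossover on list-based chromosomes follows. The function name is an assumption, and the parents are assumed to have the same length (at least 2 genes).

```python
import random


def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two parents at a random cutoff point."""
    cut = rng.randint(1, len(parent_a) - 1)  # never cuts at the very ends
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


parent_a = [0] * 8
parent_b = [1] * 8
child_1, child_2 = one_point_crossover(parent_a, parent_b, random.Random(3))
print(child_1)  # zeros up to the cutoff, ones after it
print(child_2)  # the mirror image of child_1
```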

Different Crossover methods can be applied according to the user’s needs. Another very important Genetic Operator is the Mutation.

Mutation is introduced to reproduce the same effect as the term of the same name in biology. Mutation helps the algorithm to explore different solutions, increasing genetic diversity. However, mutation should be introduced with a low probability (usually below 10%), or else the GA is reduced to a random search.

As you might have already guessed, there are a variety of methods to introduce Mutation. Two of the most popular ones are Bit Flip Mutation (select one or more random genes and swap their values – 0 to 1 and vice-versa) and Swap Mutation (select two genes at random and swap their positions in the chromosome). Of course, there are many more methods.

Bit Flip Mutation (top) and Swap Mutation (bottom). Image taken from: source
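Both mutation operators can be sketched in a few lines. The function names and the 5% default rate are illustrative assumptions; each function returns a mutated copy and leaves the input untouched.

```python
import random


def bit_flip_mutation(chromosome, n_flips=1, rng=random):
    """Flip n_flips randomly chosen genes (0 -> 1 and 1 -> 0)."""
    mutated = list(chromosome)
    for i in rng.sample(range(len(mutated)), n_flips):
        mutated[i] = 1 - mutated[i]
    return mutated


def swap_mutation(chromosome, rng=random):
    """Swap the positions of two randomly chosen genes."""
    mutated = list(chromosome)
    i, j = rng.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def maybe_mutate(chromosome, mutation_rate=0.05, rng=random):
    """Apply Bit Flip Mutation only with a low probability."""
    if rng.random() < mutation_rate:
        return bit_flip_mutation(chromosome, rng=rng)
    return list(chromosome)


original = [0, 1, 0, 1, 1, 0, 0, 1]
flipped = bit_flip_mutation(original, n_flips=2, rng=random.Random(0))
swapped = swap_mutation(original, rng=random.Random(0))
print(flipped)
print(swapped)
```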

Survivor Selection and Stopping Criteria

The final steps are to select which Individuals remain in the Population, and when the algorithm should stop. The most common way to select survivors is to replace a certain number of the lowest-fitness Individuals of the Population with the children with the highest fitness. GAs that require that the fittest (or the n fittest) member of the Population propagates to the next generation are said to employ Elitism.

Finally, GAs can be stopped according to three criteria:

  • When a certain number of generations is reached;
  • When the fitness function reaches a certain value;
  • When there has been no improvement in the Population after a certain number of consecutive generations (that is, the solutions are not improving, similar to Early Stopping in Neural Networks).
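The three stopping criteria can be combined in a single check, as in the sketch below. The function name, its parameters and the default values are assumptions for illustration, and the sketch assumes fitness is being maximized.

```python
def should_stop(generation, best_history, max_generations=100,
                target_fitness=None, patience=10):
    """Return True when any of the three stopping criteria is met.

    best_history holds the best fitness of each generation so far,
    most recent last.
    """
    # 1) A certain number of generations is reached.
    if generation >= max_generations:
        return True
    # 2) The fitness function reaches a certain value.
    if (target_fitness is not None and best_history
            and best_history[-1] >= target_fitness):
        return True
    # 3) No improvement over the last `patience` generations.
    if len(best_history) > patience:
        if max(best_history[-patience:]) <= best_history[-patience - 1]:
            return True
    return False


print(should_stop(4, [5, 7, 7, 7, 7], patience=3))  # True: stuck at 7
print(should_stop(3, [5, 6, 7, 8], patience=3))     # False: still improving
```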

Complete Implementation

The complete implementation of a Genetic Algorithm can be seen in the below diagram:

Genetic Algorithm full implementation. Image taken from: source

First of all, the Initial Population is created, with a predefined number N of Individuals. Then, we evaluate the Fitness of each Individual. While the Stopping Criterion is not met, we perform Parent Selection, Crossover and Mutation. Then we evaluate the new Population and replace Individuals according to the Survivor Selection strategy. At each generation we check the Stopping Criterion and, when it is met, we stop the algorithm and return the Best Individual in the Population.
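Putting the pieces together, the loop described above can be sketched on a toy problem: maximizing the number of 1s in a chromosome. Everything here (the `run_ga` name, the parameter values, the toy fitness) is an assumption chosen for brevity, not the NFL implementation; it uses a fixed number of generations as the stopping criterion and replaces the whole population with the offspring.

```python
import random


def one_max(chromosome):
    """Toy fitness: number of 1s, plus 1 so selection weights stay positive."""
    return 1 + sum(chromosome)


def run_ga(fitness_fn, n_genes, pop_size=50, generations=40,
           mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Initial Population: pop_size random Individuals.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness_fn)
    for _ in range(generations):
        fitnesses = [fitness_fn(ind) for ind in population]
        offspring = []
        while len(offspring) < pop_size:
            # Parent Selection: fitness-proportionate (roulette wheel) draw.
            parent_a, parent_b = rng.choices(population, weights=fitnesses, k=2)
            # One Point Crossover.
            cut = rng.randint(1, n_genes - 1)
            children = (parent_a[:cut] + parent_b[cut:],
                        parent_b[:cut] + parent_a[cut:])
            for child in children:
                # Bit Flip Mutation with a low probability.
                if rng.random() < mutation_rate:
                    i = rng.randrange(n_genes)
                    child[i] = 1 - child[i]
                offspring.append(child)
        # Survivor Selection: the offspring replace the whole population.
        population = offspring[:pop_size]
        # Keep track of the best Individual seen so far.
        best = max(population + [best], key=fitness_fn)
    return best


best = run_ga(one_max, n_genes=20)
print(best, one_max(best))
```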

Methodology

The Problem: NFL Team Building

The NFL is a little different from other sports and leagues. In contrast to soccer, for instance, there are no heavyweights that have way more money to hire and pay the best players (like Real Madrid, PSG, Bayern, Barcelona, etc). There is no relegation as well.

The NFL is composed of 32 franchises, and all of them must respect the salary cap – this means that all franchises have the same amount of money to spread around their players’ contracts. There is no "buying" players as well. Players enter the NFL through the Draft: An ordered, 7 round selection process where the best College players make the transition to the league. The team with the worst record in the previous year holds the first pick of each round, while the champion holds the 32nd pick. This mechanism ensures that the worst teams can recruit the best young players to help them improve, which brings balance to the league (of course, you have to know how and who to draft!).

A player can get out of a team if his contract expires (he then becomes a Free Agent and can sign with any franchise) or if he is traded – we can have a player-for-player swap or a player-for-draft-picks deal (or player+draft picks as well). In fact, although the Draft has 7 rounds, a team can have more or fewer than 7 selections, depending on trades it has made. It can have no 1st round pick as well (Texans and Rams, I’m looking at you). Trades play a pivotal role in team building in the NFL, and this is a strategy that has been growing in recent years (Amari Cooper to the Cowboys for 2019 1st round pick; DeForest Buckner to the Colts for 2020 1st round pick; Jalen Ramsey to the Rams for 2020 AND 2021 1st round picks; and the ABSURD robbery of DeAndre Hopkins to the Cardinals for a 2020 second rounder and an aging David Johnson with a MASSIVE contract and a long injury history – seriously, Houston, WHAT WERE YOU THINKING?).

Houston, how do you let that guy walk out of the building??? Image taken from: source

The fact is that team building in the NFL is a delicate art. It is a complicated balance between talent and money. There is no secret formula, and each General Manager approaches it in a different way. There are those that stockpile proven veterans and don’t care about not having first-rounders or many draft picks at all (like the Rams); there are those that build a young core and like to develop the team through good drafts (like the Colts); and there are those that have no idea what they are doing, signing the wrong players to massive contracts while giving away valuable picks (OMG HOUSTON) or being absolute duds on draft day, consistently picking in the top 10 and consistently selecting the wrong guys (Jaguars, Browns, Lions…).

Having a high draft pick (or a lot of picks) doesn’t always translate to success. A lot of factors influence the success of players, from coaching and team culture to fitness in the team scheme or even the city’s weather. Being able to identify "hidden gems" in the later rounds and managing to develop "raw" talent into solid players is also as important as being financially responsible (and it is what made the Patriots dynasty possible).

With all that being said, it has become pretty clear that building a successful team and program in the NFL is reaaaaally hard. Sometimes, you have to make hard decisions if you hand 30 million to your Quarterback. Sometimes, you can’t keep everyone around at their asking price. This is why people always urge the teams to do everything they can to win a championship when their Quarterbacks are on their rookie deals (rookie deals last 4 or potentially 5 years and are much cheaper) – like the Chiefs in 2019 with Patrick Mahomes and the Seahawks in 2013 with Russell Wilson. When you have to pay the big bucks to your star QB, there is less money to build talent around him.

Despite all the QB praise, we have seen teams with average QB play being carried to a title by FANTASTIC defenses (1985 Bears, 2000 Ravens, 2002 Buccaneers and even the 2015 Broncos in Peyton Manning’s dramatic final year). Others, like the 1999 "Greatest Show on Turf" Rams, were led by a high-scoring offense full of playmakers. The Washington Football Team won 3 Super Bowls in a 10 year span between 1982 and 1991 with 3 different (and average) Quarterbacks, instead relying on a very powerful run game. The 2017 Eagles lost their star QB (and MVP front-runner) just a few games before the playoffs and still won the title against the mighty Tom Brady’s Patriots – carried by great coaching, very solid lines and a little bit of Foles magic. Those same Patriots were on the verge of the first perfect 19-0 season in 2007, but fell short in the Super Bowl to the Giants – who had won only 10 games and barely made it to the postseason. In that game, the Giants defensive line smothered Brady and slowed that historic offense, being the key (along with some Eli-to-Tyree magic) to the huge upset.

Never forget Philly Special! Image taken from: source

My point is: not all champions are built equal. And that’s what drove me to try to answer that question: how can one build the perfect NFL roster? How can one maximize talent and still be under the salary cap? That’s what we will see in the Genetic Algorithm implementation.

Data Sources

For this study I used two databases. I needed some kind of rating to decide who are the better players. For that, I used a spreadsheet (https://www.reddit.com/r/Madden/comments/ht8a04/madden_21_player_ratings_spreadsheet/) containing Madden 21 ratings for all the players in the game at the start of the 2020 season. I also thought about using PFF (Pro Football Focus) grades, but I couldn’t find a database and there was also the problem that rookies and seldom used reserves don’t have grades.

I also needed the salaries. For that I used Over The Cap (https://overthecap.com/contracts/). A caveat is that I used the Average/Year of the contract, not the Cap Hit of 2020 (I couldn’t find that number for all players). In practice, GMs manage huge contracts by diluting their value over the years (Mahomes’ record-breaking 10 year, $450 million contract, for instance, is $45 million on average/year, but the QB is due "only" $24.8 million in 2021 and a whopping $59.95 million in 2027). Also, the salary cap has increased every year (we don’t know about 2021 because of the pandemic), adding roughly $10 million each year from 2013 until now. To account for that effect I used a slightly higher salary cap in the simulations ($220 million as opposed to the actual $198.2 million), to make up for that contract-managing flexibility.

Salaries reflect players under contract by Week 9 of the 2020 NFL season (the time the study was developed), so there are players that won’t appear in the simulations. Some notable names (with their Madden Overall Rating in parentheses) left out are: Earl Thomas (88), Gerald McCoy (85), Derrius Guice (81), Jonathan Joseph (81), Mohamed Sanu (79), Dontari Poe (78), Ha Ha Clinton-Dix (78), Prince Amukamara (78), Dante Pettis (77), Quincy Enunwa (77) and Eli Apple (75).

Restrictions

Our Individuals consist of 53 players randomly chosen from the 1746 available. The chromosome is an array of size 1746 with 53 "1" values.
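A minimal sketch of how such an Individual could be generated: pick 53 distinct player indices and set those genes to 1. The constant and function names are assumptions, not the names used in the repository.

```python
import random

N_PLAYERS = 1746   # players available in the joined dataset
ROSTER_SIZE = 53   # genes that must be set to 1


def random_roster_chromosome(rng=random):
    """Return an array of size 1746 with exactly 53 "1" values."""
    chosen = set(rng.sample(range(N_PLAYERS), ROSTER_SIZE))
    return [1 if i in chosen else 0 for i in range(N_PLAYERS)]


chromosome = random_roster_chromosome(random.Random(7))
print(len(chromosome), sum(chromosome))  # 1746 53
```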

The first restriction is that the sum of the 53 players’ salaries can’t be higher than the salary cap. In practice, when developing a Genetic Algorithm, you don’t throw away an Individual that doesn’t satisfy restrictions; instead, you give it a very low fitness value, which makes it hard for it to mate and propagate its genes through generations.

The second restriction regards the positions of the 53 man roster. The players chosen must have a plausible position distribution – it doesn’t make sense, for instance, to build a roster with 8 QBs, or 2 LBs, with 18 WRs, etc. Every valid solution (otherwise, Individuals receive a very low fitness value) should have:

  • 2 or 3 Quarterbacks;
  • 2 or 3 Centers;
  • Between 5 and 8 Wide Receivers;
  • Between 2 and 5 Tight Ends;
  • Between 3 and 5 Running Backs;
  • Between 3 and 5 Guards;
  • Between 3 and 5 Tackles;
  • Between 6 and 9 Defensive Linemen;
  • Between 6 and 10 Linebackers;
  • Between 7 and 11 Defensive Backs;
  • No more than 1 Kicker;
  • No more than 1 Punter.
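The position limits above translate naturally into a lookup table of (minimum, maximum) counts. In this sketch the position abbreviations and the function name are assumptions; positions that are not in the table (such as Fullbacks) are simply left unrestricted.

```python
from collections import Counter

# (minimum, maximum) number of players allowed per position group.
POSITION_LIMITS = {
    "QB": (2, 3), "C": (2, 3), "WR": (5, 8), "TE": (2, 5), "RB": (3, 5),
    "G": (3, 5), "T": (3, 5), "DL": (6, 9), "LB": (6, 10), "DB": (7, 11),
    "K": (0, 1), "P": (0, 1),
}


def positions_are_valid(roster_positions):
    """Check a list of 53 position labels against POSITION_LIMITS."""
    counts = Counter(roster_positions)
    return all(low <= counts.get(position, 0) <= high
               for position, (low, high) in POSITION_LIMITS.items())


valid_roster = (["QB"] * 3 + ["C"] * 2 + ["WR"] * 6 + ["TE"] * 3 + ["RB"] * 4
                + ["G"] * 4 + ["T"] * 4 + ["DL"] * 8 + ["LB"] * 8 + ["DB"] * 9
                + ["K"] + ["P"])
too_many_qbs = ["QB"] * 5 + valid_roster[3:]

print(len(valid_roster), positions_are_valid(valid_roster))  # 53 True
print(positions_are_valid(too_many_qbs))                     # False
```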

Since there are very few Punters and Kickers in the player pool, I did not require the solution to include these positions (Fullbacks as well). However, there can’t be a team with 2 (or more) Kickers or 2 (or more) Punters. I made some simplifications as well:

  • Did not differentiate between Right Tackles and Left Tackles;
  • Did not differentiate between Defensive Tackles and Defensive Ends;
  • Did not differentiate between Inside Linebackers and Outside Linebackers;
  • Did not differentiate between Cornerback, Strong Safety and Free Safety.

These simplifications were made purely for computational reasons. The more specific you are, the fewer Individuals in your initial population will meet your criteria. Thus, to have good genetic variability, you would need to have a large number of Individuals, increasing the time and computational power. Also, defenders have a certain flexibility and there is the difference between 3–4 and 4–3 schemes, so it is not always simple to tag a player as a very specific position (what is Isaiah Simmons’ position, for instance?). I did some tests and reached a compromise by grouping the positions in the aforementioned way.

After conducting more tests and peeking at the teams selected, I noted a few more tweaks were necessary, as follows:

  • The backup Quarterback can’t have more than 77 overall (I noticed a solution had Drew Brees and Matthew Stafford, which makes 0 sense);
  • The backup Center can’t have more than 75 overall (a solution had Jason Kelce and Rodney Hudson. The Center and Quarterback are two very specific, "solitary" positions, so there is not a lot of switching. Thus, it doesn’t make sense to have two star Centers);
  • The 3rd Running Back can’t have more than 78 overall (one-two punches in the backfield such as Chubb-Hunt in Cleveland and Gordon-Lindsay in Denver are allowed, but I saw a solution that had Aaron Jones, Todd Gurley and Joe Mixon…)

These are the restrictions. Now, let’s see the definition of fitness.

Evaluation and Strategies

At first, I thought of the fitness function to simply be the average Madden 21 overall rating of all the 53 players chosen. I noticed, however, that this doesn’t necessarily translate to the best possible team. If you take the simple average from all players, you are giving the same weight to your starting QB and your 53rd player (like, your 7th LB that only plays on special teams).

Can you imagine where the Bears would be if they had drafted Watson or Mahomes? Image taken from: source

Yes, you will have the most balanced 53 man roster, but is it really worth focusing on having a good 3rd stringer, rather than an elite starter? With that in mind, and remembering the different strategies employed by previous champions, I realized that the fitness value depends on the strategy of the General Manager! That is, your approach to team building influences which players and which positions you value most (and thus, should have higher weights). Then I built not one but six different fitness functions, each representing a different approach to team building. In practice, each strategy attributes a different weight to certain positions, and then the fitness function is calculated using a weighted average. The strategies are:

  • Balanced: Tries to assemble the most well-rounded 53 man roster. Every player has the same weight (1);
  • Elite QB: Focus on having one of the top QBs of the league; the rest of the team builds around him. The starting QB has weight 10;
  • Playmakers: Focus on having a solid QB and a very good core of skill players. Starting QB has weight 5 and top 3 WR, top HB and top TE have weight 8;
  • Defense: Focus on having a top-caliber defense (using a 4–3 system). Top 4 DL, top 3 LB and top 4 DB have weight 5;
  • Trenches: Focus on having top-caliber Offensive Line and Defensive Line. Top C, top 2 G, top 2 T and top 4 DL have weight 5;
  • Starters: Focus on having the best 22 starters possible (based on a 4–3 scheme and 11 personnel on offense). All starters have weight 5, except for the QB (weight 8).
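The six strategies can be sketched as a table mapping each position to (how many top players get the special weight, the weight itself). This is an assumed reading of the list above: "top" is taken to mean the highest-rated players at that position, everyone else gets weight 1, and the names `STRATEGIES` and `weighted_fitness` are hypothetical.

```python
STARTERS = {pos: (n, 5) for pos, n in
            {"RB": 1, "WR": 3, "TE": 1, "T": 2, "G": 2, "C": 1,
             "DL": 4, "LB": 3, "DB": 4}.items()}
STARTERS["QB"] = (1, 8)

# position -> (number of top players at that position, their weight)
STRATEGIES = {
    "balanced": {},
    "elite_qb": {"QB": (1, 10)},
    "playmakers": {"QB": (1, 5), "WR": (3, 8), "RB": (1, 8), "TE": (1, 8)},
    "defense": {"DL": (4, 5), "LB": (3, 5), "DB": (4, 5)},
    "trenches": {"C": (1, 5), "G": (2, 5), "T": (2, 5), "DL": (4, 5)},
    "starters": STARTERS,
}


def weighted_fitness(roster, strategy):
    """Weighted average overall; roster is a list of (position, overall)."""
    rules = STRATEGIES[strategy]
    by_position = {}
    for position, overall in roster:
        by_position.setdefault(position, []).append(overall)
    weighted_sum = 0.0
    total_weight = 0.0
    for position, ratings in by_position.items():
        top_n, weight = rules.get(position, (0, 1))
        for rank, overall in enumerate(sorted(ratings, reverse=True)):
            w = weight if rank < top_n else 1
            weighted_sum += w * overall
            total_weight += w
    return weighted_sum / total_weight


tiny_roster = [("QB", 94), ("QB", 70), ("WR", 80)]
balanced = weighted_fitness(tiny_roster, "balanced")
elite_qb = weighted_fitness(tiny_roster, "elite_qb")
print(balanced)  # simple average: (94 + 70 + 80) / 3
print(elite_qb)  # starting QB weighs 10: (940 + 70 + 80) / 12
```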

Simulations

First of all, before joining the Madden 21 ratings and Over The Cap databases (on player name and team), there was some cleaning needed (like removing Jrs, III, IV and manually fixing the team of traded players like Jamal Adams and Desmond King).

The Individual class has the following attributes: number of players (1746), roster space (53), cap limit ($220 million), strategy, generation, grade, cap used, positions (array), ratings (array) and salaries (array). The Individual evaluation method checks whether the sum of salaries stays within the cap limit and whether the positions meet the restrictions – if either check fails, the grade is 1. Otherwise, the grade is calculated according to the strategy chosen.
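The evaluation logic can be sketched as a standalone function. This is not the repository's Individual class: the function signature, the `positions_ok` callback and the use of a simple average (the Balanced strategy) are assumptions made to keep the example short, and it assumes at least one player is selected.

```python
CAP_LIMIT = 220_000_000  # cap used in the simulations


def evaluate(chromosome, salaries, ratings,
             positions_ok=lambda picked: True, cap_limit=CAP_LIMIT):
    """Return (grade, cap_used) for one chromosome.

    positions_ok receives the indices of the selected players and should
    return False when the position restrictions are violated.
    """
    picked = [i for i, gene in enumerate(chromosome) if gene == 1]
    cap_used = sum(salaries[i] for i in picked)
    if cap_used > cap_limit or not positions_ok(picked):
        return 1.0, cap_used  # invalid roster: very low grade
    grade = sum(ratings[i] for i in picked) / len(picked)
    return grade, cap_used


# Made-up three-player pool, just to exercise both branches.
salaries = [100, 50, 30]
ratings = [90, 80, 70]
print(evaluate([1, 1, 0], salaries, ratings, cap_limit=200))  # (85.0, 150)
print(evaluate([1, 1, 0], salaries, ratings, cap_limit=120))  # (1.0, 150)
```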

I used one-point crossover, but after the operation I needed to check if the chromosome still had 53 "1s" (because depending on the cutoff point there could be more or less than 53 players chosen, which I didn’t allow). I then adjusted so that children always had 53 players.
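The text does not say how the children were adjusted, so the sketch below shows one possible repair step (an assumption, not necessarily what the repository does): randomly drop surplus players or add missing ones until the child has exactly 53.

```python
import random


def repair_roster_size(child, roster_size=53, rng=random):
    """Return a copy of child with exactly roster_size genes set to 1."""
    repaired = list(child)
    ones = [i for i, gene in enumerate(repaired) if gene == 1]
    zeros = [i for i, gene in enumerate(repaired) if gene == 0]
    if len(ones) > roster_size:
        # Too many players: turn random surplus 1s into 0s.
        for i in rng.sample(ones, len(ones) - roster_size):
            repaired[i] = 0
    elif len(ones) < roster_size:
        # Too few players: turn random 0s into 1s.
        for i in rng.sample(zeros, roster_size - len(ones)):
            repaired[i] = 1
    return repaired


too_many = [1] * 60 + [0] * 140
too_few = [1] * 40 + [0] * 160
print(sum(repair_roster_size(too_many, rng=random.Random(0))))  # 53
print(sum(repair_roster_size(too_few, rng=random.Random(0))))   # 53
```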

Mutation occurs if a randomly generated number is smaller than a defined constant named mutation rate. I applied Bit Flip Mutation to a number n of genes defined by another constant named mutation power. To ensure that solutions have 53 players as well, n genes with 0 value are randomly chosen and turned to 1, and vice-versa for genes with 1 value.

The GA (Genetic Algorithm) class is initialized by passing the population size as argument. Its fit method initializes the population, calculates fitness for every Individual, and for each generation (I adopted the number of generations as stopping criterion) parents are selected through the Roulette Wheel Selection and generate two children (applying crossover and mutation); the whole population is substituted by the offspring. The new population is then re-evaluated and the Individual with the best fitness is stored in the _bestindividual attribute.
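A minimal sketch of this count-preserving mutation follows. The function and parameter names are assumptions; the defaults mirror the 5% mutation rate and mutation power of 20 used in the simulations.

```python
import random


def roster_mutation(chromosome, mutation_rate=0.05, mutation_power=20,
                    rng=random):
    """Flip n genes from 1 to 0 and n genes from 0 to 1, keeping the count."""
    mutated = list(chromosome)
    if rng.random() >= mutation_rate:
        return mutated  # no mutation this time
    ones = [i for i, gene in enumerate(mutated) if gene == 1]
    zeros = [i for i, gene in enumerate(mutated) if gene == 0]
    n = min(mutation_power, len(ones), len(zeros))
    for i in rng.sample(ones, n):
        mutated[i] = 0  # these players leave the roster
    for i in rng.sample(zeros, n):
        mutated[i] = 1  # these players join the roster
    return mutated


roster = [1] * 53 + [0] * (1746 - 53)
# mutation_rate=1.0 forces the mutation so the effect is visible.
mutant = roster_mutation(roster, mutation_rate=1.0, rng=random.Random(1))
print(sum(mutant))                                   # still 53 players
print(sum(a != b for a, b in zip(roster, mutant)))   # 40 genes changed
```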

I ran 600 simulations, 100 for each strategy, so there are 600 rosters. If you calculate how many 53 man rosters you can build with 1746 players, you will get something on the order of 10^101 combinations! However, since we have a lot of restrictions on positions, this number is substantially smaller. Even so, 600 simulations are not enough to explore the whole search space and thus the solutions are sub-optimal. Every simulation had a population size of 1000 Individuals, a 5% mutation rate, a mutation power of 20 and ran for 70 generations. They were executed on an AWS c5d.large EC2 instance.
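The size of the unrestricted search space is the number of ways to choose 53 players out of 1746, which can be checked directly with the standard library (Python 3.8+):

```python
import math

# Number of distinct 53-player subsets of a 1746-player pool,
# ignoring the salary cap and position restrictions.
n_rosters = math.comb(1746, 53)
print(f"{n_rosters:.2e}")  # a number with more than 100 digits
```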

Whew! That was a lot… Now to the fun part, we can finally see the results 🙂

Results

Top and Bottom Teams

We can start by looking at which franchises had the most representatives in the squads formed – note that if the players were uniformly distributed across teams, we would expect to see roughly 994 players per team (600 simulations * 53 players / 32 teams = 993.75).

Top 10 Teams considering all 600 simulations. Image by the author.

Looking at all solutions, playoff caliber teams like the Ravens, Patriots (don’t forget they had a lot of COVID-19 opt-outs), 49ers, Seahawks and Saints round out the top 5. The Colts, Raiders and Chargers have well-rounded rosters, so Washington and the Lions were a bit of a surprise. Let’s see what changes when we consider the Starters scenario:

Top 10 Teams considering only Starters simulations. Image by the author.

Now this makes a little more sense. The Lions are still there (perhaps this is a sign that the roster is not as bad as the record indicates and Matt Patricia really needs to go. There’s a chance he already has by the time you’re reading this), but now we have the Chiefs, Packers and Broncos (they look bad but they have had terrible luck with injuries) as well. Let’s check what changes if we look at the defenses:

Top 10 Teams considering only Defense simulations. Image by the author.

It makes a lot of sense. The Steelers, Rams, Bears, Bills and Vikings all boast good (if not great) defenses. The Seahawks defense has been historically bad in 2020, but hey… They have Bobby Wagner, Jamal Adams and KJ Wright. Perhaps the biggest surprise is the Washington FT, led by its monstrous defensive line. The Colts and Buccaneers appeared at positions 20 and 24 respectively, which was also a huge surprise given how well they’ve been playing, and may be a sign that Madden ratings underestimated their potential. And how about the Playmakers?

Top 10 Teams considering only Playmakers simulations. Image by the author.

Here the new teams are the Chiefs, Bills, Buccaneers and Cowboys, which all make sense given how many good playmakers they have. Now on to the Trenches:

Top 10 Teams considering only Trenches simulations. Image by the author.

Here the top 3 teams look very different! The Colts’ and Raiders’ excellent Offensive Lines make them soar, while WFT’s Defensive Line also shines. The Browns (Myles Garrett and a very solid OL), Eagles (an excellent OL when healthy and Fletcher Cox and others on the DL) and Bears (Akiem Hicks, Eddie Goldman and the DL) make an appearance as well.

In the Elite QB simulations the teams are the same as in General, with the Chiefs (led by Mahomes) taking the Saints’ spot and the Seahawks (led by Wilson) taking the first place.

Now looking at the bottom 10 teams I will just show the General and write the rest:

Bottom 10 Teams considering all 600 simulations. Image by the author.

Some teams are notoriously bad and have many roster holes – like the Panthers, Giants, Texans and Jets. I can understand the Dolphins being there as well, since Brian Flores is coaching a bunch of "no ones" into a very tough, disciplined team that turns low Madden ratings into a lot of wins. Despite having a lot of playmakers and Matt Ryan on offense, Atlanta’s inept defense justifies their spot as well. The Cardinals, Titans and Packers were a surprise to me, but maybe this speaks more about their roster depth than their starting quality.

When we look at the Defense, we have the Chiefs, Giants, Cowboys, Eagles, Browns, Jaguars and Bengals occupying positions 31 to 25, which is not a surprise given their play in the 2020 season. Despite having Aaron Rodgers, the Packers were the 22nd team in the Playmakers strategy (indicating that Green Bay indeed needs weapons other than Aaron Jones and Davante Adams).

Now that we have analyzed the teams, let’s analyze the players!

Top Players

First of all, I have to note that I don’t agree with all Madden Ratings. For instance, Lamar Jackson is an outrageous 94, well above Kyler Murray (77) and rookie sensations Joe Burrow (76) and Justin Herbert (70). In the GitHub repo you can change the overalls to whatever you want and re-run the simulations (you can change the weights for each strategy as well). However, I needed a measure of performance and the ratings are "independent" from my beliefs, so I left them untouched for this study. Given all that, here are the top 30 players that appeared in the most squads considering all the 600 simulations:

Top 30 players considering all 600 simulations. Image by the author.

With a 94 overall and playing on a cheap rookie contract, it’s no surprise that Lamar Jackson is number 1. Other young players like Quenton Nelson, Ryan Ramczyk, Fred Warner, Orlando Brown, Chris Godwin, Calvin Ridley, Derwin James, Josh Allen and Courtland Sutton are present as well. Veteran Calais Campbell, with 95 overall and a $10 million contract, surges as number 2. Superstar Saints WR Michael Thomas rounds out the top 5 with his 99 overall. Other stars with big contracts like Cameron Jordan, Bobby Wagner, George Kittle, Tyreek Hill, JJ Watt, Zack Martin and David Bakhtiari made the top 30 as well. Perhaps the biggest surprise is pass-catching veteran RB Chris Thompson, who is playing on a cheap $1.4 million contract for the Jaguars and boasts a respectable 76 overall.

What if we see the best players for the Defense strategy?

Top 30 players considering only Defense simulations. Image by the author.

Here we can clearly see the impact of the strategy in team building. 29 of the top 30 players are superstar defenders, from Calais Campbell, Bobby Wagner, JJ Watt, Chandler Jones, Khalil Mack and Nick Bosa to Demario Davis, Tyrann Mathieu, Deion Jones, Eric Kendricks and Justin Simmons. That’s nice! Can you already tell which players will appear on the Trenches strategy?

Top 30 players considering only Trenches simulations. Image by the author.

Here are the big guys! Important to note that the Offensive Linemen have been more predominant than the Defensive Linemen, which is a testament to the scarcity of top-notch OL (Brooks, Martin, Schwartz, Bakhtiari, Incognito, Nelson, Armstead, …) in the league. Now, on to the Playmakers:

Top 30 players considering only Playmakers simulations. Image by the author.

Well, who doesn’t want an offense with MT, Kittle, Julio Jones, DHop, Tyreek Hill, Kelce, Gronk, Davante Adams, CMC, Nick Chubb, Diggs, TY Hilton, Amari Cooper?

Now that we have seen the players, let’s take a peek at the top Quarterbacks.

Top QBs

First, let’s look at the general picture:

QBs considering all 600 simulations. Image by the author.

Unsurprisingly, Lamar is in first place, followed by Mahomes and Wilson. Brees is number 4, Brady is number 12, Rodgers is number 16 and Matt Ryan comes at 17. Inside the top 10 we see some contested names such as Mayfield (5), Cam Newton (aided by his cheap contract, at 6), Dwayne Haskins (7) and Jameis Winston (also on a $1 million contract, at 9). The great Trubisky comes at 11. Big Ben (23), Dak Prescott (30), Matthew Stafford (42), Deshaun Watson (45) and Ryan Tannehill (53) were big surprises. But what if we analyze the QBs of the Elite QB strategy?

QBs considering only Elite QB simulations. Image by the author.

Out of the 100 simulations, there were only 13 distinct QBs. And frankly, all of them are at least capable (Cousins, Jimmy G, Cam and Wentz). Lamar (27 simulations), Wilson (26 simulations) and Mahomes (22 simulations), however, together account for 75% of the simulations. Rodgers (6), Brees (6) and Brady (3) come right behind. All in all, it shows that the strategy works. The Playmakers strategy also puts a higher weight on the starting QB. Let’s see:

QBs considering only Playmakers simulations. Image by the author.

Now there are 40 distinct QBs for the 100 simulations. The scenario is a little better than the general picture, but not as good as the Elite QB. Lamar (11) and Kyler Murray (7) make a solid top 2, but after them we have Cam (6), Winston (5), Brady (4), Wilson (4), Tua Tagovailoa (4), Drew Lock (4) and Haskins (4). Mahomes, Rivers and Brees appear in 3 simulations each. Dak and Matt Ryan show up in 2, while Rodgers, Watson and Big Ben only appear in one simulation each. Guys like Jeff Driskel, Chase Daniel, Will Grier, Brett Hundley and even BLAINE GABBERT each found a spot on one unfortunate team as well.

It has become clear from the names how the strategy influences which players will form your roster. So, can you imagine the QBs that appear in the Defense strategy?

QBs considering only Defense simulations. Image by the author.

Drew Lock (6) and Tyrod Taylor (5) are 1st and 2nd. Lamar and Big Ben share the third position with 4 simulations each (followed by Rivers, Burrow and Herbert with 3), but after that there is not much hope. Looking at the cloud, not many great names pop out; instead, we see the likes of Flacco, Daniel Jones, Kyle Allen, Mullens, Mayfield… None of Mahomes, Wilson, Brees, Brady or Rodgers appears a single time. This is strong evidence that it is practically impossible to build an all-time defense and still have a high-paid, elite QB. Instead, defense-centered teams should focus on winning with a young star on his cheap rookie contract, or sign a savvy veteran who can manage the game without hurting the team. Looking at the Trenches strategy, the trend is the same:

QBs considering only Trenches simulations. Image by the author.

I hope it is now clear how strategy impacts team building. Of course, one can change the weights, add other rules or try other strategies. I just gave the blueprint. Now, let’s quickly comment on the convergence of the simulations.

Generations Convergence

When we analyze at which generation the best solution was found, we get an idea of the algorithm’s convergence. Ideally, the fitness value of the best solution should grow over time, and the best solution should be found in later generations. Since our problem is complex and we barely scratched the surface of the search space, that didn’t happen all the time. Instead, we had four different scenarios, which are depicted in this figure:

Four different scenarios of Generation Convergence. These are four examples of four different simulations of the Starters strategy. Image by the author.

At the top left we have Stability. Those are bad-luck simulations in which none of the 1000 Individuals of the initial population met the restrictions, so all of them received a grade of 1. Acceptable solutions were only reached through mutation, so the population had very low genetic diversity, which is why the fitness value of the best Individual basically doesn’t change over time. To mitigate that, one could check the restrictions before creating the initial population (and only initialize the GA once 1000 valid Individuals have been created) or use a larger population (say, 10,000 Individuals). However, as I already mentioned, running with 1000 Individuals was already quite expensive in terms of time and computing power. If more resources are available, though, a larger initial population is recommended.
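
The first mitigation (only starting the GA once enough valid Individuals exist) can be sketched with simple rejection sampling. This is a minimal illustration, not the code from the repository: `random_individual`, `is_valid`, the toy player pool, the roster size and the cap value are all hypothetical stand-ins, and only the salary cap is checked.

```python
import random

# Hypothetical toy pool: (name, salary in $ millions). Not real data.
PLAYER_POOL = [(f"player_{i}", 1 + (i % 10)) for i in range(200)]
SALARY_CAP = 40   # assumed toy cap, in $ millions
ROSTER_SIZE = 8   # toy roster size (the article uses 53)


def random_individual(rng):
    """Draw a random roster (one Individual) without repeating players."""
    return rng.sample(PLAYER_POOL, ROSTER_SIZE)


def is_valid(individual):
    """Stand-in restriction check: only the salary cap is enforced here."""
    return sum(salary for _, salary in individual) <= SALARY_CAP


def valid_initial_population(size, rng, max_tries=1_000_000):
    """Keep sampling until `size` valid Individuals have been collected."""
    population = []
    for _ in range(max_tries):
        candidate = random_individual(rng)
        if is_valid(candidate):
            population.append(candidate)
            if len(population) == size:
                return population
    raise RuntimeError("could not build enough valid Individuals")


population = valid_initial_population(size=50, rng=random.Random(0))
```

The `max_tries` guard matters: when the restrictions are tight, most random rosters are rejected, and this approach trades the Stability problem for a slower start.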

In the top right we have Early Convergence. The best solution occurs in the first generations and then there is a sudden drop-off, and the algorithm never finds a solution that good again. My guess is that those simulations are affected by bad luck in mutation.

In the bottom left we have Oscillating. Solutions improve and then get worse and get back up again, never quite establishing a pattern. Those simulations (which were the least frequent of the four scenarios) tended to retrieve the best solution in the middle generations.

And finally, in the bottom right we have Constant Improvement. That’s the scenario we were hoping to find and, thankfully, it is the most frequent. Solutions tend to appear in later generations. If a deeper analysis identifies the factors leading to this type of convergence, more simulations can be conducted (and for a larger number of generations).
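
Locating the generation in which the best solution appeared only requires the per-generation history of the best fitness. Here is a minimal sketch, assuming that history is kept as a plain list; the function names and the toy histories are hypothetical and not taken from the simulations.

```python
def best_generation(best_fitness_per_generation):
    """Return (generation index, fitness) of the first best value in the run."""
    best_index = max(
        range(len(best_fitness_per_generation)),
        key=lambda g: best_fitness_per_generation[g],
    )
    return best_index, best_fitness_per_generation[best_index]


def relative_position(best_fitness_per_generation):
    """0.0 means the best solution came in the first generation, 1.0 in the last."""
    index, _ = best_generation(best_fitness_per_generation)
    last = len(best_fitness_per_generation) - 1
    return index / last if last > 0 else 0.0


# Toy histories, made up for illustration only.
early = [90.0, 95.0, 80.0, 78.0, 79.0]   # peak early, then a drop-off
late = [70.0, 74.0, 79.0, 83.0, 88.0]    # steady improvement
```

A relative position near 0 would point to something like Early Convergence, while values near 1 are what Constant Improvement looks like. Telling Stability and Oscillating apart would need more than this single number.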

I hope that at this point you have a fairly good understanding of Genetic Algorithms and what I did. It is a fine art and there is a lot of parameter tuning and computational complication involved, so this work is by no means definitive; it’s a starting point.

We are almost done! Before checking some of the solutions and marveling at the rosters your team could’ve had, let’s see the relations between different types of Overalls.

Overalls

I also tried to compare solutions according to different calculations of the Average Overall. The four different Overalls are:

  • Balanced: the average overall across all 53 players;
  • Defense: the average overall of the starting 11 defenders;
  • Offense: the average overall of the 11 starting offensive players + the 2nd Running Back and the 4th Wide Receiver (since Offenses have a little personnel variability);
  • Starters: the average overall of the starting defense and starting offense.
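
These four definitions can be written down compactly. The sketch below assumes a roster is a list of dicts with hypothetical keys `overall`, `side` ("offense" or "defense") and `role` ("starter", "rb2", "wr4" or "bench"). The actual project may represent rosters differently.

```python
from statistics import mean


def balanced_overall(roster):
    """Average overall across every player on the roster."""
    return mean(p["overall"] for p in roster)


def defense_overall(roster):
    """Average overall of the starting defenders."""
    return mean(p["overall"] for p in roster
                if p["side"] == "defense" and p["role"] == "starter")


def offense_overall(roster):
    """Average of the offensive starters plus the 2nd RB and the 4th WR."""
    return mean(p["overall"] for p in roster
                if p["side"] == "offense" and p["role"] in ("starter", "rb2", "wr4"))


def starters_overall(roster):
    """Average overall of the starting offense and starting defense together."""
    return mean(p["overall"] for p in roster if p["role"] == "starter")


# Tiny made-up roster (4 players instead of 53), for illustration only.
toy_roster = [
    {"overall": 90, "side": "offense", "role": "starter"},
    {"overall": 80, "side": "offense", "role": "rb2"},
    {"overall": 70, "side": "defense", "role": "starter"},
    {"overall": 60, "side": "defense", "role": "bench"},
]
```

Note that the 2nd Running Back counts toward the Offense Overall but not toward the Starters Overall, so the two numbers can move independently.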

Let’s see the overall distribution of the 100 simulations for each strategy. We will be seeing one distribution plot for each Overall, starting with the Starters Overall:

Distribution of Starters Overall by Strategy. Image by the author.

As expected, the Starters strategy has the best Starters overall, ranging from 78 to almost 85 (with an average of 82). All other strategies have similar distributions with means around 80, and produced solutions with lower ratings (down to almost 75).

What happens when we compare strategies considering the Defense Overall?

Distribution of Defense Overall by Strategy. Image by the author.

To no one’s surprise, the Defense strategy resulted in a higher Defense Overall (from 80 to almost 90!). The Starters strategy came second, producing defense overalls from 75 up to 88, but with a mean roughly 5 points lower than the Defense strategy. Most importantly, note that the offense-focused strategies (Elite QB and Playmakers) had the worst Defense Overall distributions, with means of 80 and 78 and solutions going all the way down to 73. This is a clear indication of how hard it is to build a stellar defense and a stellar offense at the same time.

You can already imagine what the plot for Offense Overall is like, right?

Distribution of Offense Overall by Strategy. Image by the author.

Although the Playmakers and Elite QB strategies lead the way (with ratings near 86), their distributions are closer to the Starters (and Trenches). The Defense strategy tends to produce a much lower Offense Overall, reaching values below 70!!! Its mean (75) is also more than 5 points below the offense-based strategies. There is clearly a contrast between Offense and Defense in team building! Furthermore, the defense overall seems to impact the starters overall more than the offense overall does, a sign that having a solid starting defense is a little more correlated with having a solid starting 22 than having a solid starting offense is.
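
One way to check this kind of claim numerically is to compute the Pearson correlation between the Starters Overall and each of the other two Overalls across simulations. Below is a minimal sketch in plain Python. The three lists are made-up placeholder values (one entry per simulation), not the results of the 600 simulations, so the resulting coefficients say nothing about the real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder values, one entry per simulation (hypothetical).
starters = [80.0, 82.0, 81.0, 84.0, 79.0]
defense = [79.0, 83.0, 80.0, 86.0, 77.0]
offense = [81.0, 81.0, 82.0, 82.0, 81.0]

r_defense = pearson(starters, defense)
r_offense = pearson(starters, offense)
```

With the real per-simulation Overalls in place of the placeholders, comparing `r_defense` and `r_offense` would put a number on the impression given by the plots.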

You probably noticed that the Balanced strategy (Cyan curve) always stays in the middle (not so good, but not so bad). Well, what happens when we analyze the Balanced Overall?

Distribution of Balanced Overall by Strategy. Image by the author.

Aha! Indeed, the Balanced strategy tends to produce better solutions for optimizing balanced overall – although the Starters mean is basically the same, Balanced has more simulations to the right. Also note the values on the x-axis: they are dramatically different. The best solution is close to 76, while on Defense we had almost 90 overall!

All in all, those plots show evidence that the strategies are working as intended. They are a promising, powerful insight. Now, we will finally see some of the rosters built!

Solutions (Rosters)

Of course, I won’t show all the 600 squads. Instead, I’ll show two rosters per strategy, starting with Balanced.

First, the Balanced roster with the highest fitness:

Balanced Roster with the highest fitness. Image by the author.

Indeed, it is a very well-rounded team, with no playmaker below 83 overall, and the worst starting defender is Derek Barnett with a respectable 77. The biggest flaw of the team is the Tackles, both below 70. Even on the bench there are interesting options such as Mark Ingram, Burkhead, Dallas Goedert, Kirk, Devin White and Samson Ebukam. The team was gifted with the best kicker in the world as well (remember that having a Kicker and a Punter was not a restriction). Now the solution with the highest starters overall:

Balanced Roster with the highest starters overall. Image by the author.

This is a very strong team as well, led by Tom Brady and Quenton Nelson on offense with a very good RB and three very good receivers, and with Calais Campbell, Lavonte David, Eric Kendricks, Derwin James and Chris Harris on defense. The bench is not nearly as deep, but this is still a very good team.

Now we analyze the Elite QB teams with the highest fitness and highest starters overall:

Elite QB Roster with the highest fitness. Image by the author.
Elite QB Roster with the highest starters overall. Image by the author.

These are also two very strong squads, both quarterbacked by Lamar Jackson (the second one has Kittle, Davante Adams, Tyler Lockett and AJ Green!). A defense with Cam Jordan, Nick Bosa, Roquan Smith and Devin McCourty? I really liked the results from those simulations…

Now on to Playmakers:

Playmakers Roster with the highest fitness. Image by the author.
Playmakers Roster with the highest starters overall. Image by the author.

The first squad has a very strong offensive core (Lamar, Derrick Henry, Kelce, Julio Jones, Davante Adams and AJ Green!), although the Guards and one of the Tackles are pretty bad. Outside of Dont’a Hightower and Micah Hyde the defense is uninspiring as well – hey, that’s the price you pay for having so much talent catching and running the ball! Although the second team is not as flashy, I think it is better rounded, with a very solid defense (Nick Bosa, Mack, Chandler Jones, Fred Warner, Jamal Adams and Desmond King is preeetty good!). If Cam can work it out with Saquon, Marvin Jones, Scary Terry (hey! There’s AJ Brown on the bench too!) and a decent OL, I like this team better.

What about the Defense strategy?

Defense Roster with the highest Defense overall. Image by the author.
Defense Roster with the highest starters overall. Image by the author.

The first squad shows the best defense formed across all simulations (a whopping 88 overall!). JJ Watt, Michael Pierce, Grady Jarrett, Khalil Mack, Chandler Jones, Eric Kendricks, Desmond King, Adrian Amos… This is reaally good. The offense, however, is laughable. Outside of Zack Martin, Lockett and DJ Moore, the starters are grim. Damn, Chase Daniel is the starting quarterback. That’s the price you pay for having the best defense ever: Chase Daniel! The second team is much better, still boasting a top-notch defense (JJ Watt, Nick Bosa, Melvin Ingram, Hightower, Roquan and Harrison Smith, Sherman, King… Very good!) and a much better offense, with a better OL and led by Big Ben. The best WR is rated only 72, but with Bell in the backfield and this defense, I like this team very much as well!

On to the Trenches:

Trenches Roster with the highest fitness. Image by the author.
Trenches Roster with the highest starters overall. Image by the author.

Both rosters are led by QBs below 80 (although Kyler Murray is actually way better than this) and have a few duds as playmakers (Mecole Hardman, Lee Smith, Breshad Perriman). Tevin Coleman and James Conner are not elite, but both are very capable Running Backs. Now, the Offensive Lines… Maaaan! Hudson, Brooks, DeCastro, Schwartz and Armstead. Kelce, Brooks, Saffold, Schwartz and Bakhtiari. I dare you to try to sack the QBs of those two teams. The first team also has a very good DL (Danielle Hunter, Hicks, Brockers and Jonathan Allen), but the LBs and DBs are average at best. The second team only has JJ Watt as a star name on the DL, but the rest of the defense is much better.

Finally, I will show the two best solutions of the Starters strategy, with a starters overall of 84.83 and 84.04:

Second best Starters Roster. Image by the author.

Although the team has little depth on the bench, it is a very solid squad. Wilson basically has Saquon and four average guys to throw to (including his mate David Moore), but with four guys above 90 overall on his OL, I don’t think he will complain. The defense doesn’t really have a glaring flaw, with the worst starter being Brian Burns with 79 overall. This is a very good team!

Best Starters Roster. Image by the author.

Outside of George Fant, Lamar has an absurd OL and a bevy of weapons to work with (Josh Jacobs, Mike Evans, AJ Green and DJ Moore), making this a scary offense. The defense is not as strong as the previous squad, but there are still some big names. Derwin James and Minkah Fitzpatrick at the back-end, Kenny Clark, Vita Vea and Za’Darius Smith rushing the passer and Darius Leonard and KJ Wright patrolling the middle of the field…

Is this the best squad out of the 600? Of the 12 I presented, which was your favorite? Regardless, those are very interesting results and it was a very fun experiment!

Conclusion

Thank you for reading this long article until this point. I hope the concepts of Genetic Algorithms and the project were well explained. Even though the solutions are sub-optimal, they provide a clear path and prove that excellent rosters can be built by smartly managing salaries. Also, they show the impact of the different strategies on the rosters formed.

The project can be extended to explore different parameters (like mutation rate and mutation power), run for more generations and include a larger number of Individuals in the population. More computational power and time would also allow for including Punters and Kickers in the restrictions, and for working with the positions without aggregations. Different weights and different strategies can be used, and even a grading system other than Madden 21 ratings.

I really hope this work can bring a better understanding of the complexity of team building in the NFL, and shed light on how to approach it. There are still a lot of improvements to be made, but, at least, the blueprint is already there.

