The world’s leading publication for data science, AI, and ML professionals.

How to Solve Optimization Problems with Python

How to use the PuLP library to solve Linear Programming problems with a few lines of code

Photo by Charles Deluvio on Unsplash
Photo by Charles Deluvio on Unsplash

Linear programming (or linear optimization) is the process of solving for the best outcome in mathematical problems with constraints. PuLP is a powerful library that helps Python users solve these types of problems with just a few lines of code.

I have found that PuLP is the simplest library for solving these types of linear optimization problems. The objective function and constraints can all be added in an interesting layered approach with just one line of code each. This means that we can spend less time coding and more time solving the problem. The documentation is also easily readable and includes five easy to follow case studies.

In this article, we will use daily fantasy sports (DFS) data from Fanduel to demonstrate how to solve a maximization problem with multiple constraints. The intention is that these steps will be generalizable to other problems you would like to solve.

We will be working with DFS data because it allows us to walk through the entire process from understanding a real-world problem to defining the problem in terms of an objective function and constraints, to finally coding a solution in Python. All of these steps are an important part of any linear programming problem. DFS is a simple enough context to understand these steps while still being complex enough to allow for discussion about them.

If you would like to follow along, the data is freely available by following the steps below:

  1. Create a free fantasy account at Fanduel
  2. Go to the NBA tab within the Lobby
  3. Click on any of the contests below and click on the "enter new lineup" button
  4. Finally, click on "Download Player List" at the top of the page to get the data as a csv file
Photo by JC Gallidon on Unsplash
Photo by JC Gallidon on Unsplash

Background about DFS in the NBA

Before we get into the article, we will quickly look at the way that Fanduel structures their contests for the NBA.

This background will form the foundation for how we would like to set up our constraints for the problem we are trying to solve. It is always necessary to understand the problem in linear programming before sitting down to actually write code. This helps us form our constraints and objective function when we sit down to write the code.

The goal is to build a lineup of 9 players that scores the most points possible. Players earn points by doing successful things in the game for that day like scoring points or getting a rebound and lose points for negative actions like turning the ball over. Each basketball player is given an imaginary salary (from Fandeul) for that day and you are given $60,000 to allocate toward these players. You must select 2 point guards, 2 shooting guards, 2 small forwards, 2 power forwards, and 1 center.

Photo by rupixen.com on Unsplash
Photo by rupixen.com on Unsplash

The Problem We Want to Solve

Now that we have a good understanding of the problem we are trying to solve, lets formally define it with our objective function:

  • Maximize Projected Points from our 9 Players.

and constraints we would like to add in our problem:

  1. Only buy a player a maximum of 1 times.
  2. Own 2 point guards, 2 shooting guards, 2 small forwards, 2 power forwards, and 1 center.
  3. Spend no more than $60,000.

Using PuLP to Solve this Problem

We can now get started with actually writing code to solve this problem. Since this is an article about optimization (and not one about projecting outcomes), we will use the average points scored by each player as their projected points for today. If we were building a real optimizer for Fanduel, we would want to refine our estimates to include other variables like the matchups and projected playing time for each player.

Initial Setup

To get started let’s install the package using pip in the command line:

pip install pulp

and import necessary packages in our Jupyter notebook or IDE:

from pulp import *
import pandas as pd

We will then read in our data using pd.read_csv() giving us a pandas DataFrame including Nickname (player’s name on Fanduel), FPPG (average number of points scored per game by this player), Salary, and Position variables we will call data.

Setting Up Data Structures

We now need to define our variables using dictionaries as these are the data structures that PuLP uses:

# Get a list of players
players = list(data['Nickname'])
# Initialize Dictionaries for Salaries and Positions
salaries = dict(zip(players, data['Salary']))
positions = dict(zip(players, data['Position']))
# Dictionary for Projected Score for each player
project_points = dict(zip(players, data['FPPG']))
# Set Players to Take either 1 or 0 values (owned or not)
player_vars = LpVariable.dicts("Player", players, lowBound=0, upBound=1, cat='Integer')

All but the last lines set up dictionaries pointing player names stored in Nickname to other variables we are interested in.

The last line uses LpVariables which defines variables associated with the second argument (in this case players) numeric values. The other parameters define what values player_vars can take on. The parameter cat can be set to 'Integer' or 'Continuous'. We are left with a dictionary pointing names of players to integers (which we will use to indicate if we own the player or not with values of 1 or 0 respectively).

Defining The Problem

Next, we need to setup our problem using LpProblem() :

total_score = LpProblem("Fantasy_Points_Problem", LpMaximize)

The first argument is the name of the problem and the second argument is a parameter called sense which can either be set to LpMinimize or LpMaximize. We use LpMaximize since we are trying to maximize our projected points.

After we have defined the problem, we add our objective function using lpsum():

total_score += lpSum([project_points[i] * player_vars[i] for i in player_vars])

our salary constraint:

total_score += lpSum([salaries[i] * player_vars[i] for i in player_vars]) <= 60000

and our position constraints:

# Get indices of players for each position
pg = [p for p in positions.keys() if positions[p] == 'PG']
sg = [p for p in positions.keys() if positions[p] == 'SG']
sf = [p for p in positions.keys() if positions[p] == 'SF']
pf = [p for p in positions.keys() if positions[p] == 'PF']
c = [p for p in positions.keys() if positions[p] == 'C']
# Set Constraints
total_score += lpSum([player_vars[i] for i in pg]) == 2
total_score += lpSum([player_vars[i] for i in sg]) == 2
total_score += lpSum([player_vars[i] for i in sf]) == 2
total_score += lpSum([player_vars[i] for i in pf]) == 2
total_score += lpSum([player_vars[i] for i in c]) == 1

Solving the Problem

Once we have defined the problem, we can solve the problem with one line of code!

total_score.solve()

Once we have done this, our optimized variables are stored in a list by calling total_score.variables(), our values for each player are stored in the variable varValue, and the names of our values are stored in the name variable of each of the variables. Thus, we can print our lineup by finding the players with non-zero values as seen below:

for v in total_score.variables():
    if v.varValue > 0:
        print(v.name)
Photo by Wadi Lissa on Unsplash
Photo by Wadi Lissa on Unsplash

Conclusion

We are now able to solve complex linear programming problems with PuLP in Python! Once we understand the problem we are trying to solve, we can solve it in just a few lines of code using this library.

Linear optimization is an important component of many fields such as Operations, logistics, capital allocation, etc. The best way to learn a skill like this is to work through a problem on your own. You can use the same steps that we walked through above:

  1. Understand the problem
  2. Define the problem in terms of an objective function and constraints
  3. Solve the problem using PuLP

I encourage you to apply these steps to a problem that you find interesting and I’m excited to hear about what projects you work on in the comments below!

Thank you for taking the time to read this article and good luck on your next linear Programming problem.


Related Articles