
Linear Programming: Auxiliary Variables

Part 5: Increasing LP flexibility to handle tricky logic

image from pexels.com by Pixabay

Auxiliary variables are a topic that often gets rushed over in linear programming material. I think they are fascinating and quite powerful, so I decided to dedicate a short article to explaining and demonstrating how they are commonly used in linear programming.

Before we dive into auxiliary variables – I wanted to mention that this is the fifth part in a series I’m writing on linear programming (LP). To check out the other LP topics I’ve covered, use the link below:

Linear Programming

First of all, let’s address the question – what is an auxiliary variable? My definition of an auxiliary variable (in the context of LP) is ‘an additional variable added to a linear programming problem that allows the use of logic that otherwise would not be possible.’

Auxiliary Variable: An additional variable added to a linear programming problem that allows the use of logic that otherwise would not be possible.

Generally, auxiliary variables are used in clever ways to formulate non-linear relationships (typically a violation of the rules of linear programming) in the objective function and constraints in a way that still works for linear programming. In other words, they allow for the linearization of functions that would be non-linear without them. Since there are multiple ways to use auxiliary variables, this article will go through the most common applications I’ve seen, with a detailed example for each. My goal is for you to build a good intuition for what auxiliary variables do and how you can use them in linear programming. Note that this is not meant to be an exhaustive catalog of how auxiliary variables are used; it demonstrates some of the most common ways they are used.

Creating Piecewise Linear Functions

The jumps that happen in piecewise linear equations are a ‘no-no’ in linear programming. Thankfully, we can accomplish the equivalent of piecewise functions using auxiliary variables! Let’s get into an example.

Let’s imagine that you manage a manufacturing facility. You need to produce a minimum number of products while minimizing your company’s wage expenses. Each employee has an hourly wage and an hourly production rate. Your goal is to create a work schedule that minimizes wage expenses while meeting a specific number of units produced. Easy enough? There’s a catch! Any employee who works over 20 hours a week becomes eligible for benefits, which cost an extra $500 per employee. This represents a jump in the cost function with the same slope (the hourly wage).
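To make the jump concrete: if an employee’s hourly wage is w and they work h hours, their weekly cost is

cost(h) = w·h            if h ≤ 20
cost(h) = w·h + 500      if h > 20

The slope (the wage) is the same on both pieces; only the intercept jumps by $500 once the 20-hour threshold is crossed.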

We will first explore how to perform an intercept jump. Let’s set up the simple problem without the intercept jump; then we can add the complexity of the extra benefits cost at 20 hours of work. In our simple example, we have three employees; each has a different hourly pay and a different level of productivity (shown in the table below). Our goal is to minimize labor costs while producing at least 600 units a week.

image by author

Before the additional complexity of the benefits expense, our objective function and constraint look like this:

LP set up before intercept jump – 1 objective function, 1 constraint
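Written out with the pay and productivity numbers from the code further down (pay of $15, $18, and $22 per hour; productivity of 7, 8, and 10 units per hour), the starting formulation is roughly:

minimize    15·empl1 + 18·empl2 + 22·empl3
subject to  7·empl1 + 8·empl2 + 10·empl3 ≥ 600
            empl1, empl2, empl3 ≥ 0

where empl1, empl2, and empl3 are the weekly hours scheduled for each employee.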

Alright, with the LP 101 problem set up, let’s get into adding that intercept jump at 20 hours. We are going to use something called the ‘Big-M Method’ to do this. In this method, we create one or more binary auxiliary variables, along with additional constraints that tie each auxiliary variable to the original decision variables. The ‘M’ part of ‘Big-M’ comes into play in the constraint configuration.

Below is an example of the new constraint we will add – y is the binary auxiliary variable that we added to the problem and M is an arbitrarily large number.

big-m constraint
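In symbols, the constraint for employee 1 is

empl1 ≤ 20 + M·y1,    y1 ∈ {0, 1}

and we add one of these constraints for each employee.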

Let’s break the constraint down since this is the main concept of performing the jump. On the left, we have the decision variable for the hours we will schedule employee 1. On the right, we have the number of hours after which benefits are required (20), plus M, an arbitrarily large number (hence Big-M), multiplied by y1, our binary auxiliary variable. In our example, I’ll set M to be 1,000. Understanding this constraint is key! Let’s build our intuition by running different values through the inequality for employee 1’s hours. See the table and explanations below:

explanation of the big-m method – image by author

In addition to the new constraint, we also have to modify our objective function. This is pretty simple to do: we add the product of our new auxiliary variables (remember, they are binary) and the intercept jump, in this case $500, to the objective function. When the auxiliary variable is 0, no additional cost is added. When an employee works more than 20 hours, the auxiliary variable is forced to 1 and $500 is added to the objective value for that employee.

Updated objective function for intercept jumps
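Using the same pay figures, the updated objective becomes roughly:

minimize  15·empl1 + 500·y1 + 18·empl2 + 500·y2 + 22·empl3 + 500·y3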

With the problem fully formulated, let’s set it up and solve it in Python using the pulp package!

import pulp

# Define the problem
problem = pulp.LpProblem("Staff_Management", pulp.LpMinimize)

# Define the decision variables - weekly hours for each employee
# (capped at a 40-hour week for this example; overtime comes later)
empl1 = pulp.LpVariable("empl1", lowBound=0, upBound=40)
empl2 = pulp.LpVariable("empl2", lowBound=0, upBound=40)
empl3 = pulp.LpVariable("empl3", lowBound=0, upBound=40)

# establish employee pay and productivity
empl1_pay = 15
empl2_pay = 18
empl3_pay = 22

empl1_prod = 7
empl2_prod = 8
empl3_prod = 10

# add binary auxiliary variables (1 if the employee crosses 20 hours, else 0)
y1 = pulp.LpVariable("y1", cat="Binary")
y2 = pulp.LpVariable("y2", cat="Binary")
y3 = pulp.LpVariable("y3", cat="Binary")

# Objective function
problem += (empl1_pay*empl1 + 500*y1 +
            empl2_pay*empl2 + 500*y2 +
            empl3_pay*empl3 + 500*y3), "Salary Cost"

# Constraints
# M is an arbitrarily large constant for the Big-M constraints
M = 1000

# big M method
problem += empl1 <= 20 + M*y1
problem += empl2 <= 20 + M*y2
problem += empl3 <= 20 + M*y3

# constraint on minimum units produced
problem += (empl1_prod * empl1 + 
            empl2_prod * empl2 + 
            empl3_prod * empl3 >= 600)

# Solve the problem
status = problem.solve()

# Output the results
print(f"Status: {pulp.LpStatus[status]}")
print(f"Optimal value of empl1: {empl1.varValue}")
print(f"Optimal value of empl2: {empl2.varValue}")
print(f"Optimal value of empl3: {empl3.varValue}")
print(f"Minimized salary cost: {pulp.value(problem.objective)}")

And here are the results of the run: it looks like we can make 600 units while incurring $1,810 in labor expenses. Employee 1 is the only employee to receive benefits; the others are kept at or below the 20-hour threshold.

optimization output – image by author

Alright, now that we’ve tackled the issue of the additional benefits expense, let’s make things even more complicated by adding in overtime pay (1.5x the regular rate)! This is actually going to be pretty simple 😁 . To do this, we need to split each employee’s hours into two separate variables (regular hours and overtime hours). So now, instead of three employee hour variables, we have six: one regular-hour variable and one overtime variable for each of the three employees. We bound the regular hours at 40 (in pulp you can do this with the built-in upBound parameter). We then modify the objective function to reflect the change; I’ll just copy/paste an excerpt from the code below:

new objective function with regular time and overtime split into two auxiliary variables
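Here is that excerpt, taken from the full listing further down (regular hours are paid at the normal rate, overtime hours at 1.5x, and the 500·y terms handle the benefits jump):

# Objective function
problem += (empl1_pay*empl1_reg + 500*y1 + empl1_ot*1.5*empl1_pay +
            empl2_pay*empl2_reg + 500*y2 + empl2_ot*1.5*empl2_pay +
            empl3_pay*empl3_reg + 500*y3 + empl3_ot*1.5*empl3_pay
            , "Salary Cost")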

Note that since overtime costs more than regular time, the solver will always max out the regular time before it starts adding overtime, so we don’t need a constraint to force this behavior. I also increased the production requirement to 1,000 so that our example would require some overtime.

Here is the code to solve the full problem with the benefits and overtime pieces:

import pulp

# Define the problem
problem = pulp.LpProblem("Staff_Management", pulp.LpMinimize)

# establish employee pay and productivity
empl1_pay = 15
empl2_pay = 18
empl3_pay = 22

empl1_prod = 7
empl2_prod = 8
empl3_prod = 10

# create auxiliary variables to capture piecewise OT function
empl1_reg = pulp.LpVariable("empl1_reg", lowBound=0, upBound=40)
empl2_reg = pulp.LpVariable("empl2_reg", lowBound=0, upBound=40)
empl3_reg = pulp.LpVariable("empl3_reg", lowBound=0, upBound=40)

empl1_ot = pulp.LpVariable("empl1_ot", lowBound=0, upBound=20)
empl2_ot = pulp.LpVariable("empl2_ot", lowBound=0, upBound=20)
empl3_ot = pulp.LpVariable("empl3_ot", lowBound=0, upBound=20)

# add binary auxiliary variables (1 if the employee crosses 20 hours, else 0)
y1 = pulp.LpVariable("y1", cat="Binary")
y2 = pulp.LpVariable("y2", cat="Binary")
y3 = pulp.LpVariable("y3", cat="Binary")

# Objective function
problem += (empl1_pay*empl1_reg + 500*y1 + empl1_ot*1.5*empl1_pay +
            empl2_pay*empl2_reg + 500*y2 + empl2_ot*1.5*empl2_pay + 
            empl3_pay*empl3_reg + 500*y3 + empl3_ot*1.5*empl3_pay 
            , "Salary Cost")

# Constraints
# M is an arbitrarily large constant for the Big-M constraints
M = 1000

# big M method
problem += (empl1_reg + empl1_ot) <= 20 + M*y1
problem += (empl2_reg + empl2_ot) <= 20 + M*y2
problem += (empl3_reg + empl3_ot) <= 20 + M*y3

# constraint on minimum items produced
problem += (empl1_prod * (empl1_reg + empl1_ot) +
            empl2_prod * (empl2_reg + empl2_ot) +
            empl3_prod * (empl3_reg + empl3_ot) >= 1000)

# Solve the problem
status = problem.solve()

# Output the results
print(f"Status: {pulp.LpStatus[status]}")
print(f"Optimal value of empl1 reg: {empl1_reg.varValue}")
print(f"Optimal value of empl1 ot: {empl1_ot.varValue}")
print(f"Optimal value of empl2 reg: {empl2_reg.varValue}")
print(f"Optimal value of empl2 ot: {empl2_ot.varValue}")
print(f"Optimal value of empl3 reg: {empl3_reg.varValue}")
print(f"Optimal value of empl3 ot: {empl3_ot.varValue}")
print(f"Minimized salary cost: {pulp.value(problem.objective)}")

And here is the optimal output:

Optimization output – image by author

The optimal strategy is to max out employee 1’s work to 60 hours, use employee 2 up to the benefits limit and have employee 3 work a full week plus 2 hours. I hope this example has illustrated how we can linearize intercept jumps and changes in slopes so that they can be incorporated into LP problems.

Conditional Relationships Between Variables

Sometimes we need to model conditional relationships between variables to solve optimization problems. Here, I’ll dive into an example that I created for an article I wrote on simulation a little while ago (linked below). That article was about simulating data and covered LP only tangentially; it didn’t discuss the auxiliary variables at all, so I’d like to take the opportunity to cover them here.

Simulated Data, Real Learnings: Scenario Analysis

Here’s how the problem is set up – imagine you are a developer planning a housing subdivision. You have multiple floorplans available to you, and you want to optimize your profit subject to specific constraints. Below is a table of the different floorplans with their key metrics:

Details for each floorplan – image by author

Our main decision variables are integer values for each row in the table shown above. The integer values represent the number of homes we will build of each floorplan. This is represented by the list ‘x’ in the code at the end of this section.

Here are the three constraints for the problem:

  • We must have at least 6 different floorplans in our neighborhood
  • We need a mix of home sizes: at least 15 homes under 2,000 sqft, at least 15 homes between 2,000 and 3,000 sqft, and at least 10 homes over 3,000 sqft
  • Lastly, we have 150,000 square feet of plots – each house needs 25% more square footage than its floorplan – we can’t build more houses than we have land! (Constraints 2 and 3 are written out right after this list.)
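Using the size-indicator columns and square footage from the code below, constraints 2 and 3 can be written roughly as:

Σ smallᵢ·xᵢ ≥ 15,    Σ mediumᵢ·xᵢ ≥ 15,    Σ largeᵢ·xᵢ ≥ 10
Σ 1.25·sqftᵢ·xᵢ ≤ 150,000

where smallᵢ, mediumᵢ, and largeᵢ are 0/1 indicators for floorplan i’s size category and sqftᵢ is its square footage.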

The first constraint is going to need auxiliary variables; the other two can be handled directly with the input data and the primary decision variables (the ‘x’s), as written out above.

Before we get into setting up this constraint, let’s build an understanding of how the main decision variables work. Each row in the table above will have its own ‘x’ decision variable. So we will have 15 ‘x’ variables – each will represent the number of houses that will be built for the corresponding floorplan. For example, x1 is the number of floorplan 1 houses that will be built; if x1 = 10, then 10 houses of floorplan 1 will be built in the subdivision.

Okay, now onto the meat of the conversation – the auxiliary variables! We need a way to ensure that at least 6 of the ‘x’ variables are greater than 0. We do this by creating a binary auxiliary variable for each floorplan, so we now have 15 binary ‘y’ variables. The next thing we need to do is tie the ‘y’s to the ‘x’s so that a ‘y’ can only be 1 when its corresponding ‘x’ is greater than 0 (and must be 0 when the ‘x’ is 0). We do this by creating a constraint that x ≥ y – let’s build intuition on why this constraint meets our needs with the table below:

explaining the constraints that tie the auxiliary variables (y’s) to the decision variables (x’s) – image by author

With the auxiliary variables set up, and new constraints introduced to tie the x’s to the y’s, we need to add one final constraint: the y variables must sum to at least 6.
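Putting the new pieces together, the additions to the formulation are roughly:

yᵢ ∈ {0, 1}      for every floorplan i
xᵢ ≥ yᵢ          for every floorplan i
Σ yᵢ ≥ 6

With these in place, we are ready to set up and run our optimization!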

The code below shows how to set up and solve this problem in Python:

import pandas as pd
import numpy as np
from pulp import *

# csv_path should point to the floorplan table shown above
df = pd.read_csv(csv_path)
n = len(df)

# create dummy indicators to categorize home size
df['small_house'] = np.where(df['square_feet'] < 2000, 1, 0)
df['medium_house'] = np.where((df['square_feet'] >= 2000) &
                              (df['square_feet'] < 3000), 
                               1, 0)
df['large_house'] = np.where(df['square_feet'] >= 3000, 1, 0)

# Create a MILP problem
problem = LpProblem("Simple_MILP_Problem", LpMaximize)

# Define decision variables
x = LpVariable.dicts("x", range(n), lowBound=0, cat='Integer')
y = LpVariable.dicts('y', range(n), cat='Binary')

# Objective 
problem += lpSum(x[i]*df['gain_loss'][i] for i in range(n))

# constraints

# limit to amount of land available
# assume each floorplan takes 25% more land than its square footage
problem += lpSum(x[i]*df['square_feet'][i]*1.25 for i in range(n)) <= 150000

# requirements for diversity in home sizes
problem += lpSum(x[i]*df['small_house'][i] for i in range(n)) >= 15
problem += lpSum(x[i]*df['medium_house'][i] for i in range(n)) >= 15
problem += lpSum(x[i]*df['large_house'][i] for i in range(n)) >= 10

# Create at least 6 unique floorplans
for i in range(n):
    # if x is 0, y has to be 0
    problem += x[i] >= y[i]

# if x >= 1, y could be 0 or 1,
# but because we want sum(y) to be >= 6, the optimization
# will assign y to be 1 where needed
problem += lpSum(y[i] for i in range(n)) >= 6

# solve problem
problem.solve()

# print solution
for i in range(n):
    print(f'{i + 1} : {value(x[i])}')

# print optimal profit
print("Optimal profit :", value(problem.objective))

And here is the optimal solution:

Optimal floorplan strategy – image by author

We can see here that our constraint worked! Six floorplans are selected in the optimal solution.

‘Or’ Logic

We often need ‘or’ logic in linear programming. Raw ‘or’ logic is not linear, so we have to use auxiliary variables to linearize it. To do this, we add one auxiliary variable for each condition in the ‘or’ statement and then one more auxiliary variable to tie them together.

Imagine you manage a microchip manufacturing plant. For national security reasons, the government offers a $2,000 grant for specific levels of chip production. To be grant-eligible, you need to produce at least 45 units of chip A or 15 units of chip B a week. You can only get one grant per week, so exceeding both thresholds won’t get you double grants. This is an example of ‘or’ logic because you get the grant if you meet the requirement for A or B.

To formulate this problem, we need to set up one binary auxiliary variable for each of the chip types. We will create a ‘grant_a’ variable and a ‘grant_b’ variable. We will then add the constraint that chip_a ≥ 45 * grant_a, where chip_a is the total number of chip A units produced. We add the same kind of constraint for chip B with the corresponding number of chips required for a grant.

constraints with ‘grant a’ and ‘grant b’ auxiliary variables – image by author

Lastly, we need a way to tie grant_a and grant_b together with ‘or’ logic. To do this, we will create one more binary auxiliary variable – ‘grant’ – and one more constraint.

constraint to tie the grant a and b auxiliary variables to the grant auxiliary variable – image by author
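Putting the grant logic together, the constraints (using the production thresholds from the code below) are roughly:

chip_a ≥ 45·grant_a
chip_b ≥ 15·grant_b
grant ≤ grant_a + grant_b
grant, grant_a, grant_b ∈ {0, 1}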

This will force grant to be 0 if both grant_a and grant_b are 0. But, if grant_a and/or grant_b are 1, grant can take the value of 1 as well. The optimization will always set grant to 1 when it has the option (again, that is when grant_a and/or grant_b are 1), because setting grant to 1 will increase the objective function by $2,000!

Below is the example in Python code. Note that I added marginal profit to the objective function ($20 for chip A and $30 for chip B) and a raw material usage constraint so that the problem is bounded.

from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum, LpStatus

# Define the problem
problem = LpProblem("chip_manufacturing", LpMaximize)

# Decision variables
chip_a = LpVariable("chip_a", lowBound=0, cat="Integer")
chip_b = LpVariable("chip_b", lowBound=0, cat="Integer")

# set up three binary auxiliary variables
# one for whether the factory qualifies for a grant through chip a production
# one for whether the factory qualifies for a grant through chip b production
# and one more to indicate if at least one of the production levels qualifies for the grant
grant = LpVariable("grant", cat="Binary")
grant_a = LpVariable("grant_a", cat="Binary")
grant_b = LpVariable("grant_b", cat="Binary")

# Objective function
profit = 20 * chip_a + 30 * chip_b + 2000 * grant
problem += profit, "total profit"

# Constraints
# Material usage and availability
problem += 5 * chip_a + 12 * chip_b <= 200, "raw_material_constraint"

# Grant eligibility conditions
# If at least 45 units of chip_a are made, the grant can be awarded
problem += chip_a >= 45 * grant_a, "grant_chip_A_constraint"

# If at least 15 units of chip_b are made, the grant can be awarded
problem += chip_b >= 15 * grant_b, "grant_chip_b_constraint"

# if grant_a and grant_b are 0, force grant to be 0
problem += grant_a + grant_b >= grant, "grant_or_condition"

# Solve the problem
problem.solve()

# Output the results
print(f"Status: {problem.status}")
print(f"Profit: ${problem.objective.value():,.2f}")
print(f"Chip A: {chip_a.value():.0f} units")
print(f"Chip B: {chip_b.value():.0f} units")
print(f"Grant Awarded: {'Yes' if grant.value() else 'No'}")

The optimized solution is below:

Results of the optimization – image by author

Here, we can see that even though chip A generates more profit per unit of raw material, we produce 15 units of chip B to secure the grant. Once the grant is secured, we use the rest of the material to produce chip A. The ‘or’ logic checks out! We now understand how to use auxiliary variables to solve LP problems with ‘or’ logic.

Conclusion

I hope you now have a fuller understanding of how auxiliary variables can greatly increase linear programming’s flexibility. This article was meant as an introduction to auxiliary variables, with some examples of their use; there are other applications you can explore further. They are a little tricky to understand at first, but once you get the hang of them, a new world of LP optimization possibilities opens up!

