
Taking Your Optimization Skills to the Next Level

Soft constraints, linearization, multi-objectives and more

Hennie de Harder
Towards Data Science
7 min readMar 27, 2022


If you are an optimization beginner, I recommend starting with the why and the how before returning to this post. Here I cover common practices for problems that are a bit more complicated than basic toy examples.

Reading the solver progress

By adding tee=True to opt.solve(model), the solver prints its progress while optimizing. You can also write the log to a file by specifying logfile='fn.log'. The log gives information about the run and is valuable for setting limits.
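In Pyomo that looks like the snippet below. The tiny model is a hypothetical placeholder, just to have something to solve, and a working Gurobi installation is assumed:

```python
import pyomo.environ as pyo

# a minimal placeholder model: maximize x on [0, 10]
model = pyo.ConcreteModel()
model.x = pyo.Var(bounds=(0, 10))
model.obj = pyo.Objective(expr=model.x, sense=pyo.maximize)

opt = pyo.SolverFactory("gurobi")
# tee=True streams the solver log to stdout;
# logfile writes the same log to a file for later analysis
results = opt.solve(model, tee=True, logfile="run.log")
```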

Gurobi log. Image by author.

Above you can see the output of a Gurobi run on a maximization problem. The most important things to understand:

  • Each row starts with an H or a blank. An H marks that the current best solution (Incumbent) improved compared to the previous best solution.
  • Incumbent is the current best solution: the highest objective value found so far (or the lowest, for a minimization problem).
  • BestBd is a bound on the best value the objective can ever reach. The solver doesn’t know whether this value is attainable, but the objective will definitely not exceed it.
  • Gap is the percentage difference between the Incumbent and the BestBd. It shows how much room for improvement remains. This is an important value and will be discussed in the next section.
  • Time shows the total elapsed time up to this row.

Output logs from other solvers look different, but most show at least the current best solution, the best bound, the elapsed time, the iteration count, and the number of nodes.
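For reference, the relative gap can be computed from those two columns. As far as I know, Gurobi uses |BestBd - Incumbent| / |Incumbent|, but treat the exact formula as an assumption and check your solver's documentation:

```python
def relative_gap(incumbent: float, best_bound: float) -> float:
    """Relative MIP gap between the incumbent and the best bound
    (assumed Gurobi-style definition)."""
    return abs(best_bound - incumbent) / abs(incumbent)

# e.g. incumbent 100, best bound 105 -> 5% gap
print(relative_gap(100, 105))  # 0.05
```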

MIP gap and time limit

If you are working on a large optimization problem with many variables and constraints, it can take hours or even days for the solver to find the optimal solution. Most companies want to make decisions fast, so you need to set limits for the solver. This is where the MIP gap and the time limit come into play. Let’s take a look at the log again:

Gurobi log. Image by author.

After 311 seconds, just over 5 minutes, the solver has made its biggest improvements. After that, progress flattens, as you can see in the plot below:

Gap versus minutes: after 5 minutes the improvements become smaller and smaller.

By talking to stakeholders and analyzing logs you can determine a good time limit for your specific problem. There is a trade-off between time and performance: a longer solving time generally means a better solution. For the process shown here you might set the time limit to 600 seconds (10 minutes). If you use Gurobi with Pyomo, add opt.options['TimeLimit'] = 600 to your code. The solver will stop after 600 seconds and keep the best solution found so far.

You can also set the MIP gap. Remember the gap from the previous section? It measures the remaining room for improvement (the gap between the Incumbent and the BestBd). If you want the solver to stop once the gap drops below 1 percent, add opt.options['MIPGap'] = 0.01 to your code.

It is a best practice to set both the MIP gap and the time limit. As soon as one of the conditions is met, the solver stops and returns the current best solution.
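Combined, the two limits look like this in Pyomo with Gurobi. The model below is a hypothetical placeholder; in practice you would attach the options to your own model:

```python
import pyomo.environ as pyo

# placeholder model standing in for your real problem
model = pyo.ConcreteModel()
model.x = pyo.Var(bounds=(0, 10))
model.obj = pyo.Objective(expr=model.x, sense=pyo.maximize)

opt = pyo.SolverFactory("gurobi")
opt.options["TimeLimit"] = 600  # stop after 600 seconds...
opt.options["MIPGap"] = 0.01    # ...or once the gap drops below 1%
results = opt.solve(model, tee=True)
```

Whichever limit is hit first ends the run, and the incumbent at that moment is returned.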

Multi-objectives

Sometimes your problem can’t be captured in a single objective function. In that case you are dealing with multiple objectives. The first step: try to reformulate the problem with a single objective, because that makes things a lot easier. If your objectives have the same nature, like money, you can add them or subtract one from the other and end up with a single objective. Alternatively, you can replace one objective with constraints. If these methods don’t work, you have a true multi-objective problem.

A multi-objective problem has multiple best solutions, because you are minimizing (or maximizing) several things that can’t easily be compared. Say you are minimizing two objectives: every combination with low objective values is a possible best solution (blue dots):

Example of an objective space with two objectives and the Pareto Front. Image by author.

Every solution above and to the right of the blue dots is worse. Together, these blue dots form the Pareto front.

If you have to deal with multi-objectives, take a look at the pymoo framework.
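As a minimal sketch of what that looks like, NSGA-II in pymoo can approximate a Pareto front. The ZDT1 benchmark problem and the parameter values here are my own illustrative choices, not from this article:

```python
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize
from pymoo.problems import get_problem

# ZDT1 is a standard bi-objective benchmark shipped with pymoo
problem = get_problem("zdt1")
algorithm = NSGA2(pop_size=50)

res = minimize(problem, algorithm, ("n_gen", 100), seed=1, verbose=False)
# res.F holds the objective values of the non-dominated solutions,
# i.e. an approximation of the Pareto front
```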

Absolute values

Sometimes the goal of an optimization problem is to get as close as possible to a certain point. It doesn’t matter whether the distance is negative or positive; it should be as small as possible. In this kind of scenario, the easiest approach would be to minimize the absolute value of the distance. But optimization frameworks don’t like the absolute value function: it is hard to apply standard optimization procedures to it, because it is not linear and not continuously differentiable.

How should we handle absolute values? It’s pretty easy, actually. We add two dummy variables and define them as the positive and the negative distance. If we want to get as close to 30 as possible, the solution looks like this:

# imports and model setup (needed to make the snippet runnable)
import pyomo.environ as pyo
model = pyo.ConcreteModel()
# set variables
model.x = pyo.Var()
model.posdelta = pyo.Var(bounds=(0, None))
model.negdelta = pyo.Var(bounds=(0, None))
# constraint defining posdelta and negdelta
model.define_deltas = pyo.Constraint(expr=model.x + model.posdelta - model.negdelta == 30)
# minimize the delta values in the objective
model.obj = pyo.Objective(expr=model.posdelta + model.negdelta, sense=pyo.minimize)

So if model.x is equal to 35, model.negdelta will be 5 and model.posdelta will be 0. On the other hand, if model.x is equal to 25, model.posdelta is equal to 5 and model.negdelta will be 0. By summing model.posdelta and model.negdelta we get the absolute distance between x and 30.

Careful! If you don’t use the delta variables in the objective, strange behavior can occur: the solver may make both variables greater than zero. Because 25+10-5=30, model.posdelta could be 10 and model.negdelta 5. So make sure you add both deltas to the objective, or add constraints ensuring that only one of them can be greater than zero.

The third example in this article uses this technique.
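For intuition, the optimum of this little model has a closed form: the solver sets one delta to zero and the other to the distance. A quick plain-Python sketch (split_abs is my own hypothetical name):

```python
def split_abs(x: float, target: float = 30.0) -> tuple:
    """Optimal (posdelta, negdelta) when minimizing their sum
    subject to x + posdelta - negdelta == target, both >= 0."""
    posdelta = max(target - x, 0.0)
    negdelta = max(x - target, 0.0)
    return posdelta, negdelta

print(split_abs(35))  # (0.0, 5.0) -> absolute distance 5
print(split_abs(25))  # (5.0, 0.0) -> absolute distance 5
```

Summing the two deltas always gives |x - target|, which is exactly what the objective minimizes.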

Linearization

Imagine you are working on an optimization problem with two variables, x1 and x2. If x1 is greater than zero, x2 should equal zero, and if x2 is greater than zero, x1 should equal zero. Put more simply: at most one of the two may be greater than zero. How can we specify constraints that enforce this? Easy, you might think: if we require x1*x2 = 0, at least one of the two must be zero. But there is a problem. Multiplying two variables makes the problem nonlinear, which means dealing with nasty issues like local versus global optima.

Is there another solution?

Yes, there is! We add two binary dummy variables. If x1_dummy equals zero, x1 must equal zero; if x1_dummy equals one, x1 may be greater than zero (the same holds for x2_dummy and x2). Let M be the highest possible value of x1 and x2 (the big M). These are the constraints we need to add:

# only one of the two dummies can be equal to 1
x1_dummy + x2_dummy <= 1
# x always smaller than M*dummy, if dummy is 0, variable is 0
x1 <= M*x1_dummy
x2 <= M*x2_dummy

Now the problem is still linear, and only one of x1 and x2 can be greater than zero.
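You can convince yourself with a small plain-Python checker. The function name and the value of M are my own illustrative choices, and x1 and x2 are assumed to be nonnegative:

```python
def big_m_feasible(x1, x2, x1_dummy, x2_dummy, M=1000):
    """True if the assignment satisfies the linearized constraints."""
    return (
        x1_dummy in (0, 1) and x2_dummy in (0, 1)  # dummies are binary
        and x1_dummy + x2_dummy <= 1               # at most one dummy active
        and 0 <= x1 <= M * x1_dummy                # x1 forced to 0 when its dummy is 0
        and 0 <= x2 <= M * x2_dummy                # same for x2
    )

print(big_m_feasible(5, 0, 1, 0))  # True: only x1 is positive
print(big_m_feasible(5, 3, 1, 1))  # False: both dummies active
print(big_m_feasible(5, 3, 1, 0))  # False: x2 > 0 while its dummy is 0
```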

This is just one example of linearization.

Soft constraints

Did you ever try to optimize a problem that had no feasible solution? This happens when your bounds or constraints are so tight that no solution can meet every requirement. It can be frustrating, especially when a process is waiting for the solver’s answer.

Model is infeasible. Image by author.

How do you take care of this? Soft constraints are one possible solution. You can stretch the boundary of a constraint by adding a penalty variable x_pen to it.

Imagine you want x1+x2 >= 10. If you change this constraint to x1+x2 >= 10-x_pen, the solver has the flexibility to choose x_pen greater than 0. This means x1+x2 may become smaller than 10 (so don’t soften a constraint that must hold strictly). By adding x_pen to the objective function, you make the solver keep x_pen as low as possible.
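In Pyomo, a soft version of that constraint could be sketched like this. The objective and the penalty weight of 100 are illustrative choices, not from the original problem:

```python
import pyomo.environ as pyo

model = pyo.ConcreteModel()
model.x1 = pyo.Var(bounds=(0, None))
model.x2 = pyo.Var(bounds=(0, None))
model.x_pen = pyo.Var(bounds=(0, None))  # penalty (slack) variable

# soft version of x1 + x2 >= 10: x_pen absorbs any shortfall
model.soft = pyo.Constraint(expr=model.x1 + model.x2 >= 10 - model.x_pen)

# hypothetical objective: the weight of 100 decides how expensive
# violating the soft constraint is, and is a tuning choice
model.obj = pyo.Objective(
    expr=model.x1 + model.x2 + 100 * model.x_pen,
    sense=pyo.minimize,
)
```

With a high enough weight, the solver only uses x_pen when the constraint truly cannot be met.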

Conclusion

I hope this article helped you become a better optimizer. It is only the tip of the iceberg, but these issues are common, and handling them will help you understand optimization better and solve real-life cases.

See you next time!

