
Building an iterative solver for linear optimization under constraints using geometry

Understand how linear optimization works by building your own iterative solver, using geometry.

Photo by Diggity Marketing on Unsplash

A powerful tool for the data scientist

Optimization is a common tool in Data Science, and one of the most frequently used approaches is the linear one.

Even though not every optimization problem can be formulated in a linear way, in many cases, it’s possible to rewrite it using a linear formulation.

In this post, we are going to write, from scratch, a solver for linear optimization under constraints, using an iterative method.


The ingredients

There are three main things to consider when formalizing an optimization under constraints:

  • A set of variables: those are the quantities that we want to determine during the optimization.
  • An objective: it’s a formula that combines the variables and expresses some value that we want to either maximize or minimize. As we have restricted ourselves to the case of linear optimization, this objective must be a linear combination of the variables.
  • A set of constraints: These constraints will restrict the possible values for the variables. Once again, these constraints must be expressed using linear formulas.

As an example, the variables can be the height H, the width W, and the length L of a rectangular cuboid. The objective can be the sum of the dimensions of this solid: W + H + L, and we can add some constraints on the variables, like W ≤ 3 and L ≤ 4.

Mathematically speaking, this can be written as:

maximize S = W + H + L
subject to:
  W ≤ 3
  L ≤ 4
  L + H ≤ 8

Solving this problem is pretty easy, and it can even be done manually. The solution consists in maximizing W and L, and then deducing H, i.e. W = 3, L = 4 and H = 4, hence the sum of the cuboid dimensions is S = 3 + 4 + 4 = 11.

Visualizing the problem

Before going into the mathematical details, let’s try to graphically visualize our problem so that we get some insight.

As we are facing a 3-dimensional problem, i.e. we have three variables: W, L, and H, we can use 3D geometry.

The first two constraints are easy to visualize. The first one restricts the solution space to the half-space below the plane defined by W = 3. The second one restricts the solution space to the half-space below the plane defined by L = 4.

The third one is a bit more complex to handle, but if we rewrite it using the following mathematical formula: <(W, L, H), (0, 1, 1)> <= 8, where the operator <., .> is the dot product, it appears that this constraint projects the dimensions of the cuboid onto the vector (0, 1, 1), and that the sum of the components of this projection, L + H, must not exceed 8.

Another way to interpret this constraint is to consider the vector (0, 1, 1) as the (unnormalized) normal vector of a plane. Hence, once again, this third constraint can be interpreted geometrically as a plane.
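As a quick sanity check, here is a small NumPy sketch showing that this dot-product form is just another way of writing L + H <= 8 (using the optimum found above):

```python
import numpy as np

dimensions = np.array([3.0, 4.0, 4.0])   # (W, L, H) at the optimum
normal = np.array([0.0, 1.0, 1.0])       # normal vector of the third constraint

# <(W, L, H), (0, 1, 1)> = L + H, which must not exceed 8
print(np.dot(dimensions, normal))        # 8.0: the constraint is saturated at the optimum
```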

The figure below illustrates this point, by drawing the planes as circles with their normal vectors.

The solution space, with the three constraints represented as boundary plans. Figure by the author.

In the figure above, A = (3, 0, 0) and the normal vector u = (1, 0, 0) materialize the first constraint W <= 3; B = (0, 4, 0) and v = (0, 1, 0) materialize the second constraint, whereas C = (0, 4, 4), together with the normal vector (0, 1, 1), materializes the third one, L + H <= 8.


Solving the problem

An iterative approach

Let’s first imagine that we are trying to optimize the objective W + H + L without constraint. Even though this objective is linear with respect to W, H and L, we can nonetheless use the steepest descent method.

In the linear case without constraints, using this kind of method is not really relevant, especially since there is no solution: the system is unbounded. However, we will see in the next sections how to integrate constraints into this iterative approach.

This steepest descent method is simple and purely geometric: the idea is to compute the gradient of a function, and move slightly in the direction of the gradient. Here the gradient is constant and is the vector (1, 1, 1). The code below illustrates this method:
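A minimal sketch of such a loop (assuming NumPy, and the names weights, x_0 and delta described just below) could look like this:

```python
import numpy as np

def steepest_descent(weights, x_0, delta=0.01, max_iter=10_000, tol=1e-9):
    """Move the point step by step along the (constant) gradient of the objective."""
    gradient = np.asarray(weights, dtype=float)
    x = np.asarray(x_0, dtype=float)
    for _ in range(max_iter):
        x_new = x + delta * gradient             # small step in the direction of the gradient
        if np.linalg.norm(x_new - x) < tol:      # stop when the point stops moving
            break
        x = x_new
    return x

# Objective W + H + L, i.e. the gradient (1, 1, 1), starting from the origin.
# With no constraints, this loop never converges: it walks up the gradient until max_iter.
print(steepest_descent(weights=[1.0, 1.0, 1.0], x_0=[0.0, 0.0, 0.0]))
```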

In this code, weights contains the gradient: for a linear formula, the gradient is simply the vector of weights of the linear combination. x_0 is the initial value, and delta defines the size of each step.

Note that the stopping criterion is based on convergence. In this case, without constraints, the system will never converge: the objective will increase indefinitely.

Detect unsatisfied constraints

At this step, what we need is to be able to determine whether a point is below a plane. A plane can be defined by a point A and a normal vector n. By convention, we’ll say that a point is above a plane if it’s located in the half-space pointed to by the normal vector. Conversely, a point is considered below a plane if it’s in the other half-space.

The figure below illustrates that with the plane defined by W = 3:

The green plane is defined by point A and the normal vector. Plot by the author.

Given these two pieces of information, it’s possible to know whether a point is below or above a plane with calculations involving the dot product.

Let’s consider the two points Aa and Ab in the figure below:

Ab is below the plane, whereas Aa is above, with respect to the normal vector. Figure by the author.

Remember that when computing the dot product between a vector and a normalized vector, the resulting scalar will be positive if both vectors point in the same direction, whereas it will be negative if they point in opposite directions.

Hence, in order to know whether Ab is below the plane or not, it is only necessary to compute the dot product between the normal vector n and the vector AAb that connects A and Ab.

As can be seen in the figure above, these two vectors point in the opposite direction, hence Ab is below the plane as defined here.

Conversely, Aa is above the plane.

The snippet below shows how to do that in Python using NumPy:
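A possible NumPy sketch of this test (the names is_below, plane_point and normal are illustrative) could be:

```python
import numpy as np

def is_below(point, plane_point, normal):
    """True if `point` lies below the plane defined by `plane_point` and `normal`.

    "Below" means on the side opposite to the normal vector: the dot product
    between the normal and the vector going from `plane_point` to `point`
    is then negative (or zero when the point lies exactly on the plane).
    """
    return np.dot(np.asarray(point) - np.asarray(plane_point), normal) <= 0.0

# The plane W = 3 is defined by A = (3, 0, 0) and n = (1, 0, 0).
A, n = np.array([3.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])
print(is_below([2.0, 1.0, 1.0], A, n))   # True:  W = 2 satisfies W <= 3
print(is_below([5.0, 1.0, 1.0], A, n))   # False: W = 5 violates W <= 3
```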

Enforcing constraints

With this formal definition of a linear constraint as a plane, defined by a point and a normal vector, we have the necessary tool to detect unsatisfied constraints.

What we need now is a way to enforce constraints. Geometrically speaking, this means that we need a way to move the considered point onto the plane, i.e. we need to project the point onto the plane.

Once again, the dot product can help us. As mentioned above, the sign of the dot product between the normal vector n and the vector connecting A and the point of interest P informs us of the position of P relative to the plane.

But more importantly, if the normal vector n has been normalized, this dot product gives us the (signed) distance between the plane and the point P. Knowing that, projecting P onto the plane simply boils down to moving P in the opposite direction of the vector n by the quantity given by the dot product.

A picture is worth a thousand words, so let’s have a look at the figure below:

Projecting a point P on the plane defined by A and n. Figure by the author.

The point P' has been obtained by projecting P onto the normal vector. Getting the projection Proj of P onto the plane is then simply done by translating P by the vector P'A.

Written in Python, this gives:
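A minimal sketch, assuming the normal vector has already been normalized, could be:

```python
import numpy as np

def project(point, plane_point, normal):
    """Project `point` onto the plane defined by `plane_point` and a unit `normal`."""
    point = np.asarray(point, dtype=float)
    # Signed distance between the plane and the point, measured along the normal.
    distance = np.dot(point - np.asarray(plane_point, dtype=float), normal)
    # Move the point back along the normal by that distance.
    return point - distance * np.asarray(normal, dtype=float)

# Project P = (5, 1, 1) onto the plane W = 3 (A = (3, 0, 0), n = (1, 0, 0)).
A, n = np.array([3.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])
P = np.array([5.0, 1.0, 1.0])
Proj = project(P, A, n)                          # -> [3. 1. 1.]
print(np.allclose(project(Proj, A, n), Proj))    # True: projecting twice changes nothing
```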

Please note that this project function really is a projector, mathematically speaking: project(project(P)) is equal to project(P), which is precisely the mathematical definition of a projector.

Putting it all together

We now have three tools at our disposal:

  • An iterative method to converge to an optimum
  • A function to detect whether a constraint is respected
  • A projector to enforce a constraint when necessary.

The algorithm to maximize our objective under constraints is pretty simple:

  1. We use the steepest descent method to move toward the optimum, by following the gradient
  2. We check that all the constraints are respected
  3. If they are not, we enforce them by projection
  4. We iterate until we have converged, i.e. until the point stops moving.

In Python, this gives:
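A minimal, self-contained sketch of such a solver (the Constraint object it relies on is described just below) could look like this:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Constraint:
    vector: np.ndarray   # normal vector of the boundary plane
    point: np.ndarray    # reference point of the constraint
    gap: float           # distance of the boundary plane from `point`, along `vector`

def normalize_constraint(c: Constraint) -> Constraint:
    """Rescale the constraint so that its normal vector is unitary."""
    norm = np.linalg.norm(c.vector)
    return Constraint(c.vector / norm, c.point, c.gap / norm)

def is_satisfied(x, c: Constraint) -> bool:
    # The constraint reads <x - point, vector> <= gap, with a unit normal vector.
    return np.dot(x - c.point, c.vector) <= c.gap

def project(x, c: Constraint):
    # Move x back along the (unit) normal, onto the boundary plane.
    return x - (np.dot(x - c.point, c.vector) - c.gap) * c.vector

def solve(weights, x_0, constraints, delta=0.01, max_iter=100_000, tol=1e-9):
    weights = np.asarray(weights, dtype=float)
    constraints = [normalize_constraint(c) for c in constraints]
    x = np.asarray(x_0, dtype=float)
    for _ in range(max_iter):
        x_new = x + delta * weights              # 1. follow the gradient
        for c in constraints:                    # 2. check every constraint
            if not is_satisfied(x_new, c):
                x_new = project(x_new, c)        # 3. enforce it by projection
        if np.linalg.norm(x_new - x) < tol:      # 4. stop when the point stops moving
            return x_new
        x = x_new
    return x

# The cuboid problem: maximize W + L + H s.t. W <= 3, L <= 4, L + H <= 8.
origin = np.zeros(3)
constraints = [
    Constraint(np.array([1.0, 0.0, 0.0]), origin, 3.0),   # W <= 3
    Constraint(np.array([0.0, 1.0, 0.0]), origin, 4.0),   # L <= 4
    Constraint(np.array([0.0, 1.0, 1.0]), origin, 8.0),   # L + H <= 8
]
x_opt = solve([1.0, 1.0, 1.0], origin, constraints)
print(x_opt, x_opt.sum())    # roughly (3, 4, 4), objective close to 11
```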

To ease the description of problems, we have introduced a Constraint object that defines a constraint with a vector (the normal), a point, and a gap. The gap indicates how far the constraint’s boundary plane sits from that point, measured along the normal vector.

We also introduced a function, normalize_constraint, to ensure that the normal vector is unitary, i.e. that its norm is 1.0.

As you can see, this basic solver converges toward the optimal objective and proposes the expected solution.

Ok, this works in 3D, but what about higher dimensions?

Up to now, we have been using 3D geometry to ease understanding. However, everything we have done above generalizes immediately to any n-dimensional space, with n ≥ 3.

This can be achieved by noting that the constraints become hyperplanes of the n-dimensional space, i.e. (n-1)-dimensional objects.

If, for instance, we work with a hypercube of dimension 4 instead of our 3D cuboid, we get the following code, where only the problem definition has changed:
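Reusing the Constraint class and the solve function from the sketch above, the 4D problem definition could look like this:

```python
# Fourth dimension Z, with the extra constraint Z <= 5
# (reusing Constraint and solve from the previous snippet).
origin = np.zeros(4)
constraints = [
    Constraint(np.array([1.0, 0.0, 0.0, 0.0]), origin, 3.0),   # W <= 3
    Constraint(np.array([0.0, 1.0, 0.0, 0.0]), origin, 4.0),   # L <= 4
    Constraint(np.array([0.0, 1.0, 1.0, 0.0]), origin, 8.0),   # L + H <= 8
    Constraint(np.array([0.0, 0.0, 0.0, 1.0]), origin, 5.0),   # Z <= 5
]
x_opt = solve([1.0, 1.0, 1.0, 1.0], origin, constraints)
print(x_opt, x_opt.sum())    # objective close to 3 + 4 + 4 + 5 = 16
```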

We have added another dimension, Z, and put a simple constraint on it, i.e. Z <= 5. Hence the maximal objective is now 3 + 4 + 4 + 5 = 16, which the solver does find.


Conclusion

In this article, we have presented a geometric, easily understandable method for solving a linear optimization problem under constraints.

If you look at the literature on the subject, you’ll see that there are many other ways to solve this kind of problem. In particular, there is another class of methods, based on the simplex algorithm, that works completely differently.

This kind of method has some advantages, one of them being accuracy: unlike the method presented here, it does not converge toward the solution but finds it exactly.

It’s also a class of methods that is less sensitive to ill-conditioned problems.

However, iterative methods such as the one presented here have the advantage of being more efficient and of allowing control over the trade-off between accuracy and computation time.

