
There are many different ways to solve mathematical optimization problems. You can use greedy algorithms, constraint programming, mixed integer programming, genetic algorithms, local search, and others. Depending on the size and type of the problem, and the solution quality desired, one technique may work better than another.
This post gives an overview of different heuristics for solving discrete optimization problems. First, I explain the three components you need to describe an optimization problem mathematically. Then I will give an explanation of some common and well-performing search heuristics.
Optimization Problems
Here is a short refresher on Mathematical Optimization. In order to define an optimization problem mathematically, you need the following three components: decision variables, constraints and an objective.
Let’s take a look at an easy example. You run a small package delivery office, and you earn a different amount of money for every package you deliver. The delivery van has a limited amount of space, and you want to deliver the highest total value possible per delivery round. Which packages should you deliver?

Decision variables
The decision variables are the values you are free to choose; the goal is to find the best values for them. What counts as best depends on the objective and the constraints. In the post delivery example, every package has a binary decision variable: it is 0 if the package will not be delivered, and 1 if it will be.
Constraints
Constraints describe boundaries or things that are not allowed; by setting them correctly you make sure that the solution you find is one you can actually use in real life. A constraint in the post delivery example: you can’t deliver all packages, because the delivery van has limited space. If the maximum space of the van is equal to 600, you should add a constraint that makes sure the selected packages will not exceed this limit.
Objective
The objective is the goal of the optimization problem: the quantity you want to maximize or minimize. The goal of the delivery office is to select the most valuable packages for delivery, so the objective is to maximize the total value of the selected packages.
Here’s the complete description of the example:
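As a sketch in symbols (using $v_i$ for the value of package $i$, $s_i$ for its size, $C$ for the van's capacity, and $x_i$ for the binary decision variable; these symbol names are just shorthand for this post):

$$
\begin{aligned}
\text{maximize}\quad & \sum_i v_i \, x_i \\
\text{subject to}\quad & \sum_i s_i \, x_i \le C \\
& x_i \in \{0, 1\} \quad \text{for every package } i
\end{aligned}
$$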

If a problem is well defined (i.e. a feasible solution exists), there is always at least one optimal solution. It can be hard to find one of these optimal solutions, especially when the problem is large and complex. Not all techniques discussed in this post are guaranteed to find the optimal solution. But, applied correctly to large problems, they can be faster than constraint programming or mixed integer programming techniques.
Optimization Techniques
There are different heuristics you can use to solve optimization problems. Here, I will explain some of them.
I assume you are already familiar with brute force, which is trying all the possible solutions and keeping track of the best one. Another technique you may know is dynamic programming, where the problem is broken down into smaller subproblems. If you aren’t familiar with dynamic programming and brute force, this post explains them. Brute force and dynamic programming are perfectly fine to use when your problem is small. When the problem size increases, they take way too long and become inefficient. Brute force and dynamic programming aren’t heuristics, because they don’t reduce the search space. You can, however, combine brute force with local search or genetic algorithms by systematically testing all the possible solutions (brute force) within a smaller subset of the search space.
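As a reference point, brute force for the package example fits in a few lines of Python. The package values and sizes below are made up, and with n packages there are 2^n combinations to try, which is why this approach stops scaling quickly:

```python
import itertools

# Brute force: try every possible selection of packages and keep the best one.
values = [60, 45, 30, 20, 50, 10, 35]   # made-up package values
sizes  = [10, 30, 25,  5, 40, 10, 20]   # made-up package sizes
CAPACITY = 60                            # example capacity of the van

best_value, best_selection = 0, None
for selection in itertools.product([False, True], repeat=len(values)):
    total_size  = sum(s for s, chosen in zip(sizes, selection) if chosen)
    total_value = sum(v for v, chosen in zip(values, selection) if chosen)
    if total_size <= CAPACITY and total_value > best_value:
        best_value, best_selection = total_value, selection

print(best_value, best_selection)
```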
Greedy Algorithms
To solve the delivery office problem, an easy way to start is with a greedy algorithm. It gives a baseline and offers solutions really fast. The idea behind greedy is that you keep choosing packages until the delivery van is full. You don’t pick packages arbitrarily; you can, for example, repeatedly take the most valuable package that still fits. Let’s say the maximum capacity of the van is 60; here are the packages we would select:

There are other ways to decide on the next package. By dividing each package’s value by its size, you get a value per size unit, which you could describe as the value density. By selecting the packages with the highest value density first, it’s possible to come up with an even better solution.
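A minimal sketch of the value-density greedy in Python, using made-up package data and a capacity of 60 as in the example:

```python
# Greedy by value density: a minimal sketch with made-up package data.
packages = [
    {"name": "A", "value": 60, "size": 10},
    {"name": "B", "value": 45, "size": 30},
    {"name": "C", "value": 30, "size": 25},
    {"name": "D", "value": 20, "size": 5},
]
CAPACITY = 60  # maximum space in the van (example value)

# Sort packages by value per size unit, highest density first.
by_density = sorted(packages, key=lambda p: p["value"] / p["size"], reverse=True)

selected, used = [], 0
for p in by_density:
    if used + p["size"] <= CAPACITY:   # only take a package if it still fits
        selected.append(p["name"])
        used += p["size"]

print(selected, used)
```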

An advantage of greedy is that it’s fast. But for more complex problems, in most cases the solution will be far from optimal.
Local Search
The next technique is an interesting one. Local search is pretty intuitive. It works as follows: you start with a solution, and you improve it by making local moves, small changes to the current solution that improve the objective. You continue applying local moves until there are no more moves available that improve the objective.
Let’s look at the delivery example again. We can start with a solution:

A local move could be to swap a selected package with a package that is not selected. We keep an eye on the capacity constraint and we try to keep that satisfied for every local move. An example of a move could be:

With this move, the new objective value is 115. We remove the package with the lower value from the selection and add the package with the higher value, while still having a feasible solution.
The set of all solutions you can reach by applying one local move is called the neighborhood of the current solution.
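Here is a rough sketch of swap-based local search in Python, again with made-up package data; it accepts any improving move and stops when the whole neighborhood has been tried without improvement:

```python
# Local search with swap moves: a minimal sketch on hypothetical package data.
values = [60, 45, 30, 20, 50, 10, 35]   # made-up package values
sizes  = [10, 30, 25,  5, 40, 10, 20]   # made-up package sizes
CAPACITY = 60

def objective(selection):
    return sum(v for v, chosen in zip(values, selection) if chosen)

def feasible(selection):
    return sum(s for s, chosen in zip(sizes, selection) if chosen) <= CAPACITY

# Start from any feasible solution; here nothing is selected yet.
current = [False] * len(values)

improved = True
while improved:
    improved = False
    # The neighborhood: every solution reachable by flipping one package,
    # or by swapping one package in and another out.
    for i in range(len(values)):
        for j in range(len(values)):
            neighbor = current[:]
            neighbor[i] = not neighbor[i]
            if j != i:
                neighbor[j] = not neighbor[j]
            if feasible(neighbor) and objective(neighbor) > objective(current):
                current = neighbor       # accept the improving move
                improved = True

print(objective(current), current)
```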
You can’t exceed the capacity limit of the van. So in this case, if a package is so large that no single swap can make room for it, we will never select this package, even if its value is high!

That’s a downside of local search. You can get stuck at local optima:

There are ways to overcome this problem. You can choose to swap multiple packages at once and count that as one move. By doing this, your neighborhood grows and you can reach more solutions.
You can also decide to start with multiple solutions instead of one, and repeat the swapping procedure for every solution; this is called iterated local search.
Another approach is to choose actions that make the objective worse with a certain probability:
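One standard choice (the Metropolis rule that simulated annealing uses) is to accept a move that makes the objective worse by an amount $\Delta$ with probability

$$P(\text{accept}) = e^{-\Delta / T},$$

where $T$ is a temperature parameter.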

If the temperature parameter is large, there is a high chance of accepting a degrading move. If the temperature is small, this chance is low. Simulated annealing uses this probability: it starts with a high temperature and decreases it progressively. This means that in the beginning you are performing a random walk across solutions; as the temperature decreases, the search becomes more and more local. Simulated annealing has shown excellent performance on hard benchmarks.
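A minimal simulated annealing sketch for the package example (made-up data, a simple geometric cooling schedule, and a random single-package flip as the move):

```python
import math
import random

# Simulated annealing: a minimal sketch on hypothetical package data.
values = [60, 45, 30, 20, 50, 10, 35]
sizes  = [10, 30, 25,  5, 40, 10, 20]
CAPACITY = 60

def objective(sel):
    return sum(v for v, chosen in zip(values, sel) if chosen)

def feasible(sel):
    return sum(s for s, chosen in zip(sizes, sel) if chosen) <= CAPACITY

current = [False] * len(values)          # start from an empty (feasible) solution
best = current[:]
temperature = 100.0

while temperature > 0.01:
    # Random move: flip one package in or out of the selection.
    neighbor = current[:]
    i = random.randrange(len(values))
    neighbor[i] = not neighbor[i]
    if feasible(neighbor):
        delta = objective(current) - objective(neighbor)   # > 0 means the move is worse
        if delta <= 0 or random.random() < math.exp(-delta / temperature):
            current = neighbor
            if objective(current) > objective(best):
                best = current[:]
    temperature *= 0.995                  # cool down progressively
print(objective(best), best)
```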
Tabu search is the final technique I want to mention for escaping local optima. The idea of tabu search is to keep track of solutions you already visited; you are not allowed to go back to them. Keeping all previous solutions in memory can be costly, so you can instead store transitions (moves) or keep the tabu list at a fixed size.
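A rough sketch of the fixed-size tabu list idea, storing the indices of recently flipped packages instead of whole solutions (made-up data again):

```python
from collections import deque

# Tabu search: a rough sketch with a fixed-size tabu list of recent moves.
values = [60, 45, 30, 20, 50, 10, 35]
sizes  = [10, 30, 25,  5, 40, 10, 20]
CAPACITY = 60

def objective(sel):
    return sum(v for v, chosen in zip(values, sel) if chosen)

def feasible(sel):
    return sum(s for s, chosen in zip(sizes, sel) if chosen) <= CAPACITY

current = [False] * len(values)
best = current[:]
tabu = deque(maxlen=3)          # remember the last 3 flipped packages

for _ in range(200):
    # Pick the best non-tabu flip, even if it makes the objective worse.
    candidates = []
    for i in range(len(values)):
        if i in tabu:
            continue
        neighbor = current[:]
        neighbor[i] = not neighbor[i]
        if feasible(neighbor):
            candidates.append((objective(neighbor), i, neighbor))
    if not candidates:
        break
    _, move, current = max(candidates)
    tabu.append(move)           # flipping this package again is now tabu
    if objective(current) > objective(best):
        best = current[:]

print(objective(best), best)
```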
It’s possible to combine techniques like iterated local search, simulated annealing and tabu search.
Genetic Algorithms
You can also decide to use genetic algorithms. The key idea of genetic algorithms is to mimic the process of natural selection. A solution to the problem is called an individual. First, you generate the initial population, which consists of individuals. Then, you calculate the fitness of every individual; the fitness function plays the role of the objective. The fittest individuals (those with the best objective values) are selected for reproduction in order to produce offspring.
For our delivery example, every package is a gene which can take the value of 0 or 1. Our initial population with four individuals could look like this:

Now we select the fittest individuals (highest objective values) to produce offspring. In our case, individuals 2 and 4 have the highest fitness values. There are different ways to produce offspring; a common one is to choose a crossover point at random and exchange genes up to this crossover point.

The next step is mutation. We flip some of the genes with a low random probability; in our case, let’s take 0.14 (1 divided by 7, the number of packages). This is an important step because we want to maintain diversity and prevent premature convergence. We are going to mutate offspring 1:

Now, we compute the fitness of the new individuals. The population has a fixed size, so the individuals with the lowest fitness values die.

Here is the complete algorithm:
- Create the initial population.
- Repeat until convergence:
  - Select the fittest individuals.
  - Choose a crossover point to create offspring.
  - Mutate genes.
  - Compute fitness; some of the individuals die.
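Putting these steps together, a bare-bones genetic algorithm for the package example could look like this; the data is made up, crossover is one-point, and the mutation probability is 1 divided by the number of genes:

```python
import random

# Bare-bones genetic algorithm: a sketch on hypothetical package data.
values = [60, 45, 30, 20, 50, 10, 35]
sizes  = [10, 30, 25,  5, 40, 10, 20]
CAPACITY = 60
POP_SIZE = 4
GENERATIONS = 50

def fitness(ind):
    # Infeasible individuals (too big for the van) get fitness 0.
    if sum(s for s, gene in zip(sizes, ind) if gene) > CAPACITY:
        return 0
    return sum(v for v, gene in zip(values, ind) if gene)

def random_individual():
    return [random.random() < 0.5 for _ in values]

def crossover(a, b):
    point = random.randrange(1, len(a))           # one-point crossover
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(ind):
    p = 1 / len(ind)                               # low mutation probability
    return [not gene if random.random() < p else gene for gene in ind]

population = [random_individual() for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    # Selection: the two fittest individuals become parents.
    parents = sorted(population, key=fitness, reverse=True)[:2]
    child1, child2 = crossover(parents[0], parents[1])
    offspring = [mutate(child1), mutate(child2)]
    # The weakest individuals die so the population keeps its fixed size.
    population = sorted(population + offspring, key=fitness, reverse=True)[:POP_SIZE]

best = max(population, key=fitness)
print(fitness(best), best)
```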
With genetic algorithms, it’s also possible to get stuck at local optima. There are multiple ways to overcome this. You can create subsets of the initial population and randomize during the selection phase; this prevents using the same population again and again during selection. Another way to avoid local optima is to give an extra reward to individuals that survive longer and/or to individuals that are more unique than others, because they can help in finding a generalized solution.
Hybrid Techniques
The final way discussed here for quickly finding high-quality solutions is to combine different techniques.
An example is large neighborhood search, where local search is combined with constraint programming (CP) or mixed integer programming (MIP). Downsides of CP and MIP are that they can have difficulties with large problems and need a lot of time to get to the optimal solution. By combining local search with CP or MIP, you have the best of both worlds. You can solve small subproblems with CP or MIP, and select a new subproblem with local search.
The steps for large neighborhood search are:
- Start with a feasible solution to the problem. You can find a solution using any technique you like.
- Repeat until a criterion is met:
  - Select a neighborhood (part of the problem).
  - Optimize this subproblem (with CP or MIP).
During solving you keep track of the best solution found. The neighborhood can be defined by, for example, fixing a subset of the variables and re-optimizing the rest.
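A rough sketch of the large neighborhood search loop; to keep the example self-contained, the small subproblem is solved by brute force over the freed packages instead of by a real CP or MIP solver:

```python
import itertools
import random

# Large neighborhood search: a rough sketch on hypothetical package data.
# The subproblem is solved by brute force here; in practice you would hand
# it to a CP or MIP solver.
values = [60, 45, 30, 20, 50, 10, 35]
sizes  = [10, 30, 25,  5, 40, 10, 20]
CAPACITY = 60

def objective(sel):
    return sum(v for v, chosen in zip(values, sel) if chosen)

def size_of(sel):
    return sum(s for s, chosen in zip(sizes, sel) if chosen)

# Start with any feasible solution (here: nothing selected).
current = [False] * len(values)

for _ in range(100):
    # Select a neighborhood: free 3 random packages, fix all the others.
    free = random.sample(range(len(values)), 3)
    best_sub = current
    # Optimize the subproblem: try every combination of the freed packages.
    for combo in itertools.product([False, True], repeat=len(free)):
        candidate = current[:]
        for idx, choice in zip(free, combo):
            candidate[idx] = choice
        if size_of(candidate) <= CAPACITY and objective(candidate) > objective(best_sub):
            best_sub = candidate
    current = best_sub

print(objective(current), current)
```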
Another example of a hybrid approach is the memetic algorithm. Memetic algorithms combine genetic algorithms and local search.

Conclusion
In this post you’ve seen different heuristics for solving mathematical optimization problems. Hopefully you can now find solutions to optimization problems quickly with local search, genetic algorithms or a hybrid approach! There are other interesting and well-performing search heuristics, like particle swarm optimization, ant colony optimization and stochastic tunneling.