Evolutionary Computation Course
Hello and welcome back to this full course on Evolutionary Computation! In this post we will start Unit 5 of the course, Evolutionary Programming. In the previous post we started and finished Unit 4) Genetic Programming by explaining the main differences between standard genetic algorithms and genetic programming, in addition to evolving the underlying equation for a Time Series Analysis problem. You can check it out here:
In this post we will cover all the major differences between Evolutionary Programming and standard genetic algorithms, namely the mutation operator and the selection operators for survival. If this is your first time seeing this course, please check out these two previous posts that introduce Evolutionary Computation so that you can understand the concepts presented here:
Table of Contents
- Differences between Evolutionary Programming and standard Genetic Algorithms
- Mutation Operators
- Distributions for Mutation
- Selection Operators
- Self-Adaptation for Updating Strategy Parameters
- Example Implementations
- Conclusion
Differences between Evolutionary Programming and standard Genetic Algorithms
Despite the similarity of the names, Evolutionary Programming and Genetic Programming are nothing alike. During the early years of evolutionary computation, many different names and methodologies for describing evolutionary algorithms were floating around in the literature: genetic algorithms, evolutionary programming, evolution strategies, and differential evolution. Each of these was in essence an evolutionary algorithm, but each differed from the others in certain characteristics, so authors distinguished between them by choosing different names.
Evolutionary programming was actually one of the first evolutionary algorithms ever introduced. It originated from early AI research, where the aim was to model intelligent behavior. The emphasis in this paradigm is on behavioral evolution rather than on the genotype. Because the genotype is not evolved, no crossover is used between parents, only mutation. Offspring are created solely by mutating the parents, and the amount of mutation applied is what constitutes the behavior.
In addition, the raw fitness value is not used directly; a relative fitness value is used instead, measuring how much better individuals perform relative to those around them. For example, consider a tournament of five randomly selected individuals along with their raw and relative fitness values (assuming maximization). Relative fitness is calculated as the number of opponents in the tournament whose raw fitness the current individual beats, as in the worked example below:

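Since relative fitness is just a win count, here is a minimal sketch of the calculation in Python; the five raw fitness values are hypothetical numbers chosen for illustration:

```python
import numpy as np

# Hypothetical raw fitness values for a tournament of five randomly
# selected individuals (assuming maximization).
raw_fitness = np.array([4.2, 7.9, 1.3, 7.9, 5.6])

# Relative fitness: for each individual, count how many opponents in
# the tournament it strictly beats on raw fitness.
relative_fitness = np.array([
    np.sum(f > np.delete(raw_fitness, i)) for i, f in enumerate(raw_fitness)
])

print(relative_fitness)  # [1 3 0 3 2]
```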
The basic algorithm overview of Evolutionary Programming is given in the diagram below. We first initialize our population, then enter a loop where we mutate each individual to create the offspring, pool the offspring and parents together, and use tournament-style competition to calculate the relative fitness. After this process we use some type of selection procedure to choose who survives:

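As a complement to the diagram, here is a minimal runnable sketch of that loop, assuming a toy maximization objective and a static Gaussian mutation (self-adaptive strategy parameters come later in the post):

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(x):
    # Toy objective chosen for illustration: maximize -sum(x^2), optimum at 0.
    return -np.sum(x ** 2)

# Initialize a population of 10 individuals with 2 variables each.
pop = rng.uniform(-5, 5, size=(10, 2))

for generation in range(50):
    # Mutation only, no crossover: each parent produces one offspring.
    offspring = pop + rng.normal(0, 0.5, size=pop.shape)

    # Pool the parents and offspring together.
    pool = np.vstack([pop, offspring])
    raw = np.array([fitness(x) for x in pool])

    # Relative fitness: wins against 5 randomly chosen opponents each.
    wins = np.array([
        np.sum(raw[i] > raw[rng.choice(len(pool), size=5, replace=False)])
        for i in range(len(pool))
    ])

    # Survival: keep the half of the pool with the most wins.
    pop = pool[np.argsort(wins)[-len(pop):]]

print("best individual:", pop[np.argmax([fitness(x) for x in pop])])
```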
Mutation Operators
As stated before, evolutionary programming focuses on evolving the behavior of individuals, not the genotype, so mutation is the only reproduction operator used, with no crossover. Because of this, the mutation operator needs to introduce new genetic material efficiently enough to compensate for the lack of genetic sharing between individuals that crossover would normally provide. There are three main types of mutation operators:
- Non-Adaptive (Static)
- Dynamic
- Self-Adaptive
Up to this point we’ve covered static and dynamic mutation operators. Static simply means that the domain of our mutation stays constant; an example would be always mutating within some fixed bound such as [-1, 1]. Dynamic refers to slowly decreasing this maximum mutation bound with each generation in order to encourage exploration in the early generations and exploitation in the later ones. Note that we implemented dynamic mutation through our logistic decay algorithm back in Unit 3) Genetic Algorithms Part 1. Self-adaptive mutation, on the other hand, refers to operators where the mutation magnitude is neither static nor dynamic but can increase, decrease, or stay the same at any given moment, depending on the relative fitness feedback from the environment. We will cover self-adaptive strategies later in the post. Below is the general equation for mutating an offspring:
$$x_i'(t) = x_i(t) + \Delta x_i(t)$$
As we can see above, the offspring is created by adding some value, Δx, to the parent, x. The value of this Δx is calculated by the following equation:
$$\Delta x_i(t) = \nu \, \mathrm{H}(\sigma_i(t))$$
As we can see above, Δx is calculated by taking a step size, ν, in the direction of a randomly generated value drawn from the probability distribution H (capital eta) with strategy parameters σ. Don’t get overwhelmed by the notation; we’ve done this before. In Unit 3) Genetic Algorithms, we performed mutation in two different ways, by adding a small randomly generated value from either a Gaussian or a uniform distribution. For Gaussian mutation, the probability distribution would be the standard Gaussian, the strategy parameter would simply be the standard deviation, and the step size would be the mean. For uniform mutation, the probability distribution would be the uniform, the strategy parameter would be the bounds of the distribution, and the step size would be one.
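To tie the notation back to code, here is a small sketch of both mutations with numpy; the σ of 0.1 and the [-1, 1] bounds are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
parent = np.array([2.0, -1.5, 0.3])

# Gaussian mutation: strategy parameter sigma = 0.1, step size (mean) = 0.
gaussian_child = parent + rng.normal(loc=0.0, scale=0.1, size=parent.shape)

# Uniform mutation: strategy parameter is the bound [-1, 1], step size = 1.
uniform_child = parent + rng.uniform(-1.0, 1.0, size=parent.shape)
```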
Instead of using the same strategy parameters for the entire generation, we can give each individual its own set of strategy parameters so that it can characterize its own mutation range. Because each individual might need to be mutated by different amounts, we can extend the chromosome to include not only the individual’s variable values but also its own strategy parameters. For example, poor individuals could carry higher standard deviations for mutation than those with better fitness. This can be seen below:

In evolutionary programming, the ‘behavioral’ mechanism being evolved is this set of strategy parameters; that is, in addition to the variables of the individual being evolved, its strategy parameters are evolved as well, as in the sketch below.
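Here is a minimal sketch of such a chromosome in Python; the field names are my own, and the initial σ of 1.0 is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(2)
n_vars = 3

# The chromosome carries both the variable values and the individual's
# own strategy parameters (one sigma per variable); both parts evolve.
chromosome = {
    "x": rng.uniform(-5, 5, n_vars),   # decision variables
    "sigma": np.full(n_vars, 1.0),     # per-variable mutation widths
}

# Mutation uses the individual's own sigmas, not generation-wide values.
child_x = chromosome["x"] + rng.normal(0.0, chromosome["sigma"])
```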
Distributions for Mutation
There are many distributions used for mutation in evolutionary programming; the most common are the following:
- Uniform
- Gaussian
- Cauchy
- Levy
- Exponential
- Chaos
- Combined Distribution

Because we introduced a few distributions that you might not have heard of, let’s go over the basics of each one. The Cauchy distribution is like the standard Gaussian except that its tails are much wider; an example is shown in the top-left graph above, where 95% of the density of the distribution is plotted. The Levy distribution, which looks very similar to the Chi-Square distribution, models non-negative values and has a large hump. The double exponential distribution is similar to the Cauchy and the Gaussian in its bell-curved shape, except that it has a sharp peak at the mean and longer tails. There are many different types of chaos distributions, but the one above uses the sine function. In EP, the most common combined distribution is the Gaussian paired with the Cauchy.
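Here is a sketch of how samples from each could be generated with numpy alone; the inverse-square Levy trick and the sine-based chaotic map are standard constructions, but other choices exist:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

cauchy = rng.standard_cauchy(n)                    # heavy tails, like a very wide Gaussian
levy = 1.0 / rng.normal(0.0, 1.0, n) ** 2          # Levy(0, 1) as the inverse square of a Gaussian
laplace = rng.laplace(loc=0.0, scale=1.0, size=n)  # double exponential: sharp peak, long tails

# One simple sine-based chaotic map, shifted to be centered at zero.
z, chaos = 0.7, np.empty(n)
for i in range(n):
    z = np.sin(np.pi * z)
    chaos[i] = z - 0.5

# A common combined distribution: mix Gaussian and Cauchy samples 50/50.
combined = np.where(rng.random(n) < 0.5, rng.normal(0.0, 1.0, n), cauchy)
```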

Selection Operators
As stated before, all parents are mutated by some amount dependent upon their own unique strategy parameters to create offspring. These offspring are pooled together with their parents, and some type of selection process is used to choose who survives. Relative fitness is used over raw fitness: some number n_c of randomly selected competitors are pooled together, and their relative fitness values are computed using one of the equations below (assuming maximization):
$$u_i = \sum_{k=1}^{n_c} s_{ik}, \qquad s_{ik} = \begin{cases} 1 & \text{if } f(x_i) > f(x_k) \\ 0 & \text{otherwise} \end{cases}$$

$$s_{ik} = \begin{cases} 1 & \text{if } \dfrac{f(x_i)}{f(x_i) + f(x_k)} > U(0, 1) \\ 0 & \text{otherwise} \end{cases}$$
The first equation we’ve already performed: it is simply the number of times an individual has a larger raw fitness score (assuming maximization) than the rest of the tournament. The second equation is a different way of calculating relative fitness: it forms a proportion between the two fitness values and counts a win whenever that proportion is larger than a uniform random number between 0 and 1, as in the sketch below.
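Here is a minimal sketch of the proportion-based rule; note that it implicitly assumes positive fitness values so that the proportion lands in [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(4)

def wins_probabilistic(f_i, opponent_fitnesses):
    # A win is scored against opponent j whenever the proportion
    # f_i / (f_i + f_j) exceeds a uniform random draw from [0, 1).
    # Assumes all fitness values are positive.
    return sum(f_i / (f_i + f_j) > rng.random() for f_j in opponent_fitnesses)

print(wins_probabilistic(7.9, [4.2, 1.3, 7.9, 5.6]))
```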
After the relative fitness values are calculated, we use some standard selection procedure to choose who survives from among the pooled offspring and parents. This can be done by tournament selection where ties are resolved by Boltzmann selection, by proportional selection, or by elitism (where elitism here simply refers to choosing the best half rather than some top percentage):
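Here are sketches of two of these survival schemes, operating on the pooled parents and offspring and their win counts; the +1 smoothing in the proportional version is my own choice so that zero-win individuals keep a small chance:

```python
import numpy as np

rng = np.random.default_rng(5)

def survive_elitism(pool, wins, n_survivors):
    # Keep the best half of the pooled parents + offspring,
    # ranked by relative fitness (win counts).
    return pool[np.argsort(wins)[-n_survivors:]]

def survive_proportional(pool, wins, n_survivors):
    # Sample survivors with probability proportional to win counts.
    p = (wins + 1) / np.sum(wins + 1)
    idx = rng.choice(len(pool), size=n_survivors, replace=False, p=p)
    return pool[idx]
```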
Self-Adaptation for Updating Strategy Parameters
Now we are going to talk about three very common self-adaptation operators for strategy parameters: additive, multiplicative, and lognormal. In the additive case (shown below), the strategy parameter is updated by adding the square root of some function multiplied by a random normal value with mean 0 and standard deviation 1. This function returns the strategy parameter itself if it is greater than zero and returns some lower bound γ if the strategy parameter is negative:
$$\sigma_i'(t) = \sigma_i(t) + \sqrt{\phi(\sigma_i(t))}\; N(0, 1), \qquad \phi(\sigma) = \begin{cases} \sigma & \text{if } \sigma > 0 \\ \gamma & \text{otherwise} \end{cases}$$
For the multiplicative method (shown below), the starting strategy parameter is multiplied by the value on the right-hand side, where γ1, γ2, and γ3 are control parameters chosen by the user and n_t denotes the maximum number of generations allowed:

Lastly, the most common method is the lognormal operator. In this scenario, the offspring are created as follows:
$$x_i'(t) = x_i(t) + \sigma_i(t)\, N_i(0, 1)$$
where H (capital eta), our probability distribution, is the Gaussian centered at zero with a standard deviation of 1, and σ acts as the step size. With each iteration, we update our strategy parameter σ through the following equation:
$$\sigma_j'(t) = \sigma_j(t)\, \exp\!\left(\tau' N(0, 1) + \tau N_j(0, 1)\right), \qquad \tau = \frac{1}{\sqrt{2\sqrt{n}}}, \quad \tau' = \frac{1}{\sqrt{2n}}$$
In practice, it has been noticed that the lognormal operator tends to converge too quickly as the parameter values become small, so it is usually paired with a minimum value: if σ dips below this minimum, it is reset to the minimum. A sketch of this update is given below.
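Here is a minimal sketch of the lognormal update, assuming the commonly used learning-rate settings based on the problem dimension n; the floor of 1e-3 is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(6)

def lognormal_update(sigma, sigma_min=1e-3):
    # sigma'_j = sigma_j * exp(tau' * N(0,1) + tau * N_j(0,1)),
    # with one global draw shared across all j and one draw per j.
    n = len(sigma)
    tau = 1.0 / np.sqrt(2.0 * np.sqrt(n))
    tau_prime = 1.0 / np.sqrt(2.0 * n)
    new_sigma = sigma * np.exp(tau_prime * rng.normal() + tau * rng.normal(size=n))
    # Guard against premature convergence: clamp at a minimum value.
    return np.maximum(new_sigma, sigma_min)
```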
Example Implementations
Here we will cover three common example implementations of EP algorithms:
- Classical Evolutionary Programming
- Fast Evolutionary Programming
- Exponential Evolutionary Programming
First, the classical evolutionary programming (CEP) algorithm uses Gaussian mutation for mutating the genome, along with lognormal adaptation for updating the strategy parameters and elitism for survival selection.
Second, the fast evolutionary programming (FEP) algorithm uses the Cauchy distribution with a scale parameter of 1 for mutating the genome, along with lognormal adaptation for updating the strategy parameters and elitism for survival selection.
Lastly, the exponential evolutionary programming (EEP) algorithm uses the double exponential distribution as the mutation operator, where the strategy parameter update and the selection process can vary. The sketch below shows how the three variants differ.
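Since the three variants differ mainly in the sampling distribution, here is a sketch of a single mutation function covering all three; the lognormal σ update and the survival selection from the earlier sections would wrap around it:

```python
import numpy as np

rng = np.random.default_rng(7)

def mutate(x, sigma, variant="classical"):
    # Offspring = parent + sigma-scaled random draw; only the
    # distribution of the draw changes between the three variants.
    if variant == "classical":        # CEP: Gaussian
        noise = rng.normal(0.0, 1.0, size=x.shape)
    elif variant == "fast":           # FEP: Cauchy with scale 1
        noise = rng.standard_cauchy(size=x.shape)
    elif variant == "exponential":    # EEP: double exponential (Laplace)
        noise = rng.laplace(0.0, 1.0, size=x.shape)
    else:
        raise ValueError(f"unknown variant: {variant}")
    return x + sigma * noise
```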
Conclusion
In conclusion, evolutionary programming differs from standard genetic algorithms in that the focus is on the behavior of individuals, so crossover is dropped in favor of mutation alone. Behavior is modeled by the strategy parameters of the mutation distributions. In addition, instead of using Boltzmann selection to decide whether the children replace the parents, or having the children always replace the parents, the children and parents are pooled together and compete for survival based upon their relative fitness within the entire generation, calculated as the number of opponents they beat in a randomly selected tournament. Relative fitness reveals how well individuals perform relative to the current generation; once it has been calculated, one can use standard selection methods for survival.

In terms of application, Evolutionary Programming is most commonly used in constrained environments such as scheduling and routing, power systems, and systems design. Stay tuned for the next post, where we will wrap up Unit 5 by tackling a real constrained non-linear programming problem known as the Pressure Vessel Design Problem, a basic diagram of which is shown above.
Unit 5 Application) Optimizing Constrained Non-Linear Pressure Vessel Design Problem