Exploring new technology in genetic engineering
Humans have turned to nature to find optimized systems that solve complex problems. In the case of genetic algorithms, we simulate evolution in a program to optimize the weights of a neural network.
With the recent rise of genetic engineering, I believe that genetic algorithms can be improved with some of these technologies. In this article, I will focus on the implementation of a gene drive.
What is a gene drive?

Gene drives allow a gene to be passed down to organisms of future generations at a much higher rate. Look at the diagram above: the left side of the image represents the normal transmission of a gene from a parent to the next generation. Since half of the offspring's genetic information comes from the father and half from the mother, there is a 50/50 chance that the offspring will inherit the gene.
However, in the case of an insect whose genetic information has been altered by a gene drive, all of the offspring of that insect will carry the gene, allowing the change to spread through every generation to come.
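To see why this matters numerically, here is a toy Monte Carlo sketch of my own (not part of the original article): a normally inherited gene reaches roughly half of the offspring, while a gene-drive gene reaches nearly all of them. The 99% conversion rate is an illustrative assumption, not a measured figure.
import random

def simulate(offspring, carrier_has_gene_drive):
    # Each offspring inherits the gene with 50% probability normally,
    # but almost always when the carrier parent has a gene drive.
    inherit_chance = 0.99 if carrier_has_gene_drive else 0.5
    return sum(random.random() < inherit_chance for _ in range(offspring)) / offspring

print('Normal inheritance:    ', simulate(100000, False))  # roughly 0.5
print('Gene drive inheritance:', simulate(100000, True))   # roughly 0.99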
How does a gene drive work?
In normal CRISPR-Cas9 genetic engineering, the CRISPR protein is programmed to target a certain gene, which is then replaced with a new sequence.
For a gene drive, the process is different. The CRISPR machinery inserts itself into the genetic sequence, along with the information on which gene to target. When this part of the sequence is activated, the CRISPR protein searches for that gene and replaces every "wrong" copy (any gene that is not the one inserted by the gene drive) with the "correct" one (the gene that was inserted by the gene drive), making transmission of that gene far more likely.
Now that you have a basic understanding of gene drives, let’s start by creating a simple genetic algorithm, capable of training neural networks.
The Genetic Algorithm:
A genetic algorithm, as I mentioned earlier, simulates evolution. A population of agents (in this case, neural networks) with certain genetic information (in this case, the weights of the neural network) is generated. A fitness function then evaluates each agent (in this case, by computing the loss). The top 20% of the population, those with the lowest loss, pair up as random parents and create two offspring, each of which contains genetic information from both parents. The cycle repeats until an agent with a satisfactory loss value has been spawned.
Let’s try to create this in Python:
Step 1| Prerequisites:
import random
import numpy as np
from IPython.display import clear_output

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
The prerequisites are the basic dependencies that the program needs to function. random is used for the random generation of agents, numpy for the initialization and manipulation of weight matrices, and IPython's display for clearing clutter from the screen.
For the sake of simplicity, the only activation function I will use for this project is the sigmoid function.
Step 2| The Agent Blueprint:
class genetic_algorithm:
    def execute(pop_size, generations, threshold, X, y, network):
        class Agent:
            def __init__(self, network):
                class neural_network:
                    def __init__(self, network):
                        self.weights = []
                        self.activations = []
                        for layer in network:
                            if layer[0] != None:
                                input_size = layer[0]
                            else:
                                # None means: reuse the previous layer's output size
                                input_size = network[network.index(layer)-1][1]
                            output_size = layer[1]
                            activation = layer[2]
                            self.weights.append(np.random.randn(input_size, output_size))
                            self.activations.append(activation)
                    def propagate(self, data):
                        input_data = data
                        for i in range(len(self.weights)):
                            z = np.dot(input_data, self.weights[i])
                            a = self.activations[i](z)
                            input_data = a
                        yhat = a
                        return yhat
                self.neural_network = neural_network(network)
                self.fitness = 0
                self.gene_drive = []
            def __str__(self):
                return 'Loss: ' + str(self.fitness[0])
This is the start of the program, with the creation of the genetic algorithm class and the execute function.
The agent is a blueprint that contains the instructions for network propagation. Within the agent's __init__, a neural network class is initialized, and its weights are randomly generated based on the given layer structure.
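To make the layer specification concrete, here is a small stand-alone check of the weight shapes that the blueprint produces for the network specification used at the end of this article (the same logic as the loop above, reproduced outside the class for illustration):
network = [[3, 20, sigmoid], [None, 1, sigmoid]]
shapes = []
for layer in network:
    # None means "reuse the previous layer's output size as this layer's input size"
    input_size = layer[0] if layer[0] is not None else network[network.index(layer) - 1][1]
    shapes.append((input_size, layer[1]))
print(shapes)  # [(3, 20), (20, 1)]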
Step 3| Create Population:
        def generate_agents(population, network):
            return [Agent(network) for _ in range(population)]
This function, given the population size and network structure as parameters, generates a population of agents, with neural networks of randomly generated weights.
Step 4| Calculate Fitness:
        def fitness(agents, X, y):
            for agent in agents:
                yhat = agent.neural_network.propagate(X)
                cost = (yhat - y)**2
                agent.fitness = sum(cost)
            return agents
This is the basic fitness function for this genetic algorithm. It simply sums the squared errors between the network's predictions and the targets to obtain the loss.
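One detail worth noting: because Python's built-in sum iterates over the rows of the (samples, 1) cost array, agent.fitness ends up as a length-1 numpy array rather than a plain float, which is why __str__ prints self.fitness[0]. A quick check with hypothetical network outputs:
yhat = np.array([[0.2], [0.9], [0.8], [0.1]])  # example predictions (made-up numbers)
y = np.array([[0, 1, 1, 0]]).T                 # targets, shape (4, 1)
cost = (yhat - y) ** 2
print(sum(cost))     # roughly array([0.1]) -- a length-1 array, not a scalar
print(sum(cost)[0])  # roughly 0.1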
Step 5| Selection:
        def selection(agents):
            agents = sorted(agents, key=lambda agent: agent.fitness, reverse=False)
            print('\n'.join(map(str, agents)))
            agents = agents[:int(0.2 * len(agents))]
            return agents
This part of the program is the selection algorithm. It sorts the agents in ascending order of loss, then discards every agent that is not in the top fifth of the list.
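To see the effect of this sort-and-truncate step in isolation, here is the same logic applied to a plain list of loss values (a toy illustration, not part of the program):
losses = [0.9, 0.05, 0.4, 0.01, 0.7, 0.3, 0.2, 0.6, 0.8, 0.5]
survivors = sorted(losses)[:int(0.2 * len(losses))]
print(survivors)  # [0.01, 0.05] -- only the fittest fifth of the population survives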
Step 6| Crossover:
        def crossover(agents, network, pop_size):
            offspring = []
            for _ in range((pop_size - len(agents)) // 2):
                parent1 = random.choice(agents)
                parent2 = random.choice(agents)
                child1 = Agent(network)
                child2 = Agent(network)
                shapes = [a.shape for a in parent1.neural_network.weights]
                genes1 = np.concatenate([a.flatten() for a in parent1.neural_network.weights])
                genes2 = np.concatenate([a.flatten() for a in parent2.neural_network.weights])
                split = random.randint(0, len(genes1)-1)
                # Single-point crossover: each child takes the opposite segments of the two parents
                child1_genes = np.array(genes1[0:split].tolist() + genes2[split:].tolist())
                child2_genes = np.array(genes2[0:split].tolist() + genes1[split:].tolist())
                # Gene drive: positions in a parent's gene drive are forced onto both children
                for gene in parent1.gene_drive:
                    child1_genes[gene] = genes1[gene]
                    child2_genes[gene] = genes1[gene]
                for gene in parent2.gene_drive:
                    child1_genes[gene] = genes2[gene]
                    child2_genes[gene] = genes2[gene]
                child1.neural_network.weights = unflatten(child1_genes, shapes)
                child2.neural_network.weights = unflatten(child2_genes, shapes)
                offspring.append(child1)
                offspring.append(child2)
            agents.extend(offspring)
            return agents
Two random parents from the top 20% of the population are chosen, and they breed. How is this done?
- Their weights are flattened into a single gene sequence.
- A random crossover point is found. This point is where the genetic information of one parent ends and the genetic information of the other parent begins.
- Two offspring are created and added to the list of agents. The children differ from each other because each one takes the opposite segments of the two parents.
This hopefully allows the good traits of the fittest parents to be passed on to their offspring.
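One caveat: the crossover code calls an unflatten helper that does not appear anywhere in the listings here. A minimal version, mirroring the reshape loop used in the mutation step below, would look like this (place it next to the other helper functions inside execute):
        def unflatten(flattened, shapes):
            # Rebuild the list of weight matrices from a flat gene vector
            newarray = []
            index = 0
            for shape in shapes:
                size = np.prod(shape)
                newarray.append(flattened[index : index + size].reshape(shape))
                index += size
            return newarray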
Step 7| Mutation:
        def mutation(agents):
            for agent in agents:
                if random.uniform(0.0, 1.0) <= 0.1:
                    weights = agent.neural_network.weights
                    shapes = [a.shape for a in weights]
                    flattened = np.concatenate([a.flatten() for a in weights])
                    randint = random.randint(0, len(flattened)-1)
                    flattened[randint] = np.random.randn()
                    newarray = []
                    indeweights = 0
                    for shape in shapes:
                        size = np.prod(shape)
                        newarray.append(flattened[indeweights : indeweights + size].reshape(shape))
                        indeweights += size
                    agent.neural_network.weights = newarray
            return agents
For every agent, there is a 10% chance that a mutation will occur. In this case, mutation refers to a certain weight value being replaced with a random float value. This is done by flattening the weights, finding a random weight to change, and then finally reshaping the weights to be reinserted into the agent.
Step 8| Gene Drive:
        def gene_drive(agents):
            for agent in agents:
                if random.uniform(0.0, 1.0) <= 0.1:
                    weights = agent.neural_network.weights
                    shapes = [a.shape for a in weights]
                    flattened = np.concatenate([a.flatten() for a in weights])
                    target_gene = random.randint(0, len(flattened)-1)
                    if not(target_gene in agent.gene_drive):
                        agent.gene_drive.append(target_gene)
                    newarray = []
                    indeweights = 0
                    for shape in shapes:
                        size = np.prod(shape)
                        newarray.append(flattened[indeweights : indeweights + size].reshape(shape))
                        indeweights += size
                    agent.neural_network.weights = newarray
            return agents
Although this script only inserts gene drives into the agents, if you look carefully at the code, you can already see allusions to how the gene drive influences other parts of the program. For example, the crossover function shows exactly how the gene drive is applied during inheritance.
for gene in parent1.gene_drive:
    child1_genes[gene] = genes1[gene]
    child2_genes[gene] = genes1[gene]
for gene in parent2.gene_drive:
    child1_genes[gene] = genes2[gene]
    child2_genes[gene] = genes2[gene]
It is that simple! If a gene's position exists in a parent's gene drive, both children overwrite their own value at that position with that parent's gene.
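Here is a tiny worked example with hypothetical numbers of my own: suppose parent1 carries a gene drive at index 2. Even though the crossover split would normally hand that position to parent2, the child's value at index 2 is forced back to parent1's gene.
genes1 = np.array([0.5, -1.2, 0.9, 0.3])  # parent1's flattened weights (made-up values)
genes2 = np.array([1.1, 0.4, -0.7, 2.0])  # parent2's flattened weights (made-up values)
split = 1
child1_genes = np.concatenate([genes1[:split], genes2[split:]])  # [0.5, 0.4, -0.7, 2.0]
for gene in [2]:                       # parent1.gene_drive = [2]
    child1_genes[gene] = genes1[gene]  # index 2 overwritten with parent1's 0.9
print(child1_genes)                    # [ 0.5  0.4  0.9  2. ]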
Step 9| Execution:
        # The population is created once, then evolved generation after generation
        agents = generate_agents(pop_size, network)
        for i in range(generations):
            print('Generation', str(i), ':')
            agents = fitness(agents, X, y)
            agents = selection(agents)
            agents = gene_drive(agents)
            agents = crossover(agents, network, pop_size)
            agents = mutation(agents)
            agents = fitness(agents, X, y)
            if any(agent.fitness < threshold for agent in agents):
                print('Threshold met at generation ' + str(i) + ' !')
                break
            if i % 100:
                clear_output()
        return agents[0]
Paste this final bit of code into the execute function, and the genetic algorithm will run when called.
X = np.array([[0,0,1], [1,1,1], [1, 0, 1], [0, 1, 1]])
y = np.array([[0,1,1,0]]).T
network = [[3,20,sigmoid],[None,1,sigmoid]]
ga = genetic_algorithm
agent = ga.execute(10000,1000,0.1,X,y,network)
weights = agent.neural_network.weights
agent.fitness
agent.neural_network.propagate(X)
The task for the neural network is simple. The network has the structure (3, 20, 1), with a sigmoid activation function on each layer. The population contains 10,000 agents, and the cycle of selection, crossover, and mutation is repeated for up to 1,000 generations, or until an agent's loss drops below the threshold of 0.1.
Conclusion:
The gene drive genetic algorithm actually performed worse than the normal genetic algorithm. My hypothesis is that gene drives were being applied to agents from early generations, which are, by definition, not as fit as agents from later generations.
I then decided to implement a threshold so that only agents with a low loss value can have a gene drive randomly attributed to a gene in their genome.
New Code:
        def gene_drive(agents):
            for agent in agents:
                # Only agents that have already reached a low loss receive a gene drive
                if agent.fitness <= 0.1:
                    weights = agent.neural_network.weights
                    shapes = [a.shape for a in weights]
                    flattened = np.concatenate([a.flatten() for a in weights])
                    target_gene = random.randint(0, len(flattened)-1)
                    if not(target_gene in agent.gene_drive):
                        agent.gene_drive.append(target_gene)
                    newarray = []
                    indeweights = 0
                    for shape in shapes:
                        size = np.prod(shape)
                        newarray.append(flattened[indeweights : indeweights + size].reshape(shape))
                        indeweights += size
                    agent.neural_network.weights = newarray
                    break
            return agents
This worked much better and yielded a better result than the normal genetic algorithm.