AI Teaches itself to play a game

Built a simple game using Pygame & applied NEAT (NeuroEvolution of Augmenting Topologies) algorithm to train AI.

Sanjay.M
Towards Data Science

This post is about teaching an AI to play a simple game that I built with the pygame library. The aim of the game is to keep the ball rolling through the gap between the pipes; if the ball hits any pipe, we lose. Each time the ball successfully passes through a gap, the score increases by 1.

The GIF above shows how the neural network improves generation after generation. Progress can be followed in the game window through the three values below.

  • Score: Number of points scored/number of pipes successfully crossed.
  • Gens: Number of generations/mutations the algorithm is taking to learn.
  • Balls: Indicates the number of balls alive in the current generation.

The complete code to build the game interface and the related files are in my Git repo.

Now let's come to the training part using NEAT (NeuroEvolution of Augmenting Topologies). NEAT is an evolutionary algorithm that creates artificial neural networks; a detailed description of the algorithm is here. The basic idea is that instead of relying on a fixed structure for the network, NEAT lets the structure evolve through a genetic algorithm. So it builds a network suited to the task by itself, adding nodes and connections as and when required.

Once the game is ready, we feed the inputs/parameters below to the NEAT algorithm so that it can build a neural network that accomplishes the task, in our case rolling the ball through the gaps.

We pass all these parameters through a Configuration file.

Inputs for the Network: Telling the NEAT algorithm what inputs the network can expect and what output we need.

  • Num_Hidden: The number of hidden nodes each network starts with. I have set it to 0 as it is a simple game; we can try another number if required.
  • Num_Inputs: I am passing 3 inputs to the network: the position of the ball on the Y axis, the distance between the ball and the top pipe, and the distance between the ball and the bottom pipe.
  • Num_Outputs: It is set to 1, i.e. a single node in the output. Based on the value of the output node we decide whether to jump or not.
# we used tanh activation function so result will be between -1 & 1. if over 0.5 jump
if output[0] > 0.5:
    ball.jump()
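
To make the jump rule concrete, here is a minimal sketch of what a network with 3 inputs, 0 hidden nodes and 1 tanh output computes. The function name `decide_jump`, the weights, the bias and the input values are all made up for illustration, and a plain `math.tanh` is assumed; the NEAT library's own activation may differ in detail.

```python
import math


def decide_jump(inputs, weights, bias, threshold=0.5):
    """Return True when the single tanh output node exceeds the threshold."""
    # Weighted sum of the three inputs plus the bias (no hidden nodes).
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    output = math.tanh(z)  # always strictly between -1 and 1
    return output > threshold


# Hypothetical, hand-picked values purely for illustration.
inputs = (0.4, 0.1, 0.6)    # ball y, distance to top pipe, distance to bottom pipe
weights = (1.5, -2.0, 1.0)  # in training these are found by evolution, not by hand
print(decide_jump(inputs, weights, bias=0.2))
```

During training we never set the weights ourselves; the sketch only shows how three numbers from the game turn into a jump / no-jump decision.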

Activation Function:

  • Activation default: Which activation function to use to determine the output value, in this case, I used Tanh. We can use other activation functions like sigmoid/relu etc.
  • Activation_mutate_rate: Probability of picking the other activation functions during mutation.
  • Activation options: Other Activation functions to use randomly during the mutation/breeding.
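
As a rough sketch of how the mutate rate and the options list work together: with the given probability, a node's activation function is swapped for one drawn at random from the options. The function `mutate_activation` below is a hypothetical helper written for this post, not the library's code.

```python
import random


def mutate_activation(current, options, mutate_rate, rng=random):
    """With probability mutate_rate, replace the activation with a random option."""
    if rng.random() < mutate_rate:
        return rng.choice(options)
    return current


# With a rate of 0.0 the default activation is never replaced.
print(mutate_activation("tanh", ["tanh", "sigmoid", "relu"], mutate_rate=0.0))
```

With a rate of 0 the network always keeps the default activation, which is a reasonable starting point for a game this simple.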

Fitness Parameters: A way to evaluate how good a network/ball is, for example by the distance it covers.

  • Fitness_criterion: Can be min, max or mean. It tells NEAT how to summarise the fitness scores of a generation when checking the threshold below. With max, the ball with the maximum fitness score is the one that counts.
  • Fitness_threshold: Training stops once the fitness computed with the criterion above reaches this value.
  • Pop_size: How many balls there are in each generation. It is an arbitrary value and we can play around with it. We start Generation Zero with 20 members/balls, test them, select the best, breed/mutate them to create the next generation of 20 balls, and continue the process.
  • Reset on extinction: If this evaluates to True, a new random population is created when all species (groups of similar networks) simultaneously become extinct due to stagnation.
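
To show how the parameters above might look in a configuration file, here is a small fragment, parsed with Python's standard `configparser` so that it can be checked without the game. The section names `[NEAT]` and `[DefaultGenome]` follow the neat-python convention. The values for `pop_size`, `num_inputs`, `num_hidden`, `num_outputs`, `activation_default` and `fitness_criterion` come from this post; the remaining values are placeholders I picked for illustration, and a real config file needs more settings than shown here.

```python
import configparser

CONFIG_TEXT = """
[NEAT]
fitness_criterion   = max
# placeholder value, not the one used in the actual game
fitness_threshold   = 100
pop_size            = 20
# placeholder value
reset_on_extinction = False

[DefaultGenome]
num_inputs             = 3
num_hidden             = 0
num_outputs            = 1
activation_default     = tanh
# placeholder values: never switch away from tanh
activation_mutate_rate = 0.0
activation_options     = tanh
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

pop_size = parser.getint("NEAT", "pop_size")
activation = parser.get("DefaultGenome", "activation_default")
print(f"{pop_size} balls per generation, output activation: {activation}")
```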

In NEAT, the members of the population are called genomes, and similar genomes are grouped into species. Once the properties of the genomes are set as above, NEAT starts by creating a population of balls, which we set to 20. Each ball is controlled by its own neural network, and each network starts with its own random weights and biases. We then test each of the networks and see how well it does by evaluating its fitness. Fitness depends on the task/game we play; in this case it is how far the ball progresses. For every pipe a ball crosses, we add 1 to its fitness score.
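
The fitness bookkeeping for one generation can be sketched as below. The `Ball` class and the `update_generation` function are simplified stand-ins invented for this example (the real game objects also carry position, velocity and so on), and the pass/collision flags are assumed to come from the game loop.

```python
from dataclasses import dataclass


@dataclass
class Ball:
    fitness: float = 0.0
    alive: bool = True


def update_generation(balls, passed_pipe, collided):
    """Apply one game step: reward balls that passed a pipe, retire balls that hit one.

    Returns the number of balls still alive.
    """
    for i, ball in enumerate(balls):
        if not ball.alive:
            continue
        if collided[i]:
            ball.alive = False   # this ball is out for the rest of the generation
        elif passed_pipe[i]:
            ball.fitness += 1    # one point per pipe crossed
    return sum(ball.alive for ball in balls)


balls = [Ball() for _ in range(3)]
alive = update_generation(balls, passed_pipe=[True, True, False],
                          collided=[False, True, False])
print(alive, [ball.fitness for ball in balls])
```

The generation ends when the number of balls alive drops to zero, which is also the "Balls" value shown in the game window.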

At the end of the first generation, when all the balls are out, NEAT checks which of them performed best. It then picks the best balls/networks of that generation (two or three with the highest scores) and mutates and breeds them to create a new population. A detailed explanation of how NEAT actually breeds and mutates these neural networks is in this paper.

Now we have the offspring of the best balls from the last generation, and we hope that this next generation performs better than the previous one. What NEAT actually does is update the weights and randomly add/remove nodes and connections until it finds an architecture that works well for the problem we are solving. It starts with a simple network and makes it more complex only if required. We continue the process until we are satisfied with the performance.
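
The select, breed and mutate step can be illustrated with a deliberately simplified sketch. This is not the NEAT algorithm itself: it evolves fixed-length weight lists only, with no speciation and no adding or removing of nodes and connections. The function name `next_generation` and its parameters `n_parents` and `sigma` are assumptions made for this example.

```python
import random


def next_generation(population, fitnesses, pop_size, n_parents=3, sigma=0.5, rng=random):
    """Keep the fittest weight lists and fill a new population with mutated children."""
    # Rank genomes by fitness, best first.
    ranked = sorted(zip(fitnesses, population), key=lambda pair: pair[0], reverse=True)
    parents = [genome for _, genome in ranked[:n_parents]]

    children = []
    while len(children) < pop_size:
        mother, father = rng.choice(parents), rng.choice(parents)
        # Breed: take each weight from one of the two parents at random.
        child = [rng.choice(pair) for pair in zip(mother, father)]
        # Mutate: nudge every weight with a little Gaussian noise.
        child = [w + rng.gauss(0.0, sigma) for w in child]
        children.append(child)
    return children


rng = random.Random(0)
# 20 balls, each with 3 input weights and 1 bias, randomly initialised.
population = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(20)]
fitnesses = [rng.randint(0, 5) for _ in population]  # made-up scores
new_population = next_generation(population, fitnesses, pop_size=20, rng=rng)
print(len(new_population), len(new_population[0]))
```

Repeating this step generation after generation is what slowly pushes the population towards networks that get through more pipes.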

After a few generations of slowly learning and getting better, the AI finally picks up the pattern of moving ahead without hitting the pipes. In this case, with the current set of parameters, it starts performing well as early as the 4th generation and reaches a point where it never fails, as we can see in the GIF above.

Once the score reaches the fitness threshold we set, we can exit the training and save the neural network associated with that ball using pickle. We can then create a game with just one ball driven by the best network we saved. It now plays seamlessly, never hitting a pipe.
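
Saving and reloading the winner with pickle can look like the sketch below. The helper names `save_winner` and `load_winner` and the file name `best_ball.pkl` are made up for this example, and a plain dictionary stands in for the trained network object.

```python
import os
import pickle
import tempfile


def save_winner(winner, path):
    """Write the best network to disk so it can be reused without retraining."""
    with open(path, "wb") as f:
        pickle.dump(winner, f)


def load_winner(path):
    """Read a previously saved network back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for the trained network; in the game this would be the winning network object.
winner = {"weights": [1.5, -2.0, 1.0], "bias": 0.2}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "best_ball.pkl")
    save_winner(winner, path)
    restored = load_winner(path)

print(restored == winner)
```

Only load pickle files you created yourself, since unpickling a file from an untrusted source can run arbitrary code.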

Thanks for reading, happy learning. :-)
