
This is the first article in a series on teaching an AI to play Super Mario Land for the Game Boy. Here I will show you how to develop a Genetic Algorithm agent that plays the game, using Python. (The full code is on my GitHub; see the link at the end of the article.)
Super Mario Land is a platform game created by Nintendo. It tells the story of Mario, a brave plumber who travels to Sarasaland to rescue Princess Daisy from the villain Tatanga.
The GIF below shows the Genetic Algorithm mastering the first part of World 1–1 of Super Mario Land for the Game Boy. In the next sections, I will explain how to program this algorithm yourself.

Environment
The first step of our journey is the integration between Python and a Game Boy emulator. For this one, I found a nice Python library called PyBoy.
https://github.com/Baekalfen/PyBoy
Following the PyBoy instructions, I was able to integrate the Super Mario Land game into Python and develop all the control interactions. To create a first working version of the environment, you need an "init" function that defines all the initial variables, such as Mario's lives, time left, world level, and more. After that, you need a function to reset the game when it finishes and a "step" function that lets the AI agent interact with the game and perform actions in it.
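As a rough illustration of that interface, here is a minimal sketch of an environment with the same three pieces. It is a toy stand-in and not the PyBoy integration: `ToyMarioEnv`, its action names, and the obstacle positions are all hypothetical, and a real version would forward each action to the emulator instead of updating a counter.

```python
class ToyMarioEnv:
    """Toy stand-in for an emulator-backed environment (assumption: not PyBoy)."""

    ACTIONS = ("noop", "right", "jump", "right_jump")

    def __init__(self, level_length=50, time_limit=100, obstacles=(10, 20, 30)):
        # Static settings for the hypothetical level.
        self.level_length = level_length
        self.time_limit = time_limit
        self.obstacles = set(obstacles)
        self.reset()

    def reset(self):
        """Restart the episode and return the initial state."""
        self.position = 0
        self.lives = 1
        self.time_left = self.time_limit
        self.done = False
        return self._state()

    def _state(self):
        return {
            "position": self.position,
            "lives": self.lives,
            "time_left": self.time_left,
        }

    def step(self, action):
        """Apply one action and return (state, done)."""
        if action not in self.ACTIONS:
            raise ValueError(f"unknown action: {action}")
        if self.done:
            raise RuntimeError("episode finished; call reset() first")
        self.time_left -= 1
        if action in ("right", "right_jump"):
            target = self.position + 1
            if target in self.obstacles and action != "right_jump":
                self.lives -= 1  # walked into an obstacle without jumping
            else:
                self.position = target
        # The episode ends on death, timeout, or reaching the end of the level.
        self.done = (
            self.lives <= 0
            or self.time_left <= 0
            or self.position >= self.level_length
        )
        return self._state(), self.done


# Usage: an agent that always runs and jumps clears this toy level.
env = ToyMarioEnv()
state = env.reset()
done = False
while not done:
    state, done = env.step("right_jump")
print(state)
```

Keeping the agent-facing surface down to `reset` and `step` means the same agent code could, in principle, drive either this toy level or the real emulator.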
With an environment in place that allows any Artificial Intelligence program to interact with the game and actually play it, let’s choose an algorithm and develop our AI agent.
Genetic Approach
Genetic models are based on algorithms that use the concept of biological evolution to optimize their actions.
"In biology, evolution is the change in the characteristics of a species over several generations and relies on the process of natural selection." [1] Your Genome.org (source)

"The theory of biologic evolution is based on the idea that all species are related and gradually change over time. The theory states that the genetic variation in a population affects the physical characteristics (phenotype) of species and some of these characteristics may give the individual an advantage over the others." [1] Your Genome.org (source).
These physical advantages can then be passed from parents to future generations. Applied to AI, this theory produces self-learning agents that evolve with each generation and explore the environment to maximize results.
The idea is simple: for each generation we create some species, perform crossover and mutation to develop their genes, and then select the best species at the end.
Generation
A generation is a set of species, each with particular characteristics that are inherited from its parents through crossover and altered by mutation.
Applying this concept to AI, each species in the first generation is born with a set of moves. The best species, ranked by fitness, are then selected to continue to the next generation. That next generation undergoes crossover, which creates children based on the previous generation, and mutation, which adds variability. The process repeats until the last generation.
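The loop described above can be sketched in a few lines. This is an illustration under stated assumptions and not the code from the repository: `toy_fitness` is a placeholder that simply counts "right" moves instead of playing the game, the action names and default parameters are hypothetical, and keeping the selected parents unchanged in the next generation is a design choice made here for simplicity.

```python
import random

ACTIONS = ["left", "right", "jump", "noop"]  # hypothetical action set


def toy_fitness(moves):
    """Placeholder fitness: counts 'right' moves instead of playing the game."""
    return sum(1 for move in moves if move == "right")


def evolve(n_generations=30, population_size=5, n_moves=40,
           n_parents=2, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # First generation: every species starts with random moves.
    population = [
        [rng.choice(ACTIONS) for _ in range(n_moves)]
        for _ in range(population_size)
    ]
    best_per_generation = []
    for _ in range(n_generations):
        # Selection: rank by fitness and keep the best as parents.
        ranked = sorted(population, key=toy_fitness, reverse=True)
        best_per_generation.append(toy_fitness(ranked[0]))
        parents = ranked[:n_parents]
        children = []
        while len(children) < population_size - n_parents:
            p1, p2 = rng.sample(parents, 2)
            half = n_moves // 2
            child = p1[:half] + p2[half:]  # crossover
            child = [
                rng.choice(ACTIONS) if rng.random() < mutation_rate else move
                for move in child
            ]  # mutation
            children.append(child)
        # Parents survive unchanged, so the best fitness never decreases.
        population = parents + children
    return population, best_per_generation


population, history = evolve()
print(history[0], history[-1])
```

In the real agent, the fitness of a species would come from running its moves through the game environment.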
Selection
The selection part of the algorithm is based on Charles Darwin’s theory of evolution by natural selection.
"Individuals with characteristics best suited to their environment are more likely to survive, finding food, avoiding predators, and resisting disease. These individuals are more likely to reproduce and pass their genes on to their children. Individuals that are poorly adapted to their environment are less likely to survive and reproduce. Therefore their genes are less likely to be passed on to the next generation. As a consequence those individuals most suited to their environment survive and, given enough time, the species will gradually evolve." [1] Your Genome.org (source)
Applying this concept to AI, when we advance to the next generation, we select only the individuals with the best fitness to "survive" and pass their "genes" on to future generations.
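A minimal sketch of this selection step, assuming each individual already has a fitness score. The function name `select_best` and the example values are hypothetical.

```python
def select_best(population, fitness_scores, n_survivors):
    """Return the n_survivors individuals with the highest fitness."""
    if len(population) != len(fitness_scores):
        raise ValueError("population and fitness_scores must have the same length")
    # Sort indices by fitness, best first; ties keep their original order.
    order = sorted(
        range(len(population)),
        key=lambda i: fitness_scores[i],
        reverse=True,
    )
    return [population[i] for i in order[:n_survivors]]


survivors = select_best(["a", "b", "c", "d"], [3, 10, 7, 1], n_survivors=2)
print(survivors)
```

Here the individuals are just labels; in the Mario agent each one would be a sequence of moves.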
Crossover
In the reproduction cycle, when parents pass their genes to the next generation, those genes undergo crossover. The crossover process takes half of the genes from Parent 1 and the other half from Parent 2 to generate the genes of a child in the next generation.
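As a sketch, if the genes are a list of moves, this half-and-half crossover is a pair of slices. The function name and the move names are hypothetical, and splitting exactly at the midpoint is one reading of "half from each parent".

```python
def crossover(parent1, parent2):
    """Build a child from the first half of parent1 and the second half of parent2."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same number of genes")
    half = len(parent1) // 2
    return parent1[:half] + parent2[half:]


child = crossover(["right"] * 4, ["jump"] * 4)
print(child)
```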
Mutation
Mutation is the process in which part of the genes changes at random.
These changes can be minor and leave the movements unaffected, or they can lead to whole new characteristics and completely change the species’ behavior. For AI, we perform a mutation by randomly changing some of the agent's actions when its moves are generated.
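One way to sketch this is to replace each move with a random one with a small probability. The action names, the `mutation_rate` parameter, and its default value are assumptions made for illustration.

```python
import random

ACTIONS = ["left", "right", "jump", "noop"]  # hypothetical action set


def mutate(moves, mutation_rate=0.1, rng=None):
    """Return a copy of moves where each gene is replaced at random
    with probability mutation_rate."""
    rng = rng or random
    return [
        rng.choice(ACTIONS) if rng.random() < mutation_rate else move
        for move in moves
    ]


original = ["right"] * 20
mutated = mutate(original, mutation_rate=0.2, rng=random.Random(42))
changed = sum(1 for a, b in zip(original, mutated) if a != b)
print(f"{changed} of {len(original)} moves changed")
```

A replaced move can randomly land on the same action again, so the number of visible changes is usually a little lower than the mutation rate suggests.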
Fitness
One of the most important variables for a Genetic Algorithm is fitness.
Fitness is the variable that states what we want to maximize for our environment. A minor change in the fitness formula can lead to a huge change in agent behavior.
For Super Mario Land, we want Mario to walk forward and kill enemies to finish the stage. So we assign a positive reward when Mario moves forward or kills an enemy, and we apply a penalty for each second that passes to encourage Mario to move forward quickly.
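A fitness function along these lines could look like the sketch below. The weights (`distance_weight`, `kill_bonus`, `time_penalty`) and the input names are hypothetical; the article does not give the exact formula, so treat this as one plausible shape and not the formula actually used.

```python
def fitness(distance, enemies_killed, seconds_elapsed,
            distance_weight=1.0, kill_bonus=50.0, time_penalty=2.0):
    """Reward forward progress and kills, and penalize elapsed time.

    All weights are illustrative assumptions.
    """
    return (
        distance_weight * distance
        + kill_bonus * enemies_killed
        - time_penalty * seconds_elapsed
    )


# Same progress and kills, but the faster run scores higher.
fast = fitness(distance=100, enemies_killed=2, seconds_elapsed=10)
slow = fitness(distance=100, enemies_killed=2, seconds_elapsed=40)
print(fast, slow)
```

Changing the relative size of these weights changes what the agent prefers. For example, a very large kill bonus could make it chase enemies instead of finishing the stage.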
Experiments and Results
After programming the simulation environment and implementing the Genetic Algorithm, we can start to run simulations and evaluate the model's performance.
For research purposes, I ran a model with 30 generations and 5 species each to play Super Mario Land in stage 1–1. Here are the results for the first Generation:

Mario was able to walk forward but failed at the first obstacle, a simple Goomba. Let’s see if the agent can perform better after some genetic evolution …
After 30 generations, we see a huge improvement! Among the moves the AI agent discovered were killing some Goombas and jumping over small pipes and high blocks. It is fantastic to see what AI can do with evolutionary search.

We can follow the evolution in the benchmark chart below. The first part of the chart shows the average fitness (red line) and maximum fitness (blue line) for each generation, and we can clearly see both increasing as the generations evolve.
The second part shows the fitness for each interaction. Here we can see the variation inside each generation, which is part of the exploration journey, and the rise in maximum fitness from one generation to the next.

If you want to implement this solution or learn more about genetic algorithms, you can find the full Python code on my GitHub repo in the link below:
Thank you so much for reading! If you have any questions or suggestions, please contact me via LinkedIn: https://www.linkedin.com/in/octavio-b-santiago/
References
[1] Your Genome, United States, accessed July 2021, <https://www.yourgenome.org/facts/what-is-evolution>
More Reading
Hacking Chess with Decision Making Deep Reinforcement Learning
Reinforcement Learning – Teaching the Machine to Gamble with Q-learning