Machine Learning and Chess

How I coded my own python chess engine

Ethan Mai

Published in

Towards Data Science

4 min readJan 14, 2021

Source Code: Github | Play against A.I

Quick introduction:

In late 2020, Netflix released the Queen’s Gambit, a TV show taking place in the 1950’s where we follow the journey of a young women thriving to become the best chess player.

However, unlike the protagonist Beth Harmon, chess isn’t really my strongest suit. So I did the next best thing by building my own python chess engine to compensate my poor chess skills.

Episode I: Strategies

When it comes to building any board game engine, there are multiple strategies:

Reinforcement Learning
Minimax
ML model trained on human knowledge
…

I opted for an hybrid of the second and third strategies: As a Data scientist, I am fond of working with large datasets and machine learning models. Also, despite being a little slower, Minimax is a quite reliable algorithm and does a good job at capturing enemy pieces. Reinforcement learning is simply too complex.

My strategy is simple:

First of all, using the machine learning model, the engine will dismiss 50% of the possible moves given a board. It does so by finding the probability of a move being a ‘good move’. All the moves are sorted according to this probability and only the 50th percentile remains.
Secondly, the engine will perform the Minimax algorithm on the remaining moves. Depending on the complexity of the board, the search tree will go to a certain depth.

Episode II: Data

In order to make accurate predictions, we need a lot of data. For this project, I will be using a dataset that I found on Kaggle. This dataset is a collection of thousands of games played by chess grand masters in the .pgn format (portable game notation).

The next step is to extract all the moves from all the games and label them as good or bad.

In my opinion, a ‘good move’ is a move that the winner played at some point during the game. A ‘bad move’ is a legal move that the winner chose not to play.

This python script got the job done leaving me with a brand new .csv dataset.

You can find my dataset on Kaggle and Github:

Chess Moves

A collection of 1.6 million chess moves labeled as good/bad

www.kaggle.com

iAmEthanMai/chess-games-dataset

chess moves dataset 🔢

github.com

We then need to import the dataframe into our Jupyter Notebook. We can simply run the following command in the first code cell:

In[1]: import pandas as pd
       df = pd.read_csv('data.csv')
       print(df.shape)Out:   (1600000,193)

The dataframe’s columns are the following:

The first 64 columns are a representation of the 64 squares of the current board.
Columns from 65 to 128 represent the starting position of a move: 1.0 if a piece was moved from that square, 0.0 otherwise.
Columns from 130 to 192 represent the ending position of a move: 1.0 if a piece was moved to that square, 0.0 otherwise.
The last column is the label: True if it was a good move, False otherwise.

In[2]: df.head()

df.head()

Episode III: The model

In this project I will be using the tensorflow linear classifier. It is a really common model as it is well documented and very performant.

In[3]: linear_est = tf.estimator.LinearClassifier(feature_columns)

Episode IV: The training

To train the estimator we can simply run this command:

In[4]: linear_est.train(input_function)

For more details on the machine learning portion of this project, make sure to check out my Colab Notebook where you can find all the source code.

Episode V: Minimax

The Minimax algolrithm will explore the recursive tree of legal moves up to a certain depth and evaluate the leaves using an evaluation function.

We can then return the largest or smallest child’s value to the parent node depending on the turn. This allows us to minimize or maximize the outcome’s value at each level of the tree. Learn more about Minimax here.

The evaluation function looks for two criteria:

The type of a piece. For instance, a knight is more valuable than a pawn.
The position of a piece. For instance, a knight is strong in the middle of the board, whereas the king would be vulnerable.

evaluation function algorithm

Episode VI: Game handling

In order to get the legal moves and display the board, I used the python-chess library. You can install it with pip by running the following command.

!pip3 install python-chess

This library makes things really conviniant as it takes away the troubles of hardcoding the rules and subtleties of chess.

Episode VII: Play

The only thing remaining is to play the game. If you want to give it a try, run all the code cells and you should be good to go… Good luck!

Play against my A.I

python-chess-engine.ipynb

Conclusion:

Thank you for sticking around, I hope that you’ve enjoyed following this project.

Feel free to connect with me on Github, Kaggle and Linkedin.