Chess2Vec — Map of Chess Moves

Word Vectors for Chess Moves

Published in

Towards Data Science

7 min readDec 14, 2020

Similarity map of chess moves / Image by Author

In this article, I want to analyze which moves in a game of chess are close, in the sense that they often occur in similar situations in games. If two moves often occur after or before the same moves, then these moves are similar in a certain sense.

For example, which move is close to the opening with the queenside pawn “d4”?

Is it possible to recognize a general structure and to represent it visually?

Data source

The source for my analyses are files of games played on the internet chess server Lichess. At https://database.lichess.org/ you can find all the games played on this server collected by month for download in Portable Game Notation (PGN) format. These are simple text files with a defined structure, which on the one hand can be read and written by almost any chess software, and on the other hand, can be easily manipulated as text files. There are millions of games per month, which allows me to do my analyses.

The moves in the files are given in Standard Algebraic Notation. Here an example:

[Event “Rated Classical game”]
[Site “https://lichess.org/mtajwKao"]
[White “eisaaaa”]
[Black “HAMID449”]
[Result “1–0”]
[UTCDate “2016.06.30”]
[UTCTime “22:00:01”]
[WhiteElo “1901”]
[BlackElo “1896”]
[WhiteRatingDiff “+11”]
[BlackRatingDiff “-11”]
[ECO “D10”]
[Opening “Slav Defense”]
[TimeControl “300+5”]
[Termination “Time forfeit”]

d4 d5 2. c4 c6 3. e3 a6 4. Nf3 e5 5. cxd5 e4 6. Ne5 cxd5 7. Qa4+ Bd7 8. Nxd7 Nxd7 9. Nc3 Nf6 10. Qb3 Be7 11. Nxd5 Qa5+ 12. Nc3 O-O 13. Be2 b5 14. O-O Rad8 15. Bd2 Qc7 16. Rac1 Qd6 17. Qc2 Qe6 18. Nb1 Bd6 19. a3 Nb6 20. Qc6 Nfd5 21. Ba5 Rc8 22. Qb7 Qh6 23. h3 Nc4 24. Bxc4 bxc4 25. Qxd5 Rfd8 26. Qxe4 Rd7 27. Bc3 Re7 28. Qf3 Re6 29. Nd2 Rf6 30. Qg4 Re8 31. Ne4 Rg6 32. Qd7 Rf8 33. Nxd6 Rxd6 34. Qc7 Rg6 35. Qh2 Re8 36. d5 f6 37. d6 Rd8 38. Rfd1 1–0

Pre-processing

To be able to use the data for my purposes, I did some pre-processing. With the command line program “pgnextract” all comments and move numbers were removed and each game was written into one line of a text file. With the command:

pgn-extract.exe lichess_db_standard_rated_2016–06.pgn -o2016–06.pgn -bl10 -w1000 — noresults — notags — nomovenumbers — nocomments — quiet

all games from July 2016 are pre-processed.

An example of such a game per line:

End position of the game / Image by Author

The games can be completely reconstructed from the lines. In the example, the Python library “Chess” was used to execute the moves and display the board.

Now I read the text file after pre-processing into a data frame with one game per line.

In this example, I use more than 5.8 million games. By splitting at the spaces, I turn the strings into lists of the individual moves.

0    [f4, d6, Nf3, Nc6, b3, Nf6, Bb2, e6, e3, Be7, ...
1    [e4, e5, Nf3, Nc6, d4, exd4, Nxd4, Nxd4, Qxd4,...
2    [e4, c6, Bc4, d5, exd5, cxd5, Bb3, Nc6, d4, Nf...
3    [d4, c5, c3, d5, e3, Nc6, Nf3, e6, Be2, Nf6, O...
4    [e4, e5, Nf3, d6, Ke2, Nf6, Ke3, Nc6, Qe2, Ng4...
Name: 0, dtype: object

Calculation of the number of different moves that occur in the data results:

Now I calculate the frequency with which each move occurs and select the most common ones for later presentation.

['O-O', 'Nf3', 'e4', 'Nf6', 'd4', 'e5', 'd5', 'Nc3', 'Nc6', 'c5', 'e6', 'c4', 'd6', 'f4', 'b5', 'h3', 'g6', 'h6', 'a6', 'f5', 'c3', 'Be7', 'c6', 'b4', 'Bd3', 'h4', 'a4', 'a5', 'a3', 'h5', 'g4', 'g5', 'Ne5', 'Be3', 'Bg5', 'g3', 'exd5', 'Bc4', 'd3', 'Ne4', 'e3', 'b6', 'Be2', 'Re1', 'cxd4', 'b3', 'f6', 'Nd5', 'Bg4', 'Nxe5', 'Nxd4', 'f3', 'Bd6', 'dxe5', 'Re8', 'Bg7', 'Bd7', 'Nxe4', 'O-O-O', 'Nd7', 'Nd4', 'Be6', 'Qe2', 'cxd5', 'Bf4', 'Bb7', 'Kg7', 'Nxd5', 'Kg2', 'Qd2', 'Bd2', 'Kf7', 'Qe7', 'Bc5', 'Qc7', 'Kf2', 'Nd2', 'Bxf6', 'Ke7', 'exd4', 'Bf5', 'Rc8', 'Ke2', 'Bxf3', 'Ne7', 'Qf3', 'Bb5', 'dxe4', 'Qd7', 'Kh1', 'Rd8', 'Rc1', 'Rd1', 'Qc2', 'Ng5', 'Kf6', 'Kf3', 'Bg2', 'Kf8', 'Ne2', 'Bb4', 'Kh8', 'Kd7', 'Nf5', 'Rb8', 'Qf6', 'Kh2', 'Ke3', 'Kf1', 'Bb2', 'Kg3', 'Qb6', 'Ke6', 'Nc5', 'Ng4', 'Kg6', 'Qd3', 'Rb1', 'Kd2', 'Kh7', 'Nc4', 'Bxe5', 'Nf4', 'Nbd7', 'Kg8', 'Qxd5', 'Kd3', 'Kd6', 'fxe5', 'Bxe4', 'Qxd4', 'Rf8', 'bxc3', 'Bb3', 'Kg1', 'Rf1', 'Bxd5', 'Qb3', 'bxc6', 'Kf4', 'Nbd2', 'Kf5', 'Bf6', 'Qd6', 'Ke4', 'Bxc6', 'Ke5', 'dxc4', 'Kc7', 'Rd2']

In this example, the top 150 moves were selected.

Calculation of the model

In order to be able to calculate a similarity between moves, we use an idea from Natural Language Processing in which words are represented by vectors. Vectors are chosen in such a way that words that often occur in the same context in texts are considered similar and are therefore assigned vectors that have a small distance between them.

In our example, moves in chess are considered similar if they often occur before or after the same moves in games. In the pre-processing, I have chosen to represent the games in such a way that a game corresponds to a sentence and each move corresponds to a word.

To find these vectors, neural networks are trained to predict either a word from the surrounding words or the surrounding words from the word. A “window” of a certain length around a word is considered. In this task, the model “learns” to find the best possible vector representations for the words. We do the same with our chess games as a sequence of moves.

A description of the procedure in the context of natural language can be found at:

A simple Word2vec tutorial

In this tutorial we are going to explain, one of the emerging and prominent word embedding technique called Word2Vec…

medium.com

For the implementation of the procedure, we use the program package “Gensim” which offers a fast implementation. A description of the use of Word2vec with Gensim can be found at:

Word2Vector using Gensim

Intro : The goal is to build Word2Vec Word Embedding model with the help of Gensim for NLP Tasks.

medium.com

The games are used as “sets” in the form of a data frame for calculation. The dimension of the vectors can be freely chosen, a sensible choice depends on the size of the vocabulary, in our case the number of different moves. Usually, several hundred dimensions are used, but since we are working with a rather small amount of data and a small vocabulary size compared to natural language, I chose 100.

The “window” of surrounding moves I have chosen here is 3, so only the previous and the next move are used to calculate the similarity.

The calculation in the example with 6 million games takes several minutes, depending on the hardware. With larger game databases, as can be found on Lichess, it may take considerably longer. Therefore, it makes sense to save a model for later use.

Now we can look at the vectors for individual moves.

Of course, we can’t do much with the numbers alone. But on the basis of this representation, we can now do things like find similar moves to a given one, or display moves on a map depending on their similarity.

Find similar moves

Gensim offers a function to find the most similar words for a given word, i.e. move. The example shows these moves together with a measure of similarity (“distance”). These are indeed moves that a chess player would expect in the vicinity of the move “d4”.

[('cxd4', 0.5224027633666992),
 ('e3', 0.5026373267173767),
 ('c5', 0.45410090684890747),
 ('Nf3', 0.4520830512046814),
 ('Ncxd4', 0.4454481601715088),
 ('Ncd4', 0.4368690550327301),
 ('Nfxd4', 0.42796286940574646),
 ('Ned4', 0.42050701379776),
 ('Nfd4', 0.42037323117256165),
 ('Nbd4', 0.41929712891578674)]

Representation in 2D

Now, as announced at the beginning of the article, I would like to create a map of the most frequent chess moves, on which similar moves are shown close to each other.

In order to be able to represent the 100-dimensional vectors in 2D, we need a method to carry out this reduction of the dimension. The TSNE method has proven to be very useful in such cases.

A description can be found, for example, under:

An Introduction to t-SNE with Python Example

Introduction

to t-SNE with Python Example Introductionmedium.com

The following auxiliary function is responsible for the display of the map and implements the procedure. The color of the dots indicates the frequency of the moves. (Red frequently — blue less frequently)

The following graphic shows a map of the most frequent 150 moves.

As a chess player, you can recognize a structure on this “map”, for example, the groups of moves marked in the illustration are probably to be expected in similar situations in games.

Frequent moves, such as the typical opening moves (“e4”, “d4”, “e5”, …) and the short castling are clearly recognizable from the color code.

A practical application of this vector representation in the creation of chess software could be in the sorting of moves in the search. If a typical expected move from the list of possible moves is searched first in the search tree, the alpha-bata method commonly used in computer chess may produce results more quickly.
More about the structure of typical chess software can be found at:

Writing a chess program in one day

The post is about how to write a simple computer chess program within one day with only a few lines of code. The…

andreasstckl.medium.com