Not long ago, I came across this video[1] by 1littlecoder showing how you can use berserk, the Python client for the Lichess API, to extract information on your chess games. As a regular player on Lichess, I wanted to build a forecasting model that predicts my chess rating based on previously played games. If you’re also a regular Lichess user, you can follow my method and see whether you can extract something interesting.
The first step is to install berserk and import it. Then, you need to get your personal token from Lichess. To do that, click on your Lichess profile in the top right corner, go to Preferences, and click on API access tokens in the bottom left. Once you have your API token, use the last two lines below for authentication. For full documentation of the berserk package, see the package’s official documentation.
!pip install berserk
import berserk
token = "YOUR_PERSONAL_TOKEN"
session = berserk.TokenSession(token)
client = berserk.Client(session=session)
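Hard-coding the token in a notebook makes it easy to leak by accident. A minimal sketch of one alternative, assuming the token has been exported in an environment variable (the variable name LICHESS_TOKEN and the helper load_token are hypothetical, not part of berserk):

```python
import os

def load_token(env_var="LICHESS_TOKEN"):
    """Return the API token stored in the given environment variable."""
    # LICHESS_TOKEN is a hypothetical name chosen for this example
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token
```

The returned string can then be passed to berserk.TokenSession in place of the literal token above.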
Next, I imported some useful libraries: NumPy, Matplotlib, and datetime. Then, I used the get_rating_history method of client.users to extract my rating history for bullet chess, and the helper create_bullet_list to store the ratings in a list.
import numpy as np
import matplotlib.pyplot as plt
import datetime
%matplotlib inline
entries_of_bullet_ratings = client.users.get_rating_history("bibimbap123")[0]["points"]
def create_bullet_list(bullet_ratings):
    lst = []
    for entry in bullet_ratings:
        # entry[3] holds the rating; entry[0] to entry[2] hold the date
        lst.append(entry[3])
    return lst
ratings = create_bullet_list(entries_of_bullet_ratings)
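Indexing with [0] relies on bullet being the first entry in the rating history. A slightly more defensive sketch looks the entry up by name instead. It assumes each entry is a dict with "name" and "points" keys (the "points" key is used above; the "name" key and its value "Bullet" are assumptions about the API response), and the sample data is made up for illustration:

```python
def points_for(history, perf_name="Bullet"):
    """Return the rating points of the entry whose name matches perf_name."""
    for entry in history:
        if entry.get("name") == perf_name:
            return entry["points"]
    raise KeyError(f"No rating history found for {perf_name!r}")

# Made-up sample shaped like the response used above:
# each point is (year, zero-based month, day, rating)
sample_history = [
    {"name": "Blitz", "points": [(2021, 1, 20, 1500)]},
    {"name": "Bullet", "points": [(2021, 1, 17, 1800), (2021, 1, 18, 1815)]},
]
bullet_points = points_for(sample_history)
bullet_ratings = [point[3] for point in bullet_points]
```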
The following code creates a list of datetime objects by first extracting the dates of the games played (times_list), joining each date tuple into a single string (tuple_to_str), and then converting the strings into datetime objects (str_to_datetime).
import calendar
from dateutil import parser
def times_list(bullet_ratings):
    tl = []
    for entry in bullet_ratings:
        # entry[1] is a zero-based month, hence the +1
        tl.append((str(entry[0]), calendar.month_name[entry[1]+1],
                   str(entry[2])))
    return tl
times = times_list(entries_of_bullet_ratings)
def tuple_to_str(time):
    l = []
    for entry in time:
        l.append(', '.join(entry))
    return l
str_times = tuple_to_str(times)
def str_to_datetime(time):
    l = []
    for entry in time:
        l.append(parser.parse(entry))
    return l
dtime = str_to_datetime(str_times)
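The three helpers above go through month names and string parsing. As a shorter alternative sketch, the same (year, month, day) fields can be passed straight to datetime.datetime, assuming, as the +1 above suggests, that the month field is zero-based. The function name points_to_datetimes and the sample points are hypothetical:

```python
import datetime

def points_to_datetimes(points):
    """Convert (year, zero-based month, day, rating) points to datetimes."""
    return [datetime.datetime(p[0], p[1] + 1, p[2]) for p in points]

# Made-up sample points for illustration
sample_points = [(2021, 1, 17, 1800), (2021, 11, 17, 1950)]
sample_dtime = points_to_datetimes(sample_points)
```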
Next, let’s plot the time series to get a general picture of my rating progression from February 2021 (when I created the account) to December 2021.
fig, ax = plt.subplots()
fig.autofmt_xdate()
plt.plot(dtime, ratings)
plt.show()

To build the model, I used this paper[2], which lays out the math behind the rating calculation. The first step is to retrieve my latest rating and my latest rating deviation. The paper explains that the rating deviation (RD) is essentially a measure of the rating’s uncertainty.
# Latest rating
old_rating = ratings[-1]
# Latest rating deviation
rating_deviation_old = client.users.get_public_data("bibimbap123")["perfs"]["bullet"]["rd"]
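Next, I carried out the following calculation to compute the new rating deviation:
$$RD = \min\left(\sqrt{RD_{old}^2 + c^2},\ 350\right)$$
However, I first need to solve for c using the equation below:
$$350 = \sqrt{RD_{old}^2 + c^2 t}$$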
Here t represents the time (in units of rating periods) it would take for my rating to become as unreliable as that of a new player. A rating period is a span of time within which all games are treated as if they were played simultaneously. For this project, I decided to use monthly rating periods and set t = 60. This means it would take 60 months (5 years) of inactivity before my rating becomes as unreliable as that of a beginner.
rating_period = 1
# Calculate c
c = np.sqrt((350**2-rating_deviation_old**2)/(rating_period * 60))
# Calculate RD
rating_deviation_current = min(np.sqrt(rating_deviation_old**2 + c**2), 350)
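To get a feel for the numbers, here is a small worked sketch with a made-up rating deviation of 50 (purely illustrative) and the same t = 60. It uses only the standard library, and the function names are chosen for this example:

```python
import math

def solve_c(rd_old, t, rd_max=350):
    """Solve rd_max = sqrt(rd_old**2 + c**2 * t) for c."""
    return math.sqrt((rd_max**2 - rd_old**2) / t)

def inflate_rd(rd_old, c, rd_max=350):
    """RD at the start of the next rating period, capped at rd_max."""
    return min(math.sqrt(rd_old**2 + c**2), rd_max)

# Illustrative inputs, not real account data
c_example = solve_c(rd_old=50, t=60)             # sqrt(120000 / 60) = sqrt(2000)
rd_example = inflate_rd(rd_old=50, c=c_example)  # sqrt(2500 + 2000) = sqrt(4500)
print(round(c_example, 2), round(rd_example, 2))
```

With these inputs, c comes out at roughly 44.7 and the RD grows from 50 to about 67.1 after one idle month.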
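Now come the main computations: the post-period rating and the post-period rating deviation. The post-period rating is going to be our prediction.
$$r' = r + \frac{q}{1/RD^2 + 1/d^2} \sum_{j=1}^{m} g(RD_j)\,\bigl(s_j - E(s \mid r, r_j, RD_j)\bigr)$$
$$RD' = \sqrt{\left(\frac{1}{RD^2} + \frac{1}{d^2}\right)^{-1}}$$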
The first thing I’m going to tackle is the summation in r’. Here, m represents the number of opponents I face in one rating period (1 month). To start, I need to calculate four things:
1) Average number of games per month (m)
2) Average opponent rating
3) Average opponent’s rating deviation
4) Average outcome (s_j)
To calculate 1), I made the following assumption: the number of opponents I play over a one-year period, divided by 12, reflects the number of opponents I play per month.
For 2), I simply looked it up on my Lichess profile; I didn’t find a way to extract it through the API.
For 3), I assumed that the RD is 50 because it’s a common RD for active players.
For 4), I calculated my win rate by dividing the total number of wins by the total number of games. Here, the method get_public_data allowed me to extract the total number of games played and the total number of games won.
# 1) Take the total number of games played in one year and divide it by 12
entries_of_bullet_ratings = client.users.get_rating_history("bibimbap123")[0]["points"]
# Count the rating-history entries that fall in each calendar month
d = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}
for key in d:
    for time in dtime:
        if time.month == key:
            d[key] += 1
# Round up; sum the counts (d.values()), not the month keys
average_number_of_games = int(sum(d.values())/12) + 1
# 2) Their average rating
opponent_average_rating = 2145
# 3) Their average rating deviation
opponent_average_RD = 50
# 4) Calculate the average win rate
public_data = client.users.get_public_data("bibimbap123")
win_rate = public_data["count"]["win"]/public_data["count"]["all"]
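The month-counting loop can also be sketched with collections.Counter. The dates below are made up for illustration, and math.ceil is used for the rounding, which differs from int(...) + 1 only when the average is already a whole number:

```python
import datetime
import math
from collections import Counter

# Made-up dates standing in for the dtime list
sample_dtime = [
    datetime.datetime(2021, 2, 17),
    datetime.datetime(2021, 2, 18),
    datetime.datetime(2021, 3, 1),
    datetime.datetime(2021, 12, 17),
]
# Number of entries per calendar month (months with no entries count as 0)
entries_per_month = Counter(t.month for t in sample_dtime)
# Average over 12 months, rounded up to a whole number
average_entries = math.ceil(sum(entries_per_month.values()) / 12)
```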
Notice that these assumptions imply that every month I play the same number of opponents, all with the same rating and rating deviation. As a consequence, the summation can be replaced by a multiplication by the number of opponents.
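Written out, and assuming every opponent shares the same rating and the same rating deviation RD_opp (a symbol introduced here for illustration), the expected score E is identical for each game, so the summation in r’ collapses as follows:

```latex
\sum_{j=1}^{m} g(RD_j)\,\bigl(s_j - E\bigr)
  = g(RD_{opp}) \left( \sum_{j=1}^{m} s_j - mE \right)
  = m\, g(RD_{opp})\,\bigl(\bar{s} - E\bigr),
\qquad \bar{s} = \frac{1}{m}\sum_{j=1}^{m} s_j
```

Here \bar{s} is the average outcome, which the win rate stands in for.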
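In addition, we have to calculate four other values: q, d², g(RD_j), and E(s|r). The meanings of these are not explicitly explained in the paper. Fortunately, the formulae are given to us, so we only have to plug and chug.
$$q = \frac{\ln 10}{400} \approx 0.0057565, \qquad g(RD) = \frac{1}{\sqrt{1 + 3q^2 RD^2/\pi^2}}$$
$$E(s \mid r, r_j, RD_j) = \frac{1}{1 + 10^{-g(RD_j)(r - r_j)/400}}, \qquad d^2 = \left(q^2 \sum_{j=1}^{m} g(RD_j)^2\, E(s \mid r, r_j, RD_j)\,\bigl(1 - E(s \mid r, r_j, RD_j)\bigr)\right)^{-1}$$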
My code is displayed below:
q = 0.0057565
# g is evaluated at the opponents' RD (RD_j in the paper), not at my own RD
g_RD = 1/np.sqrt(1 + 3*q**2*(opponent_average_RD**2)/np.pi**2)
E_sr = 1/(1 + 10**(-g_RD*(old_rating - opponent_average_rating)/400))
d_squared = 1/(average_number_of_games * q**2 *
               (g_RD**2) * E_sr * (1 - E_sr))
new_rating = (old_rating + average_number_of_games
              * q/((1/rating_deviation_current**2) + 1/d_squared)
              * g_RD * (win_rate - E_sr))
new_RD = np.sqrt(1/(1/rating_deviation_current**2 + 1/d_squared))
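For comparison, here is a sketch of the same update written with the full summation over individual opponents rather than the averaged shortcut. The function names and the (rating, RD, score) tuple layout are choices made for this example. The sample inputs are illustrative: a 1500-rated player with RD 200 who beats a 1400 (RD 30) opponent and loses to 1550 (RD 100) and 1700 (RD 300) opponents.

```python
import math

Q = math.log(10) / 400  # approximately 0.0057565

def g(rd):
    """Attenuation factor for an opponent with rating deviation rd."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def expected_score(r, r_j, rd_j):
    """Expected outcome against an opponent rated r_j with deviation rd_j."""
    return 1 / (1 + 10 ** (-g(rd_j) * (r - r_j) / 400))

def glicko_update(r, rd, games):
    """games: list of (opponent_rating, opponent_rd, score) tuples."""
    d2_inv = 0.0  # accumulates 1/d^2
    total = 0.0   # accumulates the summation in r'
    for r_j, rd_j, s_j in games:
        e = expected_score(r, r_j, rd_j)
        d2_inv += Q**2 * g(rd_j)**2 * e * (1 - e)
        total += g(rd_j) * (s_j - e)
    denom = 1 / rd**2 + d2_inv
    return r + Q / denom * total, math.sqrt(1 / denom)

new_r, new_rd = glicko_update(
    1500, 200, [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
)
```

With these inputs the rating drops to roughly 1464 and the RD shrinks to roughly 151.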
The final step is to plot the prediction. Since I treated one rating period as one month, the predicted rating applies exactly one month after my last game. I also wanted to make sure that if my last game was played in month 12 (December), the prediction would fall in month 1 (January) of the following year rather than in a nonexistent month 13.
last_game_time = dtime[-1]
prediction_month = last_game_time.month + 1
prediction_time = last_game_time
if prediction_month == 13:
    # December rolls over to January of the next year
    prediction_time = prediction_time.replace(year=last_game_time.year + 1)
    prediction_time = prediction_time.replace(month=1)
else:
    prediction_time = prediction_time.replace(month=last_game_time.month + 1)
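One caveat with replace(month=...): it raises a ValueError when the day does not exist in the target month (for example, January 31 has no counterpart in February). A sketch of a helper that clamps the day to the last day of the next month, using calendar.monthrange from the standard library (the name one_month_later is hypothetical):

```python
import calendar
import datetime

def one_month_later(moment):
    """Return the same day one month later, clamped to the month's last day."""
    year = moment.year + (1 if moment.month == 12 else 0)
    month = 1 if moment.month == 12 else moment.month + 1
    # monthrange returns (weekday of first day, number of days in month)
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))
```

For a last game on December 17, 2021, this gives January 17, 2022, the same result as the branch above.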
The predicted rating and the prediction time are displayed below. We can infer from this picture that my last game was played on December 17th, 2021.


As I play more games, this program will make new predictions of my future ratings, which makes it interesting to see by how much my actual rating and the predicted rating differ. In the future, it’d be worthwhile to see how we can minimize that difference.
References
[1] 1littlecoder, Extract Chess Data from Lichess API with Python (2020), https://www.youtube.com/watch?v=OnCQ3J6ZKL4
[2] M. E. Glickman, The Glicko system (1995), http://www.glicko.net/glicko/glicko.pdf