
Part #2: a statistical analysis on Serie A

Road to UEFA Champions League

Photo by Mitch Rosen on Unsplash

Football. The most popular sport in the world. A blend of passion, hope and romanticism. Every year thousands and thousands of teams compete in their leagues with different goals. Some of them are built to win the title. Others just want to avoid relegation.

But the answer to their hopes always relies on the same thing: numbers.

One point more and you succeed. One point less and it’s a failure. One goal more and you are the champion. One goal less and you throw away the whole season. It’s a matter of details. Every football fan knows that. But can we quantify these "details"?

In this series of articles, I’ll try to extract these magical numbers by looking at the history of the Italian league: Serie A.

Navigation

  • Recap
  • Intro
  • Results

Recap

In this article, I’m going to continue my discussion of Serie A stats, focusing on the positions in the table that guarantee qualification for the European cups: Champions League and Europa League.

In the previous article – Part #1: a statistical analysis on Serie A – I analyzed the history of relegation. The SVM approach presented there showed that the 40-point threshold, set by the media as the number of points required to survive in the league, overestimates the threshold found by the model, which is about 3 points lower. In other words, in most cases teams survive in the league with fewer than 40 points.

We also noticed that a relationship exists between the number of points and other metrics. The most effective combination is certainly number of points and goal difference.

We then recalled the story of Crotone, which survived on the very last matchday of the 2016/17 season after a memorable run-in. Using the results obtained, we noticed that Crotone had just a 23.26% probability of remaining in the league, considering the final table.

Just to recall the positions of the table that we will cover in this series of articles, a recap of the table division is reported here:

  • Champion → 1st position
  • Champions League → 2nd, 3rd, 4th positions
  • Europa League → 5th, 6th positions
  • Relegated → last three positions
  • Survived → all the others
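As a rough illustration, the division above could be turned into labels with a small helper like the one below. The function name `table_zone` and the default of 20 teams are my own assumptions for this sketch, not something taken from the original analysis.

```python
def table_zone(position, n_teams=20):
    """Map a final league position (1 = top) to the label used in this series.

    Hypothetical helper: it simply encodes the division listed above.
    """
    if not 1 <= position <= n_teams:
        raise ValueError(f"position must be between 1 and {n_teams}")
    if position == 1:
        return "Champion"
    if position <= 4:
        return "Champions League"
    if position <= 6:
        return "Europa League"
    if position > n_teams - 3:  # last three positions
        return "Relegated"
    return "Survived"


# Label every position of a 20-team table
for pos in range(1, 21):
    print(pos, table_zone(pos))
```

Passing `n_teams=18` or `n_teams=16` moves the relegation zone accordingly, while the top six labels stay the same.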

Note: Champions League and Europa League positions have changed over the years. The number of available positions for these competitions depends on the coefficient set by UEFA for each European country, which changes every year based on the results obtained in the last 5 years. This coefficient depends on the number of victories and good placings of teams from that country in any European competition. Anyway, I decided to apply this division because, in most cases, Italy competed with 4 teams in the Champions League. The champion is obviously included in the 4.

The question we will try to answer this time is: how many points does a team need to qualify for Europa League/Champions League?

We will use the same data normalization techniques used in the previous article. To account for the change in the number of points awarded for a victory from 1994/95 onward, I assumed that a victory was always worth 3 points:

total_points = number_of_victories * 3

Then, in order to compare the number of points obtained in leagues with a different number of participants, a coefficient is applied to the total number of points.

# seasons with 16 teams
coeff = 1.27
# seasons with 18 teams
coeff = 1.12
# seasons with 20 teams
coeff = 1.0
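The two normalization steps can be sketched together as follows. This is only a minimal sketch: the names `COEFF_BY_TEAMS`, `matches_per_team` and `normalized_points` are hypothetical, and the optional `draws` argument (one point per draw) is my assumption, since the formula above mentions victories only. With the default `draws=0` the function reduces to that formula. The loop at the end checks an observation of mine, not a statement from the original analysis: the quoted coefficients are what you get from the ratio of games played in a double round-robin season (38 / 30 and 38 / 34).

```python
# Coefficients quoted in the article, keyed by the number of teams in the season
COEFF_BY_TEAMS = {16: 1.27, 18: 1.12, 20: 1.0}


def matches_per_team(n_teams):
    """Games each team plays in a double round-robin league."""
    return 2 * (n_teams - 1)


def normalized_points(wins, draws=0, n_teams=20):
    """Points with 3 per win, rescaled to a 20-team season.

    Assumption: `draws` adds one point per draw; leave it at 0 to
    reproduce total_points = number_of_victories * 3 exactly.
    """
    total_points = wins * 3 + draws
    return total_points * COEFF_BY_TEAMS[n_teams]


# Compare each quoted coefficient with the ratio of games played
for n_teams, coeff in COEFF_BY_TEAMS.items():
    ratio = matches_per_team(20) / matches_per_team(n_teams)
    print(n_teams, coeff, round(ratio, 2))

# Made-up example: 10 wins and 8 draws in an 18-team season
print(normalized_points(10, draws=8, n_teams=18))
```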

SVM is the model used to separate the data clusters. In particular, we used a linear kernel with a high penalization factor C.

import numpy as np
from sklearn import svm
clf = svm.SVC(kernel='linear', C=1000, probability=True)
clf.fit(np.transpose([x1,x2]), y)
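To show how probabilities can be read out of such a model, here is a small self-contained sketch. The data is synthetic (random numbers I generated for illustration, NOT real Serie A results), and the helper `qualification_probability` is a hypothetical name. The percentages it prints will therefore not match the ones quoted in this article.

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)  # fixed seed so the synthetic data is reproducible

# Synthetic stand-in data (NOT real results): 100 "survived" and 100 "qualified" teams
n = 100
points = np.concatenate([rng.normal(48, 4, n), rng.normal(64, 4, n)])
goal_diff = np.concatenate([rng.normal(-3, 7, n), rng.normal(14, 7, n)])
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

# One row per team, columns = (points, goal difference);
# equivalent in shape to np.transpose([x1, x2])
X = np.column_stack([points, goal_diff])

clf = svm.SVC(kernel='linear', C=1000, probability=True, random_state=0)
clf.fit(X, y)


def qualification_probability(model, pts, gd):
    """Estimated probability of class 1 (the 'qualified' label) for one team."""
    class_index = list(model.classes_).index(1)
    return float(model.predict_proba([[pts, gd]])[0][class_index])


p_low = qualification_probability(clf, 45, -5)
p_high = qualification_probability(clf, 75, 25)
print(f"45 pts, GD -5:  {p_low:.2%}")
print(f"75 pts, GD +25: {p_high:.2%}")
```

With `probability=True`, scikit-learn calibrates the probabilities through an internal cross-validation step, so they are estimates and can vary slightly between runs unless `random_state` is fixed.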

If you’d like to have more details either on the data normalization procedures or on the model used, I invite you to read the first part of this series of articles.

Intro

Last time, when we discussed relegation, we were able to find some guidelines to understand whether our results were in some sense valid or not. One of these guidelines was certainly the "40 points" threshold. As for European qualification, there is no threshold set by the experts. So we have to make some assumptions first.

Given that the results obtained on relegation matched the expected ones, I will assume that the same holds for the other parts of the table. Obviously there is no guarantee that this assumption is right. Anyway, I believe the values that come out of our analysis are not too far from the real ones.

Photo by Fikri Rasyid on Unsplash

Results

This section will be divided into two parts: Europa League and Champions League. In the first part, I will focus on the number of points required to access Europa League competition with respect to a simple survival. In the second part, there will be a direct comparison between the two European cups, to establish the required number of points to access the most important and fascinating competition in Europe.

1. Europa League

Europa League is the second most important European competition, and every year it involves a lot of teams that have made the history of football. Moreover, it also takes in the third-placed teams of each Champions League group, which fail to move forward in that competition.

Qualifying for the Europa League is the minimum acceptable result for teams that started the league with the goal of reaching the Champions League places. In the worst cases, it is still not enough to save the season.

At the same time, Europa League is a fantastic achievement for those who were hoping only for survival.

We start our analysis by looking at the number of points compared to the number of goals scored by a team.

In blue we can see the survived teams; in green we have the teams qualified for Europa League.

The first thing that we notice is that a reasonable threshold can be found between 59 and 62 points.

Actually, this trend has changed over the last 5 years. As you can see from the graph below, the EL threshold has increased significantly.

In other words, in the last 5 years, a team has needed 63 points to avoid problems. A more significant finding is that the number of goals scored is largely irrelevant. This is likely because the number of goals scored has generally increased a lot in recent years.

I will compare these two graphs by recalling the story of Napoli in 2019/20. Napoli ended the season with 62 points and 61 goals scored. If we consider the entire history of Serie A, the probability of qualifying for EL with these numbers is 63.2%. But, if we only look at the last 5 years, the probability drops to 30.12%!

I think the reason for such a drastic change lies in the level of the league, which has increased with respect to the last decade. The fact that more than one team competes for the title inevitably makes it harder to reach the same goals, raising the bar accordingly. Anyway, in my opinion this is just a short-term cycle that will eventually end, so considering the entire history of the league is still a better choice.

But, as recent years suggest, goals scored are not a good metric for understanding the whole situation. As we did for relegation, we also use goal difference.

First of all, we notice that there is an inverse relationship between the number of points and the goal difference: to reach EL, you need more points as your goal difference decreases, which is absolutely reasonable.

In particular, a team needs about 58 points with a goal difference of 10 in order to qualify with a probability of 62.55%. So, using the previous example, Napoli in 2019/20 had an 85.33% probability of success with a goal difference of 10. Again, if we consider only recent years, this drops to 38.58%, a much larger drop than before.

Summing up, we can state that the threshold for qualification to EL must take into account the relationship between number of points and goal difference. Moreover, the number of points required to reach this goal goes from 58 to 62 as the goal difference goes from 15 to 0. These combinations lead to an average probability of success of about 70%.
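To make this trade-off concrete, the two endpoints quoted above (58 points at a goal difference of 15, 62 points at a goal difference of 0) can be joined by a straight line. The sketch below is only an illustration of that interpolation, not the fitted SVM boundary. The function name `required_points` and the choice to clamp goal differences outside the 0 to 15 range are my own assumptions.

```python
def required_points(goal_difference):
    """Approximate EL threshold by linear interpolation between the two
    endpoints quoted in the text: (GD 0, 62 pts) and (GD 15, 58 pts).

    Assumption: goal differences outside [0, 15] are clamped to the
    nearest endpoint, since the text gives no values beyond that range.
    """
    gd = min(max(goal_difference, 0), 15)
    return 62 - (62 - 58) * gd / 15


for gd in (0, 5, 10, 15):
    print(f"GD {gd:+d}: about {required_points(gd):.1f} points")
```

Each extra goal of goal difference lowers the bar by 4/15 of a point, roughly a quarter of a point, under this simplification.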

Considering the short-term cycle that we are living in, the right combination becomes 63 points with a goal difference equal to 10, which leads to the same probability of success.

Photo by Daniel Norin on Unsplash

2. Champions League

The Champions League is the competition everyone wants to be in. – Steven Gerrard

Champions League is the most important European competition. European football clubs consider it the most valuable trophy, even more than the national titles. Qualifying for the Champions League means a lot in terms of appeal and earnings. Building a technical project that ensures the club's constant participation in this competition requires a lot of effort and organization.

Although the name suggests that it is a cup reserved for national league champions only, the most important European leagues are allowed to participate with more teams. In particular, Italy participates with the champion plus the next 3 teams in the table.

In this section, I’d like to set another threshold, this time between the Europa League teams and the Champions League ones. In other words, I will try to answer the question: how many points does a team need to qualify for the UEFA Champions League?

For this analysis, I will directly take into account the relationship between number of points and goal difference only, as it turned out to be the best in the previous scenarios.

As always, blue dots represent EL qualified teams and green dots represent CL qualified teams.

Looking at the graph, we soon notice that the answer to our question must be around 67 points, with a goal difference that is again inversely related to the number of points. To better explain the situation, I will use Atalanta in 2018/19.

Atalanta ended the season with 69 points and a goal difference of 31. According to our model, such a combination leads to success 94.29% of the time. That’s a high success rate!

But, if we only look at the last 5 years, the probability of such a goal is 46.85%. Indeed – in 2018/19 – Atalanta managed to qualify for the Champions League thanks to only one point more than Milan, who finished at 68 with a goal difference of 19. Not so reassuring anymore, is it?

This is just a confirmation that we are going through a short-term cycle where both European bars have been raised due to a moment of high uncertainty in the league.

The reason why I am not considering this short-term cycle in my analysis is only because it is subject to high volatility. It’s not that different from a stock market trend when you think about it.

So, to conclude, the right combination in this case could be a number of points in the range 67–70 with a goal difference of at least 10. This leads to probabilities of success that go from 63.24% to 76.45%, respectively.


Links

If you missed the first article of this series, you can find it here:

Thanks so much for taking the time to check out this post!

