
Right now, the whole of Europe is buzzing about the UEFA Euro 2024 championship. What should we expect there? How have the continent’s teams performed before? To visualize this on a map, I combine Wikipedia data and Natural Earth’s world map with geospatial data science tools in Python to show how easily we can build custom maps of any data that can be linked to countries.
All images by the author.
1. Data acquisition
First, we use a free, open-source world map from the Natural Earth initiative. To ensure every nationality is appropriately represented, I used the Admin 0 – Details – map units map from the 10m resolution cultural maps collection. This map accounts for 209 sovereign states in the world.
# Import Geopandas
import geopandas as gpd
# Parse the world map
gdf = gpd.read_file('ne_10m_admin_0_map_units')
display(gdf[['GEOUNIT', 'geometry']].head(3))
gdf.plot()
The output:

Next, I collect official soccer cup records from Wikipedia: I simply copy and paste the Overall team records table into an Excel spreadsheet and then parse it with Pandas as follows.
# Import Pandas
import pandas as pd
# Parse the manually prepared excel file
df = pd.read_excel('uefa_euro.xlsx', index_col = 0)
df.head(10)
The output of this cell:

2. Data cleaning and preprocessing
Now let’s clean up the country names in the Wiki table (getting rid of those [] references, for instance), and match the country name lists of the two DataFrames.
2.1. Clean the country names
# getting rid of the additional references in []
df['Team'] = [a.split('[')[0].strip() for a in df.Team.to_list()]
# counting the countries
countries1 = df.Team.to_list()
countries2 = gdf['GEOUNIT'].to_list()
print('Number of countries in the Wiki data:', len(countries1))
print('Number of countries in the Natural Earth data:', len(countries2))
The output of this cell:

# comparing the lists of countries
countries1_s = set(countries1)
countries2_s = set(countries2)
overlap = list(countries1_s.intersection(countries2_s))
len(overlap)
The output of this cell shows that there are 33 overlaps between the two sets. Let’s print the three missing countries:
print("The following countries are in the Wiki table but cannot be found in the Natural Earth data:")
countries1_s.difference(countries2_s)
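When the list of mismatches is longer, a fuzzy string match can propose candidate names before the manual check. The snippet below is a minimal sketch using difflib from the standard library; the helper name suggest_matches, the 0.5 cutoff, and the short candidate list are illustrative assumptions, not part of the original workflow. Such matching cannot find a country that is stored under unrelated names, so manual inspection is still needed.

```python
from difflib import get_close_matches

def suggest_matches(missing, candidates, n=3, cutoff=0.5):
    """Return up to n similar candidate names for each missing name (hypothetical helper)."""
    return {name: get_close_matches(name, candidates, n=n, cutoff=cutoff)
            for name in missing}

# Illustrative inputs: in the notebook these would be
# countries1_s.difference(countries2_s) and countries2
missing = ['Czech Republic', 'Republic of Ireland', 'Belgium']
candidates = ['Czechia', 'Ireland', 'Germany', 'Italy', 'Spain']

suggestions = suggest_matches(missing, candidates)
for name, options in suggestions.items():
    # A name without any similar candidate may get no suggestions at all
    print(name, '->', options)
```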

Now, use these name matches to remap the country names in the GeoDataFrame, and then test the number of overlaps between the two DataFrames’ country lists again. Additionally, after some manual inspection of all the elements of countries2, I discovered that Belgium is not present in the spatial data as Belgium; instead, it is split into three separate regions: Flemish Region, Walloon Region, and Brussels Capital Region.
# the cross-dataset country name map
cleaning_map = {'Czechia': 'Czech Republic',
                'Ireland': 'Republic of Ireland',
                'Flemish Region': 'Belgium',
                'Walloon Region': 'Belgium',
                'Brussels Capital Region': 'Belgium'}
# remapping the country names
gdf['country'] = gdf['GEOUNIT'].map(cleaning_map).fillna(gdf['GEOUNIT'])
# validating that no countries are missing
print('The number of missing countries:')
print(len(set(df.Team).difference(set(gdf.country))))
The output of this cell tells us that finally, there are no missing countries.
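Because three Belgian regions now share the name Belgium, the GeoDataFrame holds three separate rows for that country, and any per-row step (such as labeling) will treat them separately. If a single shape per country is preferred, the GeoPandas dissolve method can merge rows that share a name. The following is a minimal sketch on toy square geometries (made up for illustration, not real borders); in the notebook, the same call could be applied to gdf before the merge.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy data: three adjacent squares standing in for the Belgian regions,
# plus one unrelated square (hypothetical geometries, not real borders)
toy = gpd.GeoDataFrame(
    {'country': ['Belgium', 'Belgium', 'Belgium', 'France'],
     'geometry': [box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 2, 2), box(3, 0, 4, 1)]},
    crs='EPSG:4326',
)

# Merge all rows that share the same country name into one geometry
toy_dissolved = toy.dissolve(by='country', as_index=False)
print(toy_dissolved[['country', 'geometry']])
```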
2.2. Prepare the map
Now, let’s merge the two DataFrames and keep the columns that contain the name and geometry of the countries, the number of times they appeared in the championship (Part.), and the total number of points (Pts) they earned.
gdf_uefa = gdf.merge(df, left_on = 'country', right_on = 'Team')[['country', 'Part.', 'Pts', 'geometry']]
gdf_uefa.head(3)
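Note that an inner merge silently drops any team whose name has no counterpart in the map. As an extra safeguard, the indicator=True option of the pandas merge can flag unmatched rows. The sketch below uses small made-up tables; the Atlantis row, the dummy geometry strings, and all numbers are placeholders, not real records.

```python
import pandas as pd

# Placeholder tables standing in for df (Wiki records) and gdf (map units)
records = pd.DataFrame({'Team': ['Germany', 'Italy', 'Atlantis'],
                        'Pts': [3, 2, 1]})  # dummy values
shapes = pd.DataFrame({'country': ['Germany', 'Italy', 'Spain'],
                       'geometry': ['geom_de', 'geom_it', 'geom_es']})  # dummy strings

# A left merge keeps every team; the _merge column tells whether a shape was found
check = records.merge(shapes, how='left', left_on='Team',
                      right_on='country', indicator=True)
unmatched = check.loc[check['_merge'] == 'left_only', 'Team'].tolist()
print('Teams without a matching shape:', unmatched)
```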

Let’s take a quick look:
gdf_uefa.plot()

The quick look shows that, for instance, Russia extends far beyond the map view we usually associate with Europe. Now let’s make the map view a bit more familiar by cropping this map with a bounding box covering the vast majority of continental Europe. For this, we will use the Shapely library:
# Import shapely
from shapely.geometry import box
# Define the bounding box for Europe
bbox = box(-10.0, 34.0, 40.0, 72.0)
bbox = gpd.GeoDataFrame({'geometry': bbox}, index=[0], crs='EPSG:4326')
# Crop the UEFA map:
gdf_uefa = gpd.overlay(gdf_uefa, bbox)
# Set and convert the coordinate reference system
gdf_uefa = gdf_uefa.set_crs(4326, allow_override=True)
gdf_uefa = gdf_uefa.to_crs(3857)
gdf_uefa.plot()

3. Visualizing the UEFA map
Finally, let’s create a visualization where each country is colored based on its total number of points. To illustrate how we can build such a map, I will go step by step, starting with a simple colored map.
import matplotlib.pyplot as plt
# Create the plot
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
# Plot the data
cmap = 'Reds'
gdf_uefa.plot(ax=ax, column='Pts', cmap=cmap, edgecolor='grey', linewidth=1)
# Get rid of the plot axis
ax.axis('off')
# Add a figure title
ax.set_title('UEFA Points by Country', fontdict={'fontsize': '18', 'fontweight': '5'})

Now let’s add a basemap using contextily, restrict the map to a custom x range, and label each country:
import contextily as ctx
# Create the plot
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
# Plot the data
cmap = 'Reds'
gdf_uefa.plot(ax=ax, column='Pts', cmap=cmap, edgecolor='grey', linewidth=1)
ax.set_xlim([-1.5*10**6,5*10**6])
# Get rid of the plot axis
ax.axis('off')
# Add a figure title
ax.set_title('UEFA Points by Country', fontdict={'fontsize': '18', 'fontweight': '5'})
# Add basemap using Contextily
ctx.add_basemap(
    ax,
    alpha=0.97, # Transparency level for the basemap
    crs=gdf_uefa.crs, # Coordinate reference system of the GeoDataFrame
    source=ctx.providers.CartoDB.DarkMatter # Basemap tile provider (named `url` in older contextily versions)
)
# Add labels for each country
for x, y, label in zip(gdf_uefa.geometry.centroid.x, gdf_uefa.geometry.centroid.y, gdf_uefa['country']):
    ax.text(x, y, label, fontsize=8, ha='center', va='center', color='black', bbox=dict(facecolor='white', alpha=0.6, edgecolor='none', pad=1))
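The x limits passed to set_xlim are in meters because the data was reprojected to EPSG:3857 (Web Mercator). To relate them to familiar longitudes, the standard spherical Web Mercator formulas can be applied by hand. The snippet below is a minimal sketch; the function name lonlat_to_web_mercator is a hypothetical helper, and R is the WGS84 semi-major axis used by this projection.

```python
import math

R = 6378137.0  # WGS84 semi-major axis in meters, used by EPSG:3857

def lonlat_to_web_mercator(lon, lat):
    """Convert longitude/latitude in degrees to EPSG:3857 x/y in meters (hypothetical helper)."""
    x = R * math.radians(lon)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y

# Corners of the bounding box used for cropping: box(-10.0, 34.0, 40.0, 72.0)
x_min, y_min = lonlat_to_web_mercator(-10.0, 34.0)
x_max, y_max = lonlat_to_web_mercator(40.0, 72.0)
print(f'x range: {x_min:,.0f} m to {x_max:,.0f} m')
print(f'y range: {y_min:,.0f} m to {y_max:,.0f} m')
```

The cropped data spans roughly -1.1 million to 4.5 million meters in x, so limits of -1.5*10**6 and 5*10**6 leave a margin on both sides.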

Finally, add a colorbar:
import contextily as ctx
from mpl_toolkits.axes_grid1 import make_axes_locatable
# Create the plot
fig, ax = plt.subplots(1, 1, figsize=(15, 15)) # Increase figure size for better clarity
# Plot the data
cmap = 'Reds'
gdf_uefa.plot(ax=ax, column='Pts', cmap=cmap, edgecolor='grey', linewidth=1)
ax.set_xlim([-1.5*10**6,5*10**6])
# Create a divider for the existing axes instance
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="2%", pad=-9.3)
# Add colorbar to the new axis
sm = plt.cm.ScalarMappable(cmap=cmap, norm=plt.Normalize(vmin=gdf_uefa['Pts'].min(), vmax=gdf_uefa['Pts'].max()))
sm.set_array([]) # Dummy array for the ScalarMappable
cbar = fig.colorbar(sm, cax=cax)
cbar.set_label('Points', color = 'white', labelpad = -16, size = 16)
cbar.ax.yaxis.set_tick_params(color='white')
plt.setp(plt.getp(cbar.ax, 'yticklabels'), color='white')
# Add basemap using Contextily
ctx.add_basemap(
    ax,
    alpha=0.97, # Transparency level for the basemap
    crs=gdf_uefa.crs, # Coordinate reference system of the GeoDataFrame
    source=ctx.providers.CartoDB.DarkMatter # Basemap tile provider (named `url` in older contextily versions)
)
# Add labels for each country
for x, y, label in zip(gdf_uefa.geometry.centroid.x, gdf_uefa.geometry.centroid.y, gdf_uefa['country']):
    ax.text(x, y, label, fontsize=8, ha='center', va='center', color='black', bbox=dict(facecolor='white', alpha=0.6, edgecolor='none', pad=1))
# Add a title
# ax.set_title('UEFA Points by Country', fontdict={'fontsize': '15', 'fontweight': '3'})
# Hide axis
ax.axis('off')
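# Save the figure first: after plt.show() the figure may already be closed, which yields a blank image
plt.savefig('UEFA.png', dpi = 200, bbox_inches = 'tight')
# Show the plot
plt.show()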

Conclusion
The above process teaches us a few lessons on spatial data science and European soccer alike. Interpreting what it means that Western European countries such as Germany and Italy top the historical record might spark debates reaching far beyond the scope of this tutorial, but we can certainly interpret the more objective Pythonic parts.
On the spatial data science side, we not only reviewed a few basic tools but also encountered a very common data cleaning issue. We matched fewer than 40 countries across two well-established data sets and realized that even this small number can be tricky and needs extra preprocessing steps, which illustrates the messy nature of real-life spatial data. Then, while creating the visualization, we learned about different layers of complexity when it comes to the amount of information displayed. Here, we stopped with a detailed static map. However, this data can be explored and showcased further using interactive tools such as Plotly or Folium.