Visualising Elections with Python

Tom Titcombe
Towards Data Science
8 min read · Jun 2, 2019


It would be fair to say that the global political landscape has a lot going on right now, whether you’re American, Israeli, Venezuelan, or anything in-between (as a Brit, I have my own hellscape to endure, in the form of Brexit). Regardless of the nuances of any political situation, one thing they all have in common is the wealth of diverse, complex data they produce, waiting to be explored. For example, consider what you might need to know to perfectly predict an election night performance: comprehension of manifesto policies and their consequences; a psychological understanding of how people will react to the policies and the politicians; sociological knowledge of demographic shifts and how people vote as a community; and game theory to predict how people will vote under the given system.

But before all of that, a would-be political scientist must be able to capture and visualise the mood of the people. In this post, we will be taking that first small step towards election night supremacy, by plotting an election map of the UK.

Such maps, formally known as choropleth maps, come in many flavours. I will detail how I used Python to create two types of choropleth maps: geographically realistic maps of the UK, and hex maps, as seen below. Analysis of the visualisations created here will be covered in a further post. The code used to generate the maps can be seen on my GitHub.

A giant election map on the BBC piazza. I’ve opted for a slightly higher-tech approach.

Creating Geographical Maps

To create a basic choropleth of the UK, I opted to use the Python package GeoPandas, which extends Pandas to work with geospatial data.

GeoPandas makes it trivial to plot a map using a shapefile, a popular format for GIS data. Luckily for those attempting to map the UK, maps divided into constituencies (the UK’s voting boundaries) are made freely available by the Office for National Statistics. Just download the map of whichever granularity you choose as a shapefile from the site.

Note: shapefiles consist of several individual files, which contain separate information. You need to keep all the files which are provided in the zip folder.
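
If a download looks incomplete, a quick check along these lines can tell you which pieces are missing before GeoPandas complains. This is only a sketch: it assumes the standard .shp, .shx and .dbf extensions, and the helper name is my own invention rather than anything GeoPandas provides.

```python
from pathlib import Path

# Components usually treated as mandatory for a shapefile (assumption: the
# download uses the standard extensions; .prj is optional but useful).
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required companion files that are absent next to shp_path."""
    base = Path(shp_path)
    return [
        base.with_suffix(suffix).name
        for suffix in REQUIRED_SUFFIXES
        if not base.with_suffix(suffix).exists()
    ]


if __name__ == "__main__":
    missing = missing_shapefile_parts("uk_generalized_2015.shp")
    print("Missing files:", missing or "none")
```

An empty list means all three pieces sit next to each other, which is what the loading step below expects.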

To load and plot the shapefile in GeoPandas:

import geopandas as gpd
import matplotlib.pyplot as plt
map_df = gpd.read_file("uk_generalized_2015.shp")
# map_df is a GeoDataFrame: a Pandas dataframe with a geometry column
f, ax = plt.subplots(1, figsize=(6, 6))
ax = map_df.plot(ax=ax)
ax.set_axis_off()
Sorry, Northern Ireland.

As we can see, GeoPandas turns the shapefile into the dataframes we all know and love.

Adding some zest

We have started off well, but the map looks quite uninteresting. We now need to get hold of some constituency-level variable by which we can colour the map.

For the purposes of this post we’ve decided to use the percentage of the electorate who signed the petition to Revoke Article 50. A brief primer for those who were unaware: UK citizens can start petitions, through an official government website, which bind the government to debate the topic of the petition in parliament if a certain number of signatures is reached. In March 2019, a petition to revoke Article 50, the legal mechanism by which the UK leaves the EU, reached millions of signatures in just a couple of days.

Each petition is accompanied by a JSON file which, among other things, breaks down the signature count by parliamentary constituency, so we can easily plot a map to visualise the national mood on any petition. Before we can add this data to the dataframe, however, we need to convert the JSON to a CSV.

import csv
import json

file = "petition_data.json"
with open(file, "r", encoding="utf-8") as json_file:
    data = json.load(json_file)

# The per-constituency records are nested a few levels down
data = data["data"]["attributes"]["signatures_by_constituency"]
keys = data[0].keys()

save_file = "petition_data.csv"
# newline="" stops the csv module writing blank rows on Windows
with open(save_file, "w", newline="", encoding="utf-8") as f:
    dict_writer = csv.DictWriter(f, keys)
    dict_writer.writeheader()
    dict_writer.writerows(data)
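
To make the nesting a little more concrete, here is a minimal, self-contained sketch of the same conversion run on a made-up payload. The structure mirrors the keys used above; the values, and the "ons_code" field, are illustrative assumptions rather than real petition data.

```python
import csv
import io
import json

# Made-up payload that mimics the nesting used above; the values and the
# "ons_code" field are illustrative assumptions, not real petition data.
raw = json.dumps({
    "data": {"attributes": {"signatures_by_constituency": [
        {"name": "Constituency A", "ons_code": "X00000001",
         "mp": "MP One", "signature_count": 1200},
        {"name": "Constituency B", "ons_code": "X00000002",
         "mp": "MP Two", "signature_count": 350},
    ]}}
})


def constituencies_to_csv(json_text):
    """Flatten the per-constituency records of a petition JSON into CSV text."""
    records = json.loads(json_text)["data"]["attributes"][
        "signatures_by_constituency"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = constituencies_to_csv(raw)
print(csv_text)
```

Each dictionary in the list becomes one row, and the keys of the first record become the header.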

Now we need to combine the data, and calculate the signature count as a percentage of electorate. Note that to do this we also need information on the size of the electorate in each constituency, data which can also be obtained on the ONS website.

import pandas as pd

data_df = pd.read_csv("petition_data.csv")

# Combine the two dataframes by their constituencies
combined = (
    map_df.set_index("pcon15nm")
    .join(data_df.set_index("name"))
    .reset_index()
)
# Also join the dataset containing electorate numbers
# Not included here for brevity

# Clean the data
combined["electorate"] = combined["electorate"].fillna(1e8)
combined["mp"] = combined["mp"].fillna("No MP")
combined["signature_count"] = combined["signature_count"].fillna(0)

# Calculate the percentage votes
combined["signature_pc"] = (
    100 * combined["signature_count"] / combined["electorate"]
)
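
One thing to watch: the join matches constituencies purely by name, so any spelling difference between the two sources quietly produces missing values, which the fillna calls then turn into zeros. A small check like the one below can catch this before joining. It is a sketch using hypothetical names; in practice the two lists would come from map_df["pcon15nm"] and data_df["name"].

```python
def unmatched_names(map_names, petition_names):
    """Return the names that appear in only one of the two sources, sorted."""
    map_set = set(map_names)
    petition_set = set(petition_names)
    only_in_map = sorted(map_set - petition_set)
    only_in_petition = sorted(petition_set - map_set)
    return only_in_map, only_in_petition


# Hypothetical names for illustration, with one capitalisation mismatch.
map_names = ["Aldershot", "Weston-Super-Mare", "York Central"]
petition_names = ["Aldershot", "Weston-super-Mare", "York Central"]

only_in_map, only_in_petition = unmatched_names(map_names, petition_names)
print(only_in_map)       # map regions that would end up with no signatures
print(only_in_petition)  # petition rows that would never reach the map
```

If both lists come back empty, every constituency on the map has a matching row in the petition data.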

While the petition from which we retrieved this data was extremely popular, the maximum signature percentage of any constituency was only around 40%. We set the colourmap maximum to this value, otherwise the majority of constituencies are shifted so far to one end of the colour spectrum as to be indistinguishable.

vmin = 0.
vmax = 40.

fig, ax = plt.subplots(1, figsize=(6, 6))
# Pass vmin/vmax to the plot so the map and the colourbar share one scale
combined.plot(column="signature_pc", cmap="viridis",
              vmin=vmin, vmax=vmax, linewidth=0.8, ax=ax)
ax.axis("off")

sm = plt.cm.ScalarMappable(cmap="viridis",
                           norm=plt.Normalize(vmin=vmin,
                                              vmax=vmax))
sm.set_array([])
cbar = fig.colorbar(sm, ax=ax)
Overly large colourbars are in this year.
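
To see why the colourmap maximum matters, it helps to look at what normalisation does to a single value. The function below is a plain-Python approximation of linear normalisation with clamping, written for illustration only; Matplotlib's own handling of out-of-range values differs in detail.

```python
def normalise(value, vmin, vmax):
    """Map value linearly onto [0, 1], clamping anything outside the range."""
    fraction = (value - vmin) / (vmax - vmin)
    return min(1.0, max(0.0, fraction))


# A constituency where 10% of the electorate signed:
print(normalise(10, 0, 100))  # position on a 0-100 scale
print(normalise(10, 0, 40))   # position on a 0-40 scale
```

With a maximum of 100 the constituency sits a tenth of the way along the colourmap, while with a maximum of 40 it sits a quarter of the way along, so differences between typical constituencies become easier to see.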

While it’s not the prettiest map you’ll ever see, it allows us to quickly and easily visualise the popularity of something throughout the country.

Creating Hex Maps

Looking at the above map, you would be forgiven for thinking that, because the petition only reached a high percentage of signatures in a few constituencies, it was, on the whole, unpopular. However, those of you familiar with the country may recognise the bright spot towards the bottom right of the map. That bright spot is London, and is home to several million people.

This perfectly demonstrates how maps plotting percentage, rather than absolute, popularity can mislead people when drawing conclusions. While maps such as these are great for demonstrating the geographic distribution of a thing, we are biased towards lending more importance to larger regions.

Endeavoring to create unbiased visualisations, however futile this may ultimately be, we will now plot hex maps — maps in which each constituency is represented by an equally sized hexagon. These maps trade geographic realism for equalising the importance of each region of the map.

To begin, we need coordinates of each constituency hexagon. Because the sizes of regions are massively distorted in hex maps (in the UK, the smallest constituency is a single square mile, and the largest is over 4000), it will not be possible to align constituencies as you would on a more realistic map.

For this post, we get the base map from ODILeeds, who have created a fantastic tool to generate your own hex map of the UK. Once you’re happy with the map, download the HexJSON and convert it to a CSV, as before.
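
The conversion is much the same as for the petition data, except that a HexJSON keeps its hexagons in a dictionary keyed by region ID rather than in a list. Below is a minimal sketch on a made-up document. The layout, hexes, q and r fields follow the HexJSON format; the IDs and the "n" name field are placeholders I have assumed for illustration.

```python
import csv
import io
import json

# Tiny made-up HexJSON document. The layout/hexes/q/r structure follows the
# HexJSON format; the IDs and the "n" (name) values are placeholders.
hexjson_text = json.dumps({
    "layout": "odd-r",
    "hexes": {
        "ID001": {"n": "Constituency A", "q": 0, "r": 0},
        "ID002": {"n": "Constituency B", "q": 1, "r": 0},
        "ID003": {"n": "Constituency C", "q": 0, "r": 1},
    },
})


def hexjson_to_rows(text):
    """Turn the 'hexes' mapping into a list of flat rows, one per hexagon."""
    hexes = json.loads(text)["hexes"]
    return [{"id": hex_id, **attrs} for hex_id, attrs in hexes.items()]


def rows_to_csv(rows):
    """Serialise the rows to CSV text, using the first row's keys as header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = hexjson_to_rows(hexjson_text)
hex_csv = rows_to_csv(rows)
print(hex_csv)
```

The q and r columns are the column and row of each hexagon, which is all the plotting code needs.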

Unfortunately, researching out-of-the-box ways to plot such maps in Python did not yield success: Matplotlib’s hexbin, while similar in intent, bins data points under the hood and proved too difficult to organise correctly. It became necessary to draw individual hexagons, for which Matplotlib Patches were used. Below is a simple example of how this can be achieved:

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import RegularPolygon

d = 0.5 / np.sin(np.pi/3)  # radius of polygon
o = 0.5 * np.tan(np.pi/3)  # orientation of polygon (not used below)
y_diff = np.sqrt(1 - 0.5**2)
# difference in y location for consecutive layers

colors = ["red", "blue", "green"]
hcoord = [0.5, 0, 1]
vcoord = [y_diff, 0, 0]

fig, ax = plt.subplots(1)

for x, y, c in zip(hcoord, vcoord, colors):
    # named hexagon rather than hex to avoid shadowing the built-in
    hexagon = RegularPolygon((x, y), numVertices=6, radius=d,
                             facecolor=c, alpha=0.2, edgecolor='k')
    ax.add_patch(hexagon)
ax.set_xlim([-0.5, 1.5])
ax.set_ylim([-0.6, 1.45])
plt.show()
Great maps, like great food, are built on honeycombs.
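
The constants in that example are chosen so that neighbouring hexagons sit exactly one unit apart and share an edge. The short check below confirms the geometry with the standard library only. It is my own sanity check rather than part of the plotting code.

```python
import math

d = 0.5 / math.sin(math.pi / 3)      # circumradius, as in the example above
y_diff = math.sqrt(1 - 0.5 ** 2)     # vertical gap between consecutive rows

apothem = d * math.cos(math.pi / 6)  # distance from the centre to an edge
width = 2 * apothem                  # edge-to-edge width of one hexagon

# Centres of the three hexagons drawn above
centres = [(0.5, y_diff), (0.0, 0.0), (1.0, 0.0)]
pairs = [(0, 1), (0, 2), (1, 2)]
distances = [
    math.hypot(centres[a][0] - centres[b][0], centres[a][1] - centres[b][1])
    for a, b in pairs
]

print(width)      # should be 1, up to floating point error
print(distances)  # each should be 1, so every pair of hexagons touches
```

Because the width equals the spacing between centres, the hexagons tile without gaps or overlaps, and the row spacing y_diff works out to one and a half times the radius.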

Before creating the full-UK map, one remaining challenge is to work out the coordinates of the hexagons: consecutive rows have an alternating start point, as is the nature of the honeycomb. The map provided by ODILeeds uses the “odd-r” formation, which means odd-numbered rows are shifted half a hexagon to the right.

def calc_coords(row, column):
    if row % 2 == 1:
        column = column + 0.5
    row = row * y_diff
    return row, column
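
As a quick illustration of the offset, here is a self-contained variant of the function applied to a few rows. The only change is that y_diff is passed in as a default argument (my own tweak so the snippet runs on its own).

```python
import math

Y_DIFF = math.sqrt(1 - 0.5 ** 2)  # same row spacing as in the hexagon example


def calc_coords(row, column, y_diff=Y_DIFF):
    """Convert odd-r (row, column) grid indices to plotting coordinates."""
    if row % 2 == 1:
        column = column + 0.5  # odd rows sit half a hexagon to the right
    return row * y_diff, column


for r in (-1, 0, 1, 2):
    print(r, [calc_coords(r, q) for q in range(3)])
```

Rows with negative indices are handled too: Python's % operator returns 1 for -1 % 2, so row -1 is shifted just like row 1.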

To create the UK, we iterate through the constituencies and draw a hexagon for each region. However, manually providing the colour, as above, would be a tedious task when attempting to create a smooth colourmap. We make use of PatchCollection to group the hexagons, and provide a Matplotlib colourmap to do the heavy, colourful lifting.

from matplotlib.collections import PatchCollection

fig, ax = plt.subplots(1, figsize=(6, 6))
ax.axis("off")

patches = []
colours = []
for i in range(combined.shape[0]):
    # here, combined is a join of hex csv and petition data
    # creating combined has been omitted for brevity
    row = combined.loc[i, "r"]
    col = combined.loc[i, "q"]
    row, col = calc_coords(row, col)
    c = combined.loc[i, "signature_pc"]
    hexagon = RegularPolygon((col, row), numVertices=6, radius=d,
                             edgecolor='k')
    patches.append(hexagon)
    colours.append(c)

p = PatchCollection(patches, cmap=plt.get_cmap("viridis"),
                    alpha=1.0)
# Colour each hexagon by its signature percentage, then draw the collection
p.set_array(np.array(colours))
ax.add_collection(p)
ax.autoscale_view()
ax.set_aspect("equal")
Title and labels not included.

With this map, we can see the full extent of London and how it compares to surrounding areas. Knowing that constituencies have roughly similar populations, we can see much more clearly how widespread this petition was in the UK.

Final Thoughts and Next Steps

Beyond what’s shown in this post, I have used data taken from the Revoke Article 50 petition at regular intervals to visualise the spread of the petition throughout the UK. The resulting maps can be seen on GitHub. My sincerest thanks to Stuart Lowe at ODILeeds for collecting and providing the petition data.

While I would consider this a successful exercise in retrospect, the process of creating the hex map has convinced me that using exotic maps to visualise elections is not a job for which Matplotlib is well suited. If I or anyone reading this decides to create more complex maps, it would probably be necessary to look to other tools, such as D3 or Altair.

Unsurprisingly, the world of election coverage and analysis extends far beyond the scope of this post. There is a great variety of maps used to cover elections around the world, each one deployed to convey slightly different meaning. However, the two types explored here could form the basis for quite informative political analysis.

After this project, I, for one, am looking forward to the next election in the UK, for a chance to make use of these maps, if nothing else. Until then, I’ll have to be satisfied with petitions.
