
3 Lines of Python Code to Create An Interactive Playable COVID-19 Bubble Map

Using Plotly Express to create a playable, interactive bubble map of confirmed COVID-19 cases worldwide.

Photo by Martin Sanchez on Unsplash

The above GIF shows the interactive bubble chart on the global map that I created using Plotly Express in Python. Read on if you are interested in creating this in 3 lines of code.

There is no doubt that Plotly is one of the most popular data visualisation libraries, and it supports many programming languages. In this article, I’ll demonstrate how to use Plotly Express in Python to plot the global COVID-19 cases data.

About Plotly Express

Plotly: The front-end for ML and data science models

As a sub-module of the Plotly library, Plotly Express is designed to deliver the most common charts and graphs as efficiently as possible. It is a terse, consistent and very high-level API that can plot your data in fancy graphs.

If you have a lot of customisation requirements, Plotly Express is not the right choice, and you may need to check out the regular Plotly library APIs instead. Otherwise, Plotly Express is highly recommended because it saves you time and produces graphs as polished as those from the regular APIs.

COVID-19 Data Source

Photo by Mika Baumeister on Unsplash

Many organisations are providing up-to-date COVID-19 data at the moment. I found this one from the European Union:

Download today’s data on the geographic distribution of COVID-19 cases worldwide

Disclaimer: The data sources were simply found via search engines. This article uses the COVID-19 data only to demonstrate data visualisation techniques, so I’m NOT responsible for the accuracy of the data.

You may download the CSV from the above link. However, we can also read a URL directly in Python using Pandas.

import pandas as pd
df = pd.read_csv('https://opendata.ecdc.europa.eu/covid19/casedistribution/csv')

Below is what the data looks like. Please note that I’m using the sample function rather than head because I want to see data from different parts of the dataset, not only the first few entries.
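To make the difference between head and sample concrete, here is a minimal sketch on a made-up frame. The name toy and its values are only for illustration and are not part of the COVID-19 dataset; random_state is set only so the draw is repeatable.

```python
import pandas as pd

# Toy frame standing in for the real dataset (values are made up)
toy = pd.DataFrame({'row': range(100)})

# head always returns the first rows of the frame
first_rows = toy.head(5)

# sample draws rows from anywhere in the frame
random_rows = toy.sample(5, random_state=0)

print(first_rows['row'].tolist())
print(sorted(random_rows['row'].tolist()))
```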

In this tutorial, we are going to use the following fields in the dataset:

  • date
  • number of cases
  • number of deaths
  • country
  • continent

Therefore, let’s remove the columns we don’t need and rename the remaining ones to our liking. Please note that this step is optional, but it is good practice to keep your Pandas data frame tidy.

# Keep only the columns we need
df = df[['dateRep', 'cases', 'deaths', 'countriesAndTerritories', 'countryterritoryCode', 'continentExp']]
# Rename columns
df = df.rename(columns={
    'dateRep': 'date',
    'countriesAndTerritories': 'country',
    'countryterritoryCode': 'countryCode',
    'continentExp': 'continent'
})
# Convert string to datetime
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')
# Preview the data frame
df.sample(10)
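The explicit format in pd.to_datetime matters because the dates in this dataset are written day first. The following sketch, using a single made-up date string, shows what could go wrong if the format were left out: by default, Pandas reads an ambiguous date such as 01/04/2020 month first.

```python
import pandas as pd

ambiguous = '01/04/2020'  # 1 April 2020 in day/month/year notation

# With an explicit format, the day and month are read as intended
with_format = pd.to_datetime(ambiguous, format='%d/%m/%Y')

# Without it, Pandas defaults to month first and reads 4 January
without_format = pd.to_datetime(ambiguous)

print(with_format.date(), without_format.date())
```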

Create the Bubble Map

Photo by Markus Spiske on Unsplash

Before creating the interactive map, let’s create a static map first. Later on, you will be surprised by how easy it is to create a playable interactive map.

For the static map, I would like to create it for "today". So, let’s get the data frame for all the countries globally, but only for "today".

from datetime import datetime
# Get today as string
today = datetime.now().strftime('%Y-%m-%d')
# Get a data frame only for today
df_today = df[df.date == today]
# Preview the data frame
df_today.head()

I ran this on 26 April, so the data frame I got is the one shown above.
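Depending on when you run the code, the dataset may not yet contain any rows for the current date, in which case df_today would be empty. One possible fallback, sketched here on a made-up frame, is to use the most recent date that actually appears in the data. The names toy, latest and df_latest are hypothetical.

```python
import pandas as pd

# Toy frame standing in for the cleaned dataset (values are made up)
toy = pd.DataFrame({
    'date': pd.to_datetime(['2020-04-24', '2020-04-25', '2020-04-25']),
    'country': ['A', 'A', 'B'],
    'cases': [5, 8, 3],
})

# Use the most recent date that actually appears in the data
latest = toy['date'].max()
df_latest = toy[toy['date'] == latest]

print(latest.date(), len(df_latest))
```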

Then, let’s plot the data frame using Plotly Express.

import plotly.express as px
fig = px.scatter_geo(
    df_today, 
    locations='countryCode',
    color='continent',
    hover_name='country',
    size='cases',
    projection="natural earth",
    title=f'World COVID-19 Cases for {today}'
)
fig.show()

With just the above 3 lines (well, they are spread over more lines here for better readability), we get the global map with bubbles indicating the number of confirmed cases for today, shown as follows.

Let me explain the parameters of the px.scatter_geo() function.

  • For the first parameter, we need to provide the Pandas data frame. Plotly Express works directly with Pandas, so we can always pass data frames as they are.
  • locations takes the country codes used to place the countries on the map. Please note that we simply let Plotly Express know that the column named "countryCode" is the one indicating locations. We could also use latitude and longitude. However, since the data source already provides three-letter ISO country codes, it is much easier to use them.
  • color is optional here, but it is good to colour the bubbles by continent, which makes the graph clearer.
  • hover_name decides what to display when the mouse hovers over a bubble. Here we let it display the country name, so we specify the column name "country" from the data frame.
  • size is an important parameter that determines how large each bubble is on the map. Of course, we want it to indicate how many cases there are in each country, so we specify "cases".
  • projection tells Plotly Express which map projection we want to use. There are many other options such as "equirectangular", which will show the map in a rectangle as follows.
  • title simply sets the title of the graph.
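One practical caveat about size, stated here as an assumption rather than something verified against this dataset: marker sizes are expected to be non-negative numbers, so a column that contains negative values (for example, corrections to earlier reports) or missing values may need cleaning before it is passed to size. A minimal sketch with made-up values, where cases_for_size is a hypothetical column name:

```python
import pandas as pd

# Toy frame with one negative and one missing value (values are made up)
toy = pd.DataFrame({
    'countryCode': ['AAA', 'BBB', 'CCC'],
    'cases': [120, -5, None],
})

# Replace missing values with 0, then raise negative values to 0
toy['cases_for_size'] = toy['cases'].fillna(0).clip(lower=0)

print(toy)
```

The cleaned column could then be passed as size='cases_for_size' while the original cases column stays available for display.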

Now, what about creating a "Playable Bubble Map"? Easy!

Let’s abandon the df_today data frame since it only has data for today. Our original data frame df has all the data.

Before we can animate the map, it needs to be mentioned that Plotly Express currently only supports "playing" the data on a column of integer or string type. We want to "play" the bubble map by date, so the dates need to be converted to one of those types. Also, don’t forget to sort the data frame by date, because we want the map to "play" in chronological order.

# Convert date to string type
df['date'] = df.date.dt.strftime('%Y%m%d')
# Sort the data frame on date
df = df.sort_values(by=['date'])
# Some countries do not have a code, so let's drop all rows with missing values
df = df.dropna()
# Preview the data frame
df.head(10)
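The choice of the '%Y%m%d' format is what makes sorting the strings equivalent to sorting the dates. The short sketch below, using three made-up dates, compares it with a day-first format, where alphabetical order no longer matches chronological order.

```python
from datetime import date

days = [date(2020, 4, 2), date(2020, 3, 15), date(2020, 12, 1)]

# Year-month-day strings sort in chronological order
iso_like = sorted(d.strftime('%Y%m%d') for d in days)

# Day-first strings sort by day of month, which scrambles the timeline
day_first = sorted(d.strftime('%d/%m/%Y') for d in days)

print(iso_like)   # ['20200315', '20200402', '20201201']
print(day_first)  # ['01/12/2020', '02/04/2020', '15/03/2020']
```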

OK. The data frame is ready now. Please refer to the previous px.scatter_geo() call that we created for today’s data only. To make it "playable", simply let Plotly know which column will be used for "playing" by adding animation_frame="date". The full function call is as follows.

fig = px.scatter_geo(
    df, 
    locations='countryCode',
    color='continent',
    hover_name='country',
    size='cases',
    projection="natural earth",
    title='World COVID-19 Cases',
    animation_frame="date"
)
fig.show()

We can also drag the progress bar to quickly navigate to certain dates, as well as interact with the map to zoom in/out and pan.
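Since the static and the animated map share almost all of their arguments, one optional way to avoid repeating them is to keep the shared ones in a dictionary. This is only a sketch of that idea; the names common_args and animated_args are hypothetical.

```python
# Arguments shared by the static and the animated map
common_args = dict(
    locations='countryCode',
    color='continent',
    hover_name='country',
    size='cases',
    projection='natural earth',
)

# The animated version adds a title and the animation column
animated_args = {
    **common_args,
    'title': 'World COVID-19 Cases',
    'animation_frame': 'date',
}

# With Plotly Express imported as px and df prepared as above:
# fig = px.scatter_geo(df, **animated_args)
print(sorted(set(animated_args) - set(common_args)))
```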

See how easy it is! Thanks to Plotly, we can create fancy graphs in seconds.

Summary

Photo by Aaron Burden on Unsplash

Admittedly, the features of Plotly Express are somewhat limited. For example, customising the scale of the bubble sizes on the map (some countries with fewer cases are not visible) is not convenient.

However, these limitations are a trade-off, because Plotly Express is meant to provide a terse API for creating beautiful graphs very quickly. When we need that kind of customisation, we can use the regular Plotly APIs, such as Graph Objects.

Join Medium with my referral link – Christopher Tao

If you feel my articles are helpful, please consider joining Medium Membership to support me and thousands of other writers! (Click the link above)

Note from the editors: Towards Data Science is a Medium publication primarily based on the study of Data Science and machine learning. We are not health professionals or epidemiologists, and the opinions of this article should not be interpreted as professional advice. To learn more about the coronavirus pandemic, you can click here.

