Choropleth maps with Folium

Visualization of the gross birth rate across communities in Spain in 2018

Amanda Iglesias Moreno
Towards Data Science

A choropleth map is a thematic map in which areas are shaded according to the statistical variable being displayed on the map. This visualization is well suited to showing how a measurement varies across a region, using a sequential color scheme (the higher the measurement, the stronger the color). In this article, we are going to learn how to create choropleth maps using Folium to visualize the gross birth rate (births per 1000 inhabitants) across communities in Spain.

Exploratory data analysis and data cleaning

Exploratory data analysis consists of analyzing the main characteristics of a data set, usually by means of visualization methods and summary statistics. The objective is to understand the data, discover patterns and anomalies, and check assumptions before we perform further evaluations.

In this article, we use a data set containing information about gross birth rates (births per 1000 inhabitants) from 1975 until 2018 across communities in Spain.

The data set (option "CSV separado por ;", that is, semicolon-separated CSV) can be downloaded at the following link:

The first step of the exploratory analysis consists of loading the CSV file into a Pandas data frame using the pandas.read_csv function and viewing the first 5 rows with the pandas.DataFrame.head method. We skip 5 rows at the start of the file (skiprows=5) and 2 at the bottom (skipfooter=2), since they contain only explanatory notes about the data being displayed.
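A minimal sketch of this loading step. The in-memory text below is a made-up stand-in for the downloaded file, with illustrative values only. It assumes a semicolon separator, as the download option suggests. Note that skipfooter requires the Python parsing engine.

```python
import io

import pandas as pd

# Made-up stand-in for the downloaded file: 5 note lines, a header row,
# data rows ending in ';', and 2 footer lines. Values are illustrative only.
raw_text = "\n".join([
    "Note line 1",
    "Note line 2",
    "Note line 3",
    "Note line 4",
    "Note line 5",
    ";2018;2017;",
    "Andalucía;8,0;8,5;",
    "Aragón;7,0;7,5;",
    "Footer note 1",
    "Footer note 2",
])

df = pd.read_csv(
    io.StringIO(raw_text),  # replace with the path of the downloaded file
    sep=";",                # the file is semicolon-separated
    skiprows=5,             # explanatory notes at the top
    skipfooter=2,           # explanatory notes at the bottom
    engine="python",        # skipfooter is not supported by the C engine
)

# Show the first rows (at most 5) of the data frame.
print(df.head())
```

The trailing semicolon on each row is what produces the empty last column in this sketch.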

As can be easily observed, the last column of the data frame does not contain useful information and should be deleted. In addition, we have to replace the name of the first column with a more descriptive one.

After dropping and renaming the columns, we can evaluate the data types and missing values using the pandas.DataFrame.info method.
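A minimal sketch of these two steps on a small made-up frame. The column name "community" is an illustrative choice, not necessarily the one used in the original analysis.

```python
import pandas as pd

# Made-up frame shaped like the raw file: an unnamed first column holding the
# community names and an empty trailing column. Values are illustrative only.
df = pd.DataFrame({
    "Unnamed: 0": ["Andalucía", "Aragón"],
    "2018": ["8,0", "7,0"],
    "2017": ["8,5", "7,5"],
    "Unnamed: 3": [None, None],
})

# Drop the last column by position.
df = df.drop(columns=df.columns[-1])

# Give the first column a descriptive name.
df = df.rename(columns={df.columns[0]: "community"})

print(df.columns.tolist())
```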

As shown above, the data set does not contain null values, but the data types are not the expected ones. The columns 1975–2018 are of type object, but they contain numerical data. We can convert these columns to floats using the pandas.Series.astype method, after first replacing commas with periods using pandas.Series.str.replace.
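The conversion can be sketched as follows, again on a small made-up frame. It assumes that every column after the first holds a year.

```python
import pandas as pd

# Made-up frame with decimal commas stored as text. Values are illustrative only.
df = pd.DataFrame({
    "community": ["Andalucía", "Aragón"],
    "2018": ["8,0", "7,0"],
    "2017": ["8,5", "7,5"],
})

# Every column except the first one holds yearly rates.
year_columns = df.columns[1:]

for column in year_columns:
    # Replace the decimal comma with a period, then cast the text to float.
    df[column] = df[column].str.replace(",", ".", regex=False).astype(float)

# Check the resulting data types and non-null counts.
df.info()
```

As an alternative, pandas.read_csv accepts a decimal="," argument, which would parse these columns as floats at load time.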

As we can observe, we have successfully modified the data types of columns 1975–2018 from objects to floats.

Now we have a clean data set for creating choropleth maps :)

Choropleth maps with Folium

Modification of the data frame

Folium is a Python library that allows you to create multiple types of Leaflet maps. To create a choropleth map, Folium requires a GeoJSON file containing geospatial data for the region. For a choropleth map of Spain, we need a GeoJSON file that defines the boundaries of all communities. You can download this file from the following link:

To bind the data frame and the json file successfully, the name of the community in the data frame must match exactly the name of the community in the json file. Therefore, we have to analyze which names are contained in the json file and modify the names in the data frame accordingly.

The following code shows how we can obtain the names of the communities in the json file:

Names in the json file

As shown above, the json.load() method returns a Python dictionary. Then, we loop through the dictionary to obtain the names of the communities.
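A minimal sketch of that lookup. The GeoJSON text is a made-up, heavily trimmed stand-in for the real boundaries file, and it assumes each feature stores the community name under properties → name.

```python
import io
import json

# Made-up, trimmed GeoJSON: two features without geometries.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {"name": "Community A"}, "geometry": null},
    {"type": "Feature", "properties": {"name": "Community B"}, "geometry": null}
  ]
}
"""

# io.StringIO stands in for open("<path to the json file>").
with io.StringIO(geojson_text) as f:
    communities_geo = json.load(f)  # a plain Python dictionary

# Collect the name stored in each feature's properties.
names = [feature["properties"]["name"] for feature in communities_geo["features"]]
print(names)
```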

Now, we replace the names in the data frame with the names from the json file. Notice that we have created a list with the json file names in the same order as the communities appear in the data frame.
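A minimal sketch of the renaming, using made-up names and values. The list json_names is written by hand in the same row order as the data frame, and the final check confirms that every name also exists in the json file.

```python
import pandas as pd

# Made-up data frame. Names and values are illustrative only.
df = pd.DataFrame({
    "community": ["Comunidad A", "Comunidad B", "Comunidad C"],
    "2018": [6.0, 7.5, 9.0],
})

# Names as they appear in the json file, listed in the data frame's row order.
json_names = ["Community A", "Community B", "Community C"]

# Names found in the json file (in practice, collected from its features).
names_in_geojson = {"Community A", "Community B", "Community C"}

# The list must have one name per row before it replaces the column.
assert len(json_names) == len(df)
df["community"] = json_names

# Any name left here would not bind to a geometry on the map.
unmatched = set(df["community"]) - names_in_geojson
print(unmatched)
```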

Modified data frame

Now! We are ready to use Folium to create the choropleth map.

Generation of a choropleth map

First, we create a Folium Map object centered on [40.416775, -3.703790] (the location argument lets us center the map on a specific location). We also provide an initial zoom level of 6. Regardless of the initial zoom level, the generated map is interactive, meaning you can easily zoom in and out. Lastly, we specify the tiles (Stamen Watercolor).

Once the map object is created, we display the choropleth map using the .choropleth() method (we can also use the folium.Choropleth class, which newer Folium releases recommend over the method). This binds the data contained in the data frame to the geometries of the json file.

The following map depicts the gross birth rate (births per 1000 inhabitants) across communities in Spain in 2018.

As you can observe, northern regions present lower birth rates.

Melilla is a Spanish autonomous city located on the north coast of Africa, sharing a border with Morocco. This city has the highest birth rate in Spain. To see Melilla properly, we need to zoom in.

Customization of the map

There are multiple parameters to customize choropleth maps in Folium. The following parameters are the ones we previously used in our visualization:

Map function

  • location → Latitude and longitude of the map.
  • zoom_start → Initial zoom level for the map.
  • tiles → Map tiles.

Choropleth function

  • geo_data → Name of the json file. This file must be located in the working directory.
  • data → Name of the data frame containing the data.
  • columns → Columns employed to generate the choropleth map.
  • key_on → Key in the json file that contains the name of the community.
  • fill_color → Color scheme used in the visualization.
  • fill_opacity → Area fill opacity, range 0–1 (default 0.6).
  • line_opacity → GeoJSON geopath line opacity, range 0–1 (default 1).
  • legend_name → Title for the legend (default empty string).
  • smooth_factor → How much to simplify the polyline on each zoom level.

Next, I explain in detail three parameters that I believe are particularly relevant: (1) tiles, (2) zoom_start, (3) fill_color.

Tiles

Folium contains multiple built-in tilesets to customize your visualization. When creating a map object, we can easily specify a map tile with the tiles keyword.

We can create multiple tile layers using the folium.TileLayer class and add them to a map. To switch between layers, we add a layer control object (folium.LayerControl()) to the map as follows.

Tile layers control

Zoom Start

In Folium, maps are interactive, meaning that we can easily zoom in and out. However, when generating a map object, we can specify an initial zoom level using the zoom_start parameter. As shown below, we create a map object using three different initial zoom levels: (1) 2, (2) 4, (3) 6.

Modification of the initial zoom level

Fill color

A key point to properly convey your message is to choose a fitting color scheme for your design. There are three main types of color schemes: (1) sequential, (2) diverging, and (3) qualitative.

  • Sequential color schemes are logically arranged from low to high, which makes them ideal for numeric data without a critical midpoint or for ordered categorical data (low/medium/high).
Sequential color scheme
  • Diverging color schemes highlight values above or below an interesting midpoint value (e.g. the mean). Diverging schemes are used with data that can be meaningfully divided by a midpoint.
Diverging color scheme
  • Qualitative color schemes are used with nominal categorical data (i.e. the data does not have an intrinsic ordering). The difference between categories is shown with different hues, while lightness and saturation are kept similar.
Qualitative color scheme

In our map, we employ a sequential color scheme, since we want to represent a numerical variable (the gross birth rate) without an interesting midpoint. As shown below, we try four different color palettes: (1) ‘YlGnBu’, (2) ‘BuPu’, (3) ‘OrRd’, (4) ‘RdPu’.

Modification of the color

As you can observe, Folium provides great flexibility for designing choropleth maps. Take a look at the documentation to discover more parameters to customize your maps! 🍁

Informative labels

As an additional feature, we display the name of the community when hovering over it, using the folium.features.GeoJsonTooltip class. We pass the names of the GeoJSON ‘properties’ fields we want to display (in this case, name). Next, we define the style of the labels by providing a CSS string. CSS describes how HTML elements should be displayed.

After creating the GeoJsonTooltip object, you can pass it to the geojson object that is created by the Choropleth function under the hood.

Labels with the name of the communities

And voilà! A map with labels!

I encourage you to try creating choropleth maps with Folium. Remember that all the maps in this article show the gross birth rate (births per 1000 inhabitants) in 2018. If you want to display another year, you just have to pass that year in the columns parameter of the folium.Choropleth() function.

In future articles, I will explain how to create maps with other libraries such as Plotly or Geoviews. Sooo, keep reading!

🍀 Amanda 🍀
