
The Battle of Interactive Geographic Visualization Part 7 – Bokeh

Using the Bokeh Library to Create Beautiful, Interactive Geoplots

PYTHON. DATA SCIENCE. GEOVISUALIZATION

Photo by delfi de la Rua on Unsplash

WHERE WE LEFT OFF

In this series, we have already identified six (6) ways of making beautiful geoscatter plots.

In this article, we’ll learn how to do this with Bokeh. Panel, which we used in the first article in the series, is itself built on top of Bokeh.

PLOTLY VS. BOKEH

An article comparing the two sums up the major difference between the libraries: Plotly is easier to use and, by extension, the more convenient choice for individual maps, including 3D ones.

Bokeh shines in dashboarding: it makes dashboards easier to create, host, and style.

Without further ado, let’s start coding.

CODING

PRELIMINARIES

We will use the following packages to recreate the interactive maps we have with other articles:

import pandas as pd
#Transformation of Geocodes
from pyproj import Proj, transform
from bokeh.plotting import figure, save, show
from bokeh.io import output_notebook
#For the Map Tiles
from bokeh.tile_providers import get_provider, WIKIMEDIA, CARTODBPOSITRON, STAMEN_TERRAIN, STAMEN_TONER, ESRI_IMAGERY, OSM
tile_provider = get_provider(OSM)
#To display properly the maps
import panel as pn
pn.extension()
import warnings
warnings.filterwarnings("ignore")

A brief note on the panel library: we use it to render the map inline in the notebook, which matters when we want to add multiple datasets to the same map. Without it, the map is normally displayed on a separate web page.

LOAD THE DATA

df = pd.read_csv('Coffee Brands Footprint.csv',
                index_col=0)
Image by Author: The first five observations of our dataset.

CREATE BASE MAP – CONVERSION OF GEOCODES

As in the other articles, we need to create a base map where the additional data will be layered.

Notice that, unlike in the other articles, we have not used GeoPandas, so there is no geometry column for Bokeh to read.

Instead, we need to convert our geocodes ourselves: the map is drawn in Web Mercator (EPSG:3857), so the longitude and latitude pairs (EPSG:4326) must be projected to that system before Bokeh can plot them properly.

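#Despite the names, outProj (EPSG:4326, longitude/latitude) is the source
#and inProj (EPSG:3857, Web Mercator) is the target of each transform call
inProj = Proj(init='epsg:3857')
outProj = Proj(init='epsg:4326')
#Lower-left and upper-right corners of the map, projected to Web Mercator
ph_lon1, ph_lat1 = transform(outProj,inProj,115,0)
ph_lon2, ph_lat2 = transform(outProj,inProj,130,25)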

The geocodes above were generated from a trial and error process to produce the optimal boundaries for our maps. I suggest you do the same if you want to limit the map to a particular region, country, or city for better focus.
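For intuition about what the projection step computes, here is a minimal sketch of the standard spherical Web Mercator formula using only the standard library. The function name lonlat_to_web_mercator is an assumption made for this illustration and is not part of pyproj or Bokeh; the article itself relies on pyproj for the conversion.

```python
import math

# Radius of the sphere used by Web Mercator (EPSG:3857), in meters.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator meters.

    Hypothetical helper for illustration only.
    """
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# The same two corners used for the map bounds above.
x1, y1 = lonlat_to_web_mercator(115, 0)
x2, y2 = lonlat_to_web_mercator(130, 25)
print(round(x1), round(y1))
print(round(x2), round(y2))
```

The resulting values are in meters, which is why the projected bounds look nothing like the familiar longitude and latitude numbers.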

To initialize the base map:

#Initialize the tile that we will use
cartodb = get_provider(CARTODBPOSITRON)
#Initialize the fig object
fig = figure(plot_width=800, plot_height=700,
             x_range=(ph_lon1, ph_lon2),
             y_range=(ph_lat1, ph_lat2),
             x_axis_type="mercator", 
             y_axis_type="mercator",
             tooltips=[
                    ("Coffee Brand", "@brand"), ("Location", "@vicinity")
                    ],
            title="Coffee Shops in the Philippines")
fig.add_tile(cartodb)
fig.xaxis.visible = False 
fig.yaxis.visible = False
show(fig)

Let’s discuss the parts of the code that are unique to Bokeh’s figure function:

  • x_range, y_range – These are the boundaries of the figure. The X-axis refers to the longitude and the Y-axis refers to the latitude. Adjust these accordingly to create the desired bounds for your map.
  • x_axis_type, y_axis_type – These tell Bokeh how to interpret and label your axes. With "mercator", the tick labels are shown as longitude and latitude in degrees even though the underlying coordinates are in Web Mercator units.
  • tooltips – The list of fields to be shown in the tooltip. Each entry is a tuple: the first element is the label displayed in the tooltip and the second is a reference to a column of the data source, written as ‘@column_name’.
Image by the Author: Base Map created. Note that while trying to find the perfect boundaries, the axes should ideally be displayed so the longitude and latitude values are visible.
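To make the tooltip format concrete, the sketch below mimics how each ‘@column_name’ reference is filled in from one row of data. It is only an illustration of the idea and not Bokeh’s implementation: render_tooltip is a hypothetical helper and the sample row is made up.

```python
import re


def render_tooltip(tooltips, row):
    """Fill each '@column' reference with the matching value from one row.

    Hypothetical helper for illustration; Bokeh does this in the browser.
    """
    lines = []
    for label, template in tooltips:
        text = re.sub(r"@(\w+)", lambda m: str(row[m.group(1)]), template)
        lines.append(f"{label}: {text}")
    return lines


tooltips = [("Coffee Brand", "@brand"), ("Location", "@vicinity")]
row = {"brand": "Tim Hortons", "vicinity": "Makati"}  # made-up sample row
rendered = render_tooltip(tooltips, row)
print(rendered)
```

The column names after the @ sign must match the columns of the data passed as source, which is why the tooltips here use brand and vicinity.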

CONVERT GEOCODES

As we have noted, pairs of geocodes need to be converted, or ‘projected’, for them to make sense to Bokeh. To do so, we use the following code:

lons, lats = [], []
for lon, lat in list(zip(df["lng"], df["lat"])):
    x, y = transform(outProj,inProj,lon,lat)
    lons.append(x)
    lats.append(y)

df["MercatorX"] = lons
df["MercatorY"] = lats
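Once the projected columns exist, the map bounds could also be derived from the data rather than found by trial and error. The following is a minimal sketch of that idea, assuming a hypothetical helper padded_bounds and made-up sample coordinates standing in for df["MercatorX"] and df["MercatorY"].

```python
def padded_bounds(values, pad_fraction=0.05):
    """Return (low, high) covering all values plus a margin on each side.

    Hypothetical helper for illustration only.
    """
    low, high = min(values), max(values)
    margin = (high - low) * pad_fraction
    return low - margin, high + margin


# Made-up projected coordinates standing in for the MercatorX/MercatorY columns.
sample_x = [13460000.0, 13470000.0, 13800000.0]
sample_y = [1630000.0, 1650000.0, 1160000.0]

x_range = padded_bounds(sample_x)
y_range = padded_bounds(sample_y)
print(x_range, y_range)
```

The two tuples could then be passed as x_range and y_range when creating the figure. Hand-tuned bounds are still preferable when a few outlying points would otherwise stretch the view.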

After the conversion, if we want to display the entire dataset all at once, without referencing the brand, we can use the following code:

fig.circle('MercatorX', 'MercatorY',
           source=df,
           size=7,
           fill_color='red',
           line_color='red',
           line_alpha=0.5,
           fill_alpha=0.3)
show(fig)
GIF by the Author: First Interactive Map with All the Coffee Shop Locations in the Philippines

Since we want to display each brand in a different color, we need to add the brands separately. Followers of the series will recognize this approach from the Folium article.

TREATING EACH BRAND AS A DIFFERENT DATASET

The way to add in different colors for different brands is to treat each brand as a separate dataset.

But first, we need to establish a color dictionary:

color_dict = {
    "Starbucks": '#00704A',
    "Coffee Bean and Tea Leaf": '#362d26',
    "Coffee Project": '#654321',
    "Tim Hortons": '#dd0f2d'
}
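A stray character, such as a leading space, is easy to miss in a dictionary like this. Below is a small, optional sanity check, assuming every color is written as a six-digit hex string; Bokeh accepts other color formats too, so this is deliberately stricter than the library. The helper invalid_hex_colors and the sample palette are hypothetical.

```python
import re

# Matches exactly '#' followed by six hexadecimal digits.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def invalid_hex_colors(colors):
    """Return the entries whose value is not a '#rrggbb' string.

    Hypothetical helper for illustration only.
    """
    return {
        name: value
        for name, value in colors.items()
        if not HEX_COLOR.fullmatch(value)
    }


sample_palette = {
    "Starbucks": "#00704A",
    "Tim Hortons": " #dd0f2d",  # leading space, deliberately malformed
}
bad = invalid_hex_colors(sample_palette)
print(bad)
```

An empty result means every entry passed the check.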

We need a new fig object, as the last one already has the red dots drawn on it.

#Doing the Fig
fig = figure(plot_width=800, plot_height=700,
             x_range=(ph_lon1, ph_lon2),
             y_range=(ph_lat1, ph_lat2),
             x_axis_type="mercator", 
             y_axis_type="mercator",
             tooltips=[
                    ("Coffee Brand", "@brand"), ("Location", "@vicinity")
                    ],
            title="Coffee Shops in the Philippines")
fig.add_tile(cartodb)

To loop over the datasets:

#Looping over the dataset, one brand (and its color) at a time
for brand, color in color_dict.items():
    temp = df[df.brand==brand]
    fig.circle('MercatorX', 'MercatorY',
               source=temp,
               size=7,
               fill_color=color,
               line_color=color,
               line_alpha=0.5,
               fill_alpha=0.5)

Finally, to display the map, we need to do this:

pn.pane.Bokeh(fig)
GIF by the Author: Similar interactive Map Like What We Have in the Prior Articles

Note that show(fig) would also work but may display an error message, which is why we did not use it for this final map.

FINAL REMARKS

We see through the article that it is possible to replicate the same maps we had with the other libraries with Bokeh.

We do see, however, that certain steps are needed to ensure that Bokeh understands our data types (which is likewise the case for non-geospatial data types), and data scientists need to familiarize themselves with these steps.

Bokeh is great for creating dashboards, but we have yet to build one, so we have not seen its full functionality and features. That may be a good project to carry out in the future. Bokeh is an amazing tool to learn and add to the modern geospatial data scientist’s toolbox.

Let me know what you think!

