Interactive Bar Charts with Bokeh

And Other Visual Enhancement Techniques

Ilya Hopkins

Published in

Towards Data Science

7 min read · May 15, 2020

Introduction

This article is the second part of my Bokeh love story. The full story (including Jupyter notebooks and all files) is on my GitHub. The first part is described in the Medium article “Easy Data Visualization Techniques with Bokeh”.

Bar Charts

No data story is complete without visualizations, and bar charts are arguably one of the most loved ways to represent categorical data. There is a myriad of types, palettes and styles of these Lego-like graphs, which is why I decided to devote a separate article to them.

Let’s start with the simple vertical and horizontal bar charts. We will get to the more complex ones in a jiffy.

It is pretty straightforward to draw bar charts with Bokeh. As usual, we need to specify a type of chart (or choose a glyph) and pass the data to the plotting function. Let’s create a vertical bar chart showing changes in measles occurrences in the US over the years 2000–2015 using the same UN world healthcare indicators database.

from bokeh.plotting import figure, show

# `data` (the UN health indicators DataFrame) and TOOLS are assumed
# to be defined as in part one of this series

# Creating a list of categories
years = data[data['country']=='United States of America']['year']

# Creating the list of values
values = data[data['country']=='United States of America']['measles']

# Initializing the plot
# (plot_height is the Bokeh 2.x argument; Bokeh 3.x renamed it to height)
p = figure(plot_height=300,
           title="Measles in the USA 2000-2015",
           tools=TOOLS)

# Plotting
p.vbar(years,                      # categories
       top=values,                 # bar heights
       width=.9,
       fill_alpha=.5,
       fill_color='salmon',
       line_alpha=.5,
       line_color='green',
       line_dash='dashed')

# Labeling the axes
p.xaxis.axis_label = "Years"
p.yaxis.axis_label = "Measles stats"

show(p)

Voila!
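The snippet above relies on names carried over from part one. For readers who want to run it in isolation, here is a minimal sketch under stated assumptions: the tiny DataFrame holds placeholder numbers (not real measles figures) that only mimic the assumed column layout, and the TOOLS string is my guess at a reasonable toolbar, not necessarily the one used in part one.

```python
import pandas as pd
from bokeh.plotting import figure

# Placeholder rows, NOT real figures: they only mimic the column layout
# (country, year, measles) that the article's `data` is assumed to have.
data = pd.DataFrame({
    "country": ["United States of America"] * 4,
    "year": [2000, 2001, 2002, 2003],
    "measles": [10, 20, 15, 5],
})

# Assumed toolbar configuration; part one defines its own TOOLS.
TOOLS = "pan,wheel_zoom,box_zoom,reset,save"

# Filter once and reuse the result for both the categories and the values.
usa = data[data["country"] == "United States of America"]

p = figure(title="Measles in the USA (placeholder data)", tools=TOOLS)
renderer = p.vbar(x=usa["year"], top=usa["measles"], width=0.9,
                  fill_color="salmon", fill_alpha=0.5)
p.xaxis.axis_label = "Years"
p.yaxis.axis_label = "Measles stats"

# To render: from bokeh.plotting import show; show(p)
```

Filtering the DataFrame once keeps the categories and the bar heights aligned, since both come from the same rows.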

In exactly the same fashion, we can create horizontal bar charts. Let’s use reported polio rates for Argentina in 2000–2015 for illustration purposes.

# Creating a list of categories
years = data[data['country']=='Argentina']['year']

# Creating the list of values
values = data[data['country']=='Argentina']['polio'].values

# Initializing the plot
p = figure(plot_height=300,
           title="Polio in Argentina 2000-2015")

# Plotting
p.hbar(years,                      # categories (y coordinates)
       left=0,
       right=values,               # bar lengths
       height=.9,
       fill_color='azure',
       line_color='green',
       line_alpha=.5)

# With horizontal bars the years run along the y axis
p.yaxis.axis_label = "Years"
p.xaxis.axis_label = "Polio stats"

show(p)

The code is super-intuitive; we just have to remember we are working with horizontal bars.

Our bar charts are rendered in a very simple manner, and they definitely can benefit from some added make-up.

Styling Bar Charts

For a list of some available palettes please visit the Bokeh palettes documentation. In order to use any of them with Bokeh we need to import them specifically.
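To make the import step concrete, here is a small sketch of what a palette actually is once imported. It only uses palette names that exist in bokeh.palettes; the seed value is an arbitrary choice of mine to make the random pick repeatable.

```python
import random
from bokeh.palettes import Spectral5, Viridis256, Colorblind

# A fixed-size palette is a plain sequence of hex color strings.
print(len(Spectral5), Spectral5[0])

# Some palette names are dictionaries keyed by the number of colors wanted.
five_colorblind = Colorblind[5]

# Drawing random colors from a large palette; the seed is arbitrary
# and only there so that the same colors come back on every run.
random.seed(42)
random_five = random.sample(Viridis256, 5)
print(random_five)
```

Because palettes are ordinary sequences, slicing, reversing and sampling all work on them before they are handed to a glyph.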

Let’s look at the measles data for a number of countries in 2015. We’ll render two graphs, one with a pre-set palette and one with randomly chosen colors, and arrange them with the *gridplot* technique.

Palettes and Gridplot

import random
from bokeh.layouts import gridplot

# Importing palettes
from bokeh.palettes import Spectral5, Viridis256, Colorblind, Magma256, Turbo256

# `countries` is assumed to be a list of five country names defined earlier

# Creating a list of categorical values
values = data[(data['year']==2015)&(data['country'].isin(countries))]['measles']

# Set the x_range to the list of categories above
p1 = figure(x_range=countries,
            plot_height=250,
            title="Measles in the world in 2015 (pre-set palette)")

# Categorical values can also be used as coordinates
p1.vbar(x=countries, top=values, width=0.9,
        color=Spectral5, fill_alpha=.75)

# Set some properties to make the plot look better
p1.yaxis.axis_label = "Measles stats"
p1.xgrid.grid_line_color = 'gray'
p1.xgrid.grid_line_alpha = .75
p1.xgrid.grid_line_dash = 'dashed'
p1.ygrid.grid_line_color = 'blue'
p1.ygrid.grid_line_alpha = .55
p1.ygrid.grid_line_dash = 'dotted'

p2 = figure(x_range=countries,
            plot_height=250,
            title="Measles in the world in 2015 (randomly selected colors from a palette)")

# Categorical values can also be used as coordinates
p2.vbar(x=countries, top=values, width=0.9,
        color=random.sample(Viridis256, 5), fill_alpha=.75)

# Set some properties to make the plot look better
p2.yaxis.axis_label = "Measles stats"
p2.xgrid.grid_line_color = 'gray'
p2.xgrid.grid_line_alpha = .75
p2.xgrid.grid_line_dash = 'dashed'
p2.ygrid.grid_line_color = 'blue'
p2.ygrid.grid_line_alpha = .55
p2.ygrid.grid_line_dash = 'dotted'

p = gridplot([[p1, None], [p2, None]], toolbar_location='right')
show(p)

And here’s the result:

This plot looks much friendlier than the ones we started with. And there’s no end to experiments with colors and palettes.
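Since both plots above share the same styling, one possible refactoring is to move it into a helper and let gridplot stack the results. This is only a sketch: the helper name styled_bar, the categories, the values and the color lists are all placeholders of mine, not part of the original notebook.

```python
from bokeh.layouts import gridplot
from bokeh.plotting import figure

# Placeholder categories and values, not taken from the dataset
categories = ["A", "B", "C"]
values = [3, 1, 2]


def styled_bar(title, colors):
    """Build one categorical bar chart with the shared grid styling."""
    p = figure(x_range=categories, title=title)
    p.vbar(x=categories, top=values, width=0.9, color=colors, fill_alpha=0.75)
    p.yaxis.axis_label = "Value"
    p.xgrid.grid_line_color = "gray"
    p.xgrid.grid_line_dash = "dashed"
    p.ygrid.grid_line_color = "blue"
    p.ygrid.grid_line_dash = "dotted"
    return p


p1 = styled_bar("First palette", ["salmon", "green", "navy"])
p2 = styled_bar("Second palette", ["gold", "teal", "purple"])

# ncols=1 stacks the plots vertically, much like [[p1, None], [p2, None]]
grid = gridplot([p1, p2], ncols=1, toolbar_location="right")

# To render: from bokeh.plotting import show; show(grid)
```

Passing a flat list with ncols lets gridplot work out the rows itself, which is handy when the number of plots changes.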

Grouped Bar Charts

Sometimes we need to plot a grouped bar chart. For example, we might need to group our health indicators for some countries. For that, we need to import a special class from the bokeh.models module: FactorRange. Let’s look at the data for measles, polio and hiv/aids*1000 for our list of countries for 2014.

from bokeh.models import ColumnDataSource, FactorRange

# List of used statistics
stats = ['measles','polio','hiv/aids*1000']

# Selecting the 2014 rows for our countries once
data_2014 = data[(data['year']==2014) & (data['country'].isin(countries))]

# Creating a dictionary of our data
mdata = {'countries'     : countries,
         'measles'       : data_2014['measles'],
         'polio'         : data_2014['polio'],
         'hiv/aids*1000' : data_2014['hiv/aids']*1000}

# Creating tuples for individual bars
# [ ("France", "measles"), ("France", "polio"), ("France", "hiv/aids*1000"), ("Canada", "measles"), ... ]
x = [ (country, stat) for country in countries for stat in stats ]
counts = sum(zip(mdata['measles'], mdata['polio'], mdata['hiv/aids*1000']), ())

# Creating a column data source - Bokeh's own data type with the fields (Country,[stats],[values],[colors])
source = ColumnDataSource(data=dict(x=x, counts=counts, color=random.sample(Turbo256,15)))

# Initializing our plot
p = figure(x_range=FactorRange(*x), plot_height=350, title="Health Stats by Country")

# Plotting our vertical bar chart
p.vbar(x='x', top='counts', width=0.9, fill_color='color', source=source)

# Enhancing our graph
p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xaxis.major_label_orientation = .9
p.xgrid.grid_line_color = None

show(p)

And here’s the plot:

We could also use any of the many built-in methods to adjust the visual to our liking.
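The least obvious line in the grouped-chart code is the one that builds counts with sum(zip(...), ()). Here is a pure-Python sketch of what it does, using two placeholder countries and made-up numbers so that the ordering is easy to follow.

```python
# Placeholder inputs, chosen only to make the ordering visible
countries = ["France", "Canada"]
stats = ["measles", "polio", "hiv/aids*1000"]
mdata = {
    "measles": [1, 2],          # one value per country
    "polio": [3, 4],
    "hiv/aids*1000": [5, 6],
}

# One (country, stat) tuple per bar, country by country
x = [(country, stat) for country in countries for stat in stats]

# zip groups the values per country: (1, 3, 5), (2, 4, 6);
# sum(..., ()) then concatenates those tuples into one flat tuple
counts = sum(zip(mdata["measles"], mdata["polio"], mdata["hiv/aids*1000"]), ())

# An equivalent and arguably more readable spelling
counts_alt = tuple(mdata[stat][i]
                   for i in range(len(countries))
                   for stat in stats)

for factor, value in zip(x, counts):
    print(factor, value)
```

The point is that x and counts must be in the same order: all the statistics of the first country, then all the statistics of the second one, and so on.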

Color Transformations

Quite often we are not satisfied with a pre-set or a random palette, and we need some additional color mapping. That is when we use the factor_cmap function from the bokeh.transform module. Let’s look at the Canadian data for measles, polio and hiv/aids*1000 in 2000, 2005, 2010 and 2015 respectively.

from bokeh.transform import factor_cmap

# List of used statistics
stats = ['measles','polio','hiv/aids*1000']
years = ['2000','2005','2010','2015']

# Selecting the Canadian rows for our years once
# (the year column is cast to str so that it matches the string labels above)
canada = data[(data['country']=="Canada") & (data['year'].astype(str).isin(years))]

# Creating a dictionary of our data
mdata = {'years'         : years,
         'measles'       : canada['measles'],
         'polio'         : canada['polio'],
         'hiv/aids*1000' : canada['hiv/aids']*1000}

# Creating tuples for individual bars
x = [ (year, stat) for year in years for stat in stats ]
counts = sum(zip(mdata['measles'], mdata['polio'], mdata['hiv/aids*1000']), ())

# Creating a column data source
source = ColumnDataSource(data=dict(x=x, counts=counts, color=random.sample(Turbo256,12)))

# Initializing our plot with random colors
p1 = figure(x_range=FactorRange(*x), plot_height=350, title="Health Stats in Canada 2000-2015")

# Plotting our vertical bar chart
p1.vbar(x='x', top='counts', width=0.9, fill_color='color', source=source)

# Enhancing our graph
p1.y_range.start = 0
p1.x_range.range_padding = 0.1
p1.xaxis.major_label_orientation = .9
p1.xgrid.grid_line_color = None

# Creating a new column data source without set colors
source1 = ColumnDataSource(data=dict(x=x, counts=counts))

# Initializing our plot with synchronized fill colors with factor_cmap
p2 = figure(x_range=FactorRange(*x), plot_height=350,
            title="Health Stats in Canada 2000-2015, color mapped")

p2.vbar(x='x', top='counts', width=0.9,
        source=source1,
        fill_color=factor_cmap('x', palette=['salmon', 'green', 'navy'],
                               factors=stats, start=1, end=2))

p2.xaxis.major_label_orientation = .7

p = gridplot([[p1, None], [p2, None]], toolbar_location='right')
show(p)

And here we are — the first chart has some random colors, the second one is color factored:

Even though the first one looks funkier, the second one sends a much clearer message, because each color maps to one statistic.
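The start=1, end=2 arguments are what make the color-mapped chart work, so here is a stripped-down sketch of just that part. The counts are placeholders; as far as I understand the API, start and end slice each (year, stat) tuple so that only its second element is matched against the factors.

```python
from bokeh.models import ColumnDataSource, FactorRange
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

years = ["2000", "2005"]
stats = ["measles", "polio", "hiv/aids*1000"]

# One (year, stat) tuple per bar; counts are placeholder numbers
x = [(year, stat) for year in years for stat in stats]
counts = [1, 2, 3, 4, 5, 6]
source = ColumnDataSource(data=dict(x=x, counts=counts))

# start=1, end=2 tells the mapper to look only at the second level of
# each factor (the statistic), so equal statistics get equal colors.
cmap = factor_cmap("x", palette=["salmon", "green", "navy"],
                   factors=stats, start=1, end=2)

p = figure(x_range=FactorRange(*x), title="Color mapped by statistic")
p.vbar(x="x", top="counts", width=0.9, source=source, fill_color=cmap)

# To render: from bokeh.plotting import show; show(p)
```

Without start and end the mapper would compare whole tuples against the list of statistics, and nothing would match.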

Adding Labels to the Visuals

Plotting a single label in Bokeh is quite straightforward and doesn’t really require any specific technique. We just need to import the Label class from the bokeh.models.annotations module, and its syntax is quite simple.
One just needs to know that Bokeh uses a separate layer for plotting, another one for labeling, etc. We will use the add_layout() method in order to assemble our visual together. Let’s look at an example and create a graph of measles in Spain in 2000–2015.

from bokeh.models.annotations import Label

# Initializing our plot
p = figure(x_range=(2000,2015), title='Measles in Spain 2000-2015')

# Plotting a line
p.line(data[data['country']=='Spain']['year'],
       data[data['country']=='Spain']['measles'],
       line_color='navy',
       line_width=3)

# Plotting data points as circles
p.circle(data[data['country']=='Spain']['year'],
         data[data['country']=='Spain']['measles'],
         radius=.2,
         fill_color='yellow',
         line_color='salmon')

# Instance of Label class as our 2011 Measles Outbreak label
label = Label(x=2011,
              y=max(data[data['country']=='Spain']['measles']),
              x_offset=10,
              text="2011 Outbreak",
              text_baseline="top")

# Adding a layout with our label to the graph
p.add_layout(label)

# Styling the graph
p.xaxis.axis_label = 'Year'
p.yaxis.axis_label = 'Measles stats'
p.xgrid.grid_line_dash = 'dashed'
p.xgrid.grid_line_color = 'gray'
p.ygrid.grid_line_dash = 'dotted'
p.ygrid.grid_line_color = 'gray'
p.background_fill_color = 'green'
p.background_fill_alpha = .05

show(p)

Voila!
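The label above is pinned to a hard-coded year. As a variation, here is a minimal sketch that derives both coordinates from the data, so the label follows the peak wherever it is. The small DataFrame holds placeholder values, not the real Spanish figures.

```python
import pandas as pd
from bokeh.models import Label
from bokeh.plotting import figure

# Placeholder values that only imitate the shape of the Spanish series
spain = pd.DataFrame({
    "year": [2009, 2010, 2011, 2012],
    "measles": [5, 12, 40, 9],
})

# Row holding the largest value: it supplies both label coordinates
peak = spain.loc[spain["measles"].idxmax()]
peak_year = int(peak["year"])
peak_value = int(peak["measles"])

p = figure(title="Measles (placeholder data)")
p.line(spain["year"], spain["measles"], line_color="navy", line_width=3)

label = Label(x=peak_year, y=peak_value, x_offset=10,
              text=f"{peak_year} Outbreak", text_baseline="top")
p.add_layout(label)

# To render: from bokeh.plotting import show; show(p)
```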

Adding a single “custom” label is really quite simple. The beauty of Bokeh is that adding a whole set of labels is hardly a tad more difficult. Let’s look at the example of polio in India in 2000–2015 and try adding values to every datapoint. For that we simply need an instance of the ColumnDataSource class and the LabelSet class, imported from the bokeh.models module.

from bokeh.models import LabelSet

# Instance of ColumnDataSource
source = ColumnDataSource(data=dict(
    x=data[data['country']=='India']['year'],
    y=data[data['country']=='India']['polio'],
    labels=data[data['country']=='India']['polio'].values))

# Initializing our plot
p = figure(x_range=(1999,2016),
           y_range=(50,90),
           title='Polio in India 2000-2015')

# Plotting data points as vertical bars
p.vbar(x='x',
       top='y',
       width=.8,
       fill_color='azure', fill_alpha=1,
       line_color='navy', line_alpha=.25,
       line_width=2, line_dash='dotted',
       source=source)

# Plotting a line
p.line(x='x',
       y='y',
       line_color='red', line_width=4,
       line_alpha=.5,
       source=source)

# Plotting data points as circles
p.circle(x='x', y='y',
         radius=.2,
         fill_color='yellow', line_color='red', line_width=2,
         source=source)

# Instance of the LabelSet class
# (render_mode='canvas' was the default and is omitted: Bokeh 3.x no longer accepts it)
labels = LabelSet(x='x',                      # positions of labeled datapoints
                  y='y',
                  text='labels',              # labels' text
                  level='glyph',              # labeling level
                  x_offset=-10, y_offset=15,  # move from datapoints
                  source=source,
                  text_baseline='bottom')     # relative position to datapoints

p.add_layout(labels)

p.xaxis.axis_label = 'Year'
p.yaxis.axis_label = 'Polio stats'
p.xgrid.grid_line_dash = 'dashed'
p.xgrid.grid_line_color = 'gray'
p.ygrid.grid_line_dash = 'dotted'
p.ygrid.grid_line_color = 'gray'
p.background_fill_color = 'salmon'
p.background_fill_alpha = .05

show(p)

And that’s it:

Other Interactive Techniques

There are quite a number of other interactive techniques that can really reshape your visualization and give your data-driven story a whole new dimension. Just to name a few: linked panning, linked brushing, hovering, etc. Some of them are illustrated in the corresponding project’s notebook on GitHub.
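To give a taste of these techniques, here is a compact sketch that combines three of them on placeholder data. It is not taken from the project notebook: the numbers, titles and tooltip labels are mine. Sharing the x_range links panning, sharing the ColumnDataSource links selections, and HoverTool adds tooltips.

```python
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Placeholder numbers, not taken from the dataset
source = ColumnDataSource(data=dict(
    year=[2000, 2001, 2002, 2003],
    measles=[4, 9, 2, 7],
    polio=[90, 91, 93, 92],
))

TOOLS = "pan,wheel_zoom,box_select,reset"

p1 = figure(title="Measles (placeholder)", tools=TOOLS)
p1.vbar(x="year", top="measles", width=0.8, fill_color="salmon", source=source)

# Linked panning: the second plot reuses the first plot's x range
p2 = figure(title="Polio (placeholder)", tools=TOOLS, x_range=p1.x_range)

# Linked brushing: both glyphs read from the same data source,
# so a box selection in one plot highlights the same years in the other
p2.vbar(x="year", top="polio", width=0.8, fill_color="navy", source=source)

# Hovering: @name pulls the value from the matching source column
p1.add_tools(HoverTool(tooltips=[("Year", "@year"), ("Measles", "@measles")]))
p2.add_tools(HoverTool(tooltips=[("Year", "@year"), ("Polio", "@polio")]))

grid = gridplot([[p1, p2]])

# To render: from bokeh.plotting import show; show(grid)
```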

Bokeh is truly an endless source of inspiration. I cannot praise enough its simplicity, its smooth learning curve and the wonderful interactive visuals one can render in just a few lines of code!