
Making Interactive Line Plots with Python Pandas and Altair

The plots that will enhance your EDA process

Photo by Neven Krcmarek on Unsplash

Line plots are an essential part of data analysis. They give us an overview of how a quantity changes over sequential measurements. When working with time series, line plots become especially important.

Trend, seasonality, and correlation are some features that can be observed on carefully generated line plots. In this article, we will create interactive line plots using two Python libraries: Pandas and Altair.

Pandas provides the data and Altair makes beautiful and informative line plots. Although Pandas can also plot data, it is not a dedicated data visualization library. Besides, we will make the plots interactive, which the default Pandas plotting does not support.

Let’s start with getting the data. A typical use case of line plots is analyzing stock prices. One of the simplest ways to get stock price data is the pandas-datareader library. We first need to import it along with Pandas (already installed in Google Colab).

import pandas as pd
from pandas_datareader import data

We will get the prices of 3 different stocks for a period of 1 year. The start date, end date, and the source need to be specified.

start = '2020-1-1'
end = '2020-12-31'
source = 'yahoo'

One more piece of required information is the ticker symbol of the stock.

apple = data.DataReader("AAPL", start=start, end=end, data_source=source).reset_index()[["Date", "Close"]]
ibm = data.DataReader("IBM", start=start, end=end, data_source=source).reset_index()[["Date", "Close"]]
microsoft = data.DataReader("MSFT", start=start, end=end, data_source=source).reset_index()[["Date", "Close"]]
(image by author)

We now have stock prices of Apple, IBM, and Microsoft in 2020. It is better to have them in a single data frame. Before combining, we need to add a column that indicates which stock a particular price belongs to.

The following code block adds relevant columns and then combines the data frames by using the concat function.

apple["Stock"] = "apple"
ibm["Stock"] = "ibm"
microsoft["Stock"] = "msft"
stocks = pd.concat([apple, ibm, microsoft])
stocks["Month"] = stocks.Date.dt.month
(image by author)

We have also added the month information, which might be useful for analysis. We can now start creating the plots.
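If the remote data source is not reachable, the rest of the article can still be followed with a stand-in data frame of the same shape. The sketch below is an assumption of mine, not part of the original workflow: it builds three random-walk price series (the `make_prices` helper, the starting prices, and the seed are all hypothetical) and then applies the same concat and month steps described above.

```python
import numpy as np
import pandas as pd


def make_prices(name, start_price, dates, rng):
    # Hypothetical helper: a random walk standing in for real closing prices.
    steps = rng.normal(loc=0.0, scale=1.0, size=len(dates))
    close = start_price + np.cumsum(steps)
    return pd.DataFrame({"Date": dates, "Close": close, "Stock": name})


rng = np.random.default_rng(seed=0)
# Business days of 2020, mirroring the date range used in the article.
dates = pd.bdate_range("2020-01-01", "2020-12-31")

frames = [
    make_prices("apple", 75.0, dates, rng),
    make_prices("ibm", 130.0, dates, rng),
    make_prices("msft", 160.0, dates, rng),
]

# Stack the three frames, then derive the month from the Date column.
stocks = pd.concat(frames, ignore_index=True)
stocks["Month"] = stocks["Date"].dt.month

print(stocks.head())
```

The values are synthetic, so any trend in the resulting plots is meaningless; the point is only to have the Date, Close, Stock, and Month columns available.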


Altair

Altair is a statistical visualization library for Python. Its syntax is clean and easy to understand as we will see in the examples. It is also very simple to create interactive visualizations with Altair.

I will briefly explain the structure of Altair and then focus on creating interactive line plots. If you are new to Altair, here is an Altair tutorial as a 4-part series:


Here is a simple line plot that does not possess any interactivity.

import altair as alt
alt.Chart(stocks).mark_line().encode(
   x="Date",
   y="Close",
   color="Stock"
).properties(
   height=300, width=500
)
(image by author)

The basic structure starts with a top-level Chart object. The data can be in the form of a Pandas data frame or a URL string pointing to a JSON or CSV file. Then the type of visualization (e.g. mark_circle, mark_line, and so on) is specified.

The encode function tells Altair what to plot in the given data frame. Thus, anything we write in the encode function must be linked to the data. The color parameter distinguishes the different stock names. It is the same as the hue parameter of Seaborn. Finally, we specify certain properties of the plot using the properties function.

One method for adding interactivity to a plot is through selections. A selection in Altair captures interactions from the user.

selection = alt.selection_multi(fields=["Stock"], bind="legend")
alt.Chart(stocks).mark_line().encode(
   x="Date",
   y="Close",
   color="Stock",
   opacity=alt.condition(selection, alt.value(1), alt.value(0.1))
).properties(
   height=300, width=500
).add_selection(
   selection
)

The selection object above is based on the Stock column, which contains the names of the stocks. It is bound to the legend. We pass it to the opacity parameter so that the opacity of a line changes according to the selected stock name.

We also need to add the selection to the plot using the add_selection function. The following two images demonstrate how selection works. We just need to click on the stock name in the legend. Then, the plot is updated accordingly.

(image by author)
(image by author)

Altair provides other options to capture user interactions. For instance, we can create an interactive line plot that updates as you hover the mouse over it.

The following code creates a selection object that captures this hover interaction.

hover = alt.selection(
   type="single", on="mouseover", fields=["Stock"], nearest=True
)

We will use the selection object to capture the nearest point on the plot and then highlight the line this point belongs to.

There are 3 components in the following code. The first one creates the line plot. The second one is a scatter plot drawn on the line plot and it is used for identifying the nearest point. We adjust the opacity so that the scatter plot is not visible.

The third one is responsible for highlighting the line that contains the captured point in the second plot.

# line plot
lineplot = alt.Chart(stocks).mark_line().encode(
   x="Date:T",
   y="Close:Q",
   color="Stock:N",
)
# nearest point
point = lineplot.mark_circle().encode(
   opacity=alt.value(0)
).add_selection(hover)
# highlight
singleline = lineplot.mark_line().encode(
   size=alt.condition(~hover, alt.value(0.5), alt.value(3))
)

The interactive line plot can now be generated by combining the second and third plots.

point + singleline
(image by author)
(image by author)

The first image shows the original plot. The second image shows the updated version as I hover over the plot.


Conclusion

Altair offers flexible ways to add interactive components to a visualization. Once you understand selections and conditions, the building blocks used in this article, you can enrich your visualizations with them.

Thank you for reading. Please let me know if you have any feedback.

