Two essential Pandas add-ons

These two must-have UIs will help you level-up your Pandas skills

Josh Taylor
Towards Data Science

--

The Python Data Analysis Library (Pandas) is the de facto analysis tool for Python. It still amazes me that such a powerful analysis library can be open-source and free to use.

But it is not perfect…

Yes Pandas does have some shortcomings

There are a couple of frustrations that I have with the library especially when it comes to performing simple filtering and pivoting. There are certain situations where a user interface can really speed-up analysis. Nothing beats ‘drag-and-drop’ for an intuitive way of exploring and filtering data and this is not something that Pandas allows you to do. Thankfully there are two libraries which address these issues and work perfectly alongside Pandas.

Pivottable.js, interactive pivot tables and charts

The pivot table in Pandas is very powerful but it does not lend itself to quick and easy data exploration. In fact things can get very complex very quickly:

Pandas pivot tables can leave you scratching your head. Credit https://pbpython.com

Thankfully there is a fantastic interactive pivot-table and plotting add-on, pivottablejs. It can be installed and run in 4 lines of code:

!pip install pivottablejs
from pivottablejs import pivot_ui
pivot_ui(df,outfile_path=’pivottablejs.html’)
HTML(‘pivottablejs.html’)

This gives you an interactive HTML pivot chart. This can be displayed within a notebook or opened in a browser as an HTML file (this allows it to be easily shared with others):

pivottable.js allows for interactive pivoting and charting of a Pandas dataframe

QGrid: speedy, interactive tables

Fed-up of looking at the first and last 5 rows of a Pandas dataframe? How often have you wished that you could quickly filter and see what is happening with your data. Pandas does provide useful filtering functionality with loc and iloc however in the same way that pivot tables can become quite complex, so can statements using these indexing functions.

QGrid allows you to do this and much more. Some of the key features are:

  • Filter and sort dataframes
  • Scroll through large dataframes without loosing performance (>1million rows)
  • Edit cells in a dataframe directly through the UI
  • Return a new dataframe with the filters/sorts/edits applied
  • Compatible with Jupyter Notebooks and JupyterLab

Installation is simple via pip or Conda:

pip install qgrid
jupyter nbextension enable --py --sys-prefix qgrid
import qgrid# only required if you have not enabled the ipywidgets nbextension yet
jupyter nbextension enable --py --sys-prefix widgetsnbextension
#to show a df simply use the below:
qgrid.show_grid(df)

To get an idea of what is possible, see the below demo from the QGrid Github page:

Visit https://github.com/quantopian/qgrid for more information on QGrid

That’s all. Hopefully these two tools will help speed up your data analysis in Python.

If you know of any other UI libraries for Pandas, please let people know in the comments below.

--

--