Jupyter tools to increase productivity

Gautam Borgohain
Towards Data Science
6 min read · Jun 18, 2018

Photo by Philip Swinburn on Unsplash

“If the only tool you have is a hammer, you tend to see every problem as a nail.”

Abraham Maslow

Like so many other data scientists, I consider the Jupyter notebook an indispensable part of my data science toolkit. It is easy to use and incredibly powerful. In this post I will document some of the extensions and features that I think everybody using Jupyter should familiarise themselves with.

But wait, isn’t this old stuff now? Shouldn’t we be getting comfortable with JupyterLab, since it is just on the horizon?

JupyterLab is the evolution of Jupyter notebooks and introduces some much-awaited features. However, it is currently in beta, and when I tried it out I found that a lot of the cool extensions that have been built for JupyterLab don’t work well because of version compatibility issues. So even though JupyterLab is the future, I am going to stick with classic notebooks for at least a couple more months.

Jupyter contrib nbextensions

Did you know there are extensions for Jupyter notebook? Well, there are, and they are very useful!

Installation

The official GitHub repo has all the details for installation, so I will not repeat them here. But if you are too lazy to go over that, here is the TL;DR version with the basic steps:

pip install jupyter_contrib_nbextensions
pip install https://github.com/ipython-contrib/jupyter_contrib_nbextensions/tarball/master
jupyter contrib nbextension install --user

Also install the extensions configurator for easy toggling of extensions:

pip install jupyter_nbextensions_configurator
jupyter nbextensions_configurator enable --user

You are done!

Refresh the Jupyter home page and you should see something like this:

nbextensions tab

Here are some nbextensions that I find useful, with a short description of what they do:

1. Table of Contents (2): This creates a TOC automatically based on the header cells in the notebook. The "Display Table of Contents as a sidebar" option shows the TOC as a sidebar, which helps with keeping context and navigating across the notebook.

Table of contents

2. Collapsible Headings: Along with toc2, this is great for rolling up sections of the notebook that you don’t want to focus on at a particular time.

Collapsible heading

3. Codefolding: You can reduce the size of the cells by collapsing function definitions and long code blocks.

codefolding

4. ExecuteTime: The %%time cell magic is great for checking runtime. But with ExecuteTime, the runtime of every code cell is displayed at its bottom left, so you don’t need to remember to use the cell magic.

execute time

5. Gist-it: Great for creating gists on GitHub. Why gists? You can save different versions of the same notebook at different states during experiments. Also, gists are easier to share and refer back to.

6. Hide all Input: For times when you need code to generate an output but don’t need to show the code (for brevity), this is great for cleaning up a notebook for presentation.

Jupyter magic functions

There are a lot of magic functions in Jupyter. For the IPython kernel, here is the link to the docs.

Here are some line magic functions that I have found to be particularly useful:

  1. %cd <path>: A quick way to switch the working directory while in a notebook.
  2. %autoreload: Use this in every notebook where you import your own scripts or projects. The changes you make to them will be automatically reloaded into the Jupyter context.
%reload_ext autoreload
%autoreload 2
  3. %timeit: Runs a line of code multiple times and prints runtime statistics.
  4. %lprun and %mprun: Line profiler and memory profiler for Python. Great for optimising code:
line profiler in action

To use line_profiler and memory_profiler, install them using pip:

pip install line-profiler
pip install memory_profiler

Then load the extensions:

%load_ext line_profiler
%load_ext memory_profiler
  5. ! or %%bash: Bash commands are easy to use in a terminal, but if you have a sequence of commands that you need to run repeatedly, it’s so much better to have them in a notebook! A ! before a line will execute that line in the shell, e.g.:
!ls -lht

If you use cloud services like AWS and you have multiple CLI commands, putting them in a notebook makes life much easier, since you have a searchable history of the commands and their results.

Note: For some reason I have found that the cell magic %%bash prints the output only after execution is complete. That is fine for most use cases, but if, say, you are checking docker logs, you might be itching to check the progress.
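If waiting for the whole cell to finish is a problem, one possible workaround (my own sketch, not something from the original workflow) is to launch the command from a regular Python cell with the standard library's subprocess module and print each line as it arrives. The helper name stream_command is made up for illustration, and the tiny Python one-liner it runs is only a stand-in for a real long-running command.

```python
import subprocess
import sys


def stream_command(args):
    """Run a command, echoing each output line as soon as it is produced.

    Hypothetical helper for illustration; returns (exit code, list of lines).
    """
    lines = []
    # Merge stderr into stdout so both streams appear in order.
    with subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered in text mode
    ) as proc:
        for line in proc.stdout:
            print(line, end="", flush=True)
            lines.append(line.rstrip("\n"))
    # Leaving the with-block waits for the process, so returncode is set.
    return proc.returncode, lines


# Stand-in for a long-running command; swap in your own argument list.
code, output = stream_command(
    [sys.executable, "-u", "-c", "print('step 1'); print('step 2')"]
)
```

Passing the command as a list of arguments avoids shell quoting issues. Whether lines really show up immediately still depends on the child process flushing its own output.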

Extras

There are a ton of other cool things Jupyter can do. I can’t go over them all, but here is a short list:

  1. Jupyter widgets: Jupyter also supports widgets for useful interactions and even cool visualisations. The interaction widgets are great for creating interactive notebooks, and I suppose they would be even greater if you have JupyterHub set up so you can collaborate on notebooks.
  2. Cross-language support: It turns out the name Jupyter is a nod to Julia, Python and R, and there are a ton of kernels for different languages.
  3. Debugging: There are a lot of debugging options available for Jupyter notebooks, and which one is the most productive largely depends on how you write your code. For me, since I put most of the code in .py files, I debug code using the IPython debugger:
from IPython.core.debugger import set_trace

def func(x):
    x = x+2
    set_trace()
    x = x-2
    return x

func(2)

This will pop up an interactive box in the notebook to inspect variables.

There are easier-to-use options available too:

The %pdb line magic, which makes the debugger interface open automatically whenever there is an exception.

Or you can run the %debug line magic immediately after an exception to inspect the stack.

4. Jupyter notebook preview with Visual Studio Code

So notebooks are great, and you can do much more with them when you are more organised. However, versioning of notebooks is still a problem. You can end up making tens of notebooks for a project, and if you want to hunt down some code you wrote early on, you will have to open them up one by one or try to remember relevant keywords to grep. I don’t have a clean solution for the versioning problem other than using naming conventions and taking advantage of folder structures, but it helps if I can preview a notebook without having to start a kernel and wait for it to load (especially if the notebook is large). Visual Studio Code to the rescue! This handy text editor is very similar to Sublime Text and has a plugin, vscode-nbpreviewer, which lets you open notebooks without starting a kernel, so they load super fast.

Another perk of using Visual Studio Code: Spyder-like code cells in scripts, so you can run code from a script one cell at a time.

Also, it comes as an optional install with Anaconda!
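For the "hunt down some code" problem mentioned above, it also helps that an .ipynb file is plain JSON, so code cells can be searched without opening a notebook or starting a kernel. Here is a minimal sketch of that idea using only the standard library. The function name find_in_notebooks is made up for illustration, and it assumes the usual notebook layout with a top-level "cells" list.

```python
import json
from pathlib import Path


def find_in_notebooks(root, keyword):
    """Yield (path, cell index, line) for code-cell lines containing keyword.

    Hypothetical helper for illustration; assumes notebooks with a
    top-level "cells" list, as in the current notebook format.
    """
    for path in sorted(Path(root).rglob("*.ipynb")):
        # Skip Jupyter's autosaved checkpoint copies.
        if ".ipynb_checkpoints" in path.parts:
            continue
        try:
            nb = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # not valid notebook JSON, ignore it
        if not isinstance(nb, dict):
            continue
        for index, cell in enumerate(nb.get("cells", [])):
            if cell.get("cell_type") != "code":
                continue
            source = cell.get("source", "")
            # "source" is stored either as one string or a list of lines.
            text = "".join(source) if isinstance(source, list) else source
            for line in text.splitlines():
                if keyword in line:
                    yield path, index, line.strip()
```

Looping over find_in_notebooks(".", "read_csv") and printing each result would list every code cell under the current folder that mentions read_csv, which narrows down which notebook to open.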

That’s all I have for now. As I come across more things I find useful, I will keep updating this. Thanks for reading!
