Interactive Data Science with Jupyter Notebooks

In my videos, you’ve seen me running Python code live on screen and showing the results. Today, I want to share with you how I’ve been doing this, and show how you can take advantage of it too!

The way I’ve been running Python code live on screen is by using a Python package called Jupyter. Jupyter is built on the IPython project and lets you run interactive Python in your browser. But it is so much more than that. From bash commands to special “magics” and plugins, Jupyter enhances the Python coding experience greatly.

If you’re already using Jupyter, I hope I can improve your workflow and show you some new tricks. If you’re not using Jupyter yet, then let’s dive right in.


Installation and Startup

The easiest way to install Jupyter is by running pip install jupyter, though if you use a packaged Python distribution such as Anaconda, you may already have it installed. Be sure to activate your Python environment first.
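If you’re not sure whether Jupyter is already in your active environment, a quick check like the one below can help. This is only a sketch: the helper name is_installed is made up for illustration, and I’m assuming that jupyter_core and notebook are the module names that pip install jupyter brings in.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be imported in the active environment."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: these are the importable modules behind `pip install jupyter`.
for name in ("jupyter_core", "notebook"):
    status = "found" if is_installed(name) else "missing (try: pip install jupyter)"
    print(f"{name}: {status}")
```

Run it with the same Python interpreter you plan to use for your notebooks, otherwise you may be checking the wrong environment.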

When running Jupyter locally, you’ll connect to a locally running webserver through your browser, typically on port 8888. Start your notebook by running jupyter notebook in your working directory. Jupyter will usually open in your browser automatically once it starts up, but if not, point your browser to localhost:8888.

If you don’t already have a notebook you want to open, you can create one by clicking on ‘New’ and selecting Python 2 or 3, depending on which version of Python you have running in your environment. Jupyter is quite flexible and can actually handle many languages and file types, though we’ll stick with just Python for now.

Running code in Jupyter Notebooks

Once you have a new notebook running, you can write some Python code in the empty cell and just hit ctrl+enter to run it. We can run all our typical Python code here, just like you might write in a Python script. The difference is that we can run it and see the results right away!

Notice what happens when we run a cell with ctrl+enter. The brackets on the left side of the cell show an asterisk while the cell is running or queued to run, and then show a number once it’s finished. That number represents the order in which cells were run during a given session, starting at ‘1’.

The result of the final line of a code cell is displayed as that cell’s output, but only if the value is not assigned to a variable. So for example, if I import tensorflow and then concatenate its version string with another string, the output is shown below the cell, even though I didn’t use the print command.

Of course, I can use print() as well. This is all very useful for tinkering and seeing how something behaves.
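Here’s a small sketch of that display rule, using plain strings rather than tensorflow so it runs anywhere. The variable names are just for illustration. Imagine each commented group as its own notebook cell.

```python
# Cell A: the last line is a bare expression, so a notebook displays its value.
greeting = "Hello"
greeting + ", Jupyter!"

# Cell B: the last line is an assignment, so a notebook displays nothing.
message = greeting + ", Jupyter!"

# Cell C: print() writes the text no matter where it sits in the cell.
print(message)
```

If you run this as a regular script instead of in a notebook, only the print() call produces any output.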

Shift+Tab

Another fantastic feature of Jupyter notebooks is the ability to show the docstring of a function you are calling by pressing shift+tab. This allows you to call a function with the correct arguments without needing to look up the full documentation every time.

This feature also works with your own local custom functions, so if you write good docstrings, you will be rewarded!
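As a rough example of what a tooltip-friendly docstring might look like, here’s a made-up helper called scale. The function and its arguments are hypothetical, purely for illustration.

```python
def scale(values, factor=2.0):
    """Multiply every number in `values` by `factor`.

    Args:
        values: An iterable of numbers.
        factor: The multiplier applied to each element (default 2.0).

    Returns:
        A new list containing the scaled numbers.
    """
    return [v * factor for v in values]


# The docstring text is stored on the function itself.
print(scale.__doc__)
```

In a notebook, typing scale( and pressing the docstring shortcut with the cursor inside the parentheses should bring up this same text.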

Outputs

When you have a lot of output, you can reduce the amount of space it takes up by clicking on the left-hand panel of the output, which turns it into a scrolling window. If you double-click, the output will be collapsed entirely.

More cells!

One cell is useful, but really we want to have many cells. To add a cell, click the ‘plus’ icon on the toolbar. There are also some cell execution commands which can lead to new cells being created.

If you press shift+enter, it will run the current cell and then highlight the next cell. If there is no next cell, a new one is created. On the other hand, if you want to create a new cell immediately after a given cell, you can use alt+enter to execute the cell and then insert a new cell directly after it.

Did someone say Markdown?

Perhaps the biggest feature that I’ve left out so far is the Markdown support. What first impressed me about Jupyter notebooks was that they gave me a great way to both write code and describe the code I was writing. The rich formatting of Markdown lets researchers and educators alike communicate thoughts and ideas easily and cleanly.

And perhaps most importantly, it allows past-you to tell future-you what a given code cell was supposed to do, in a way that can be much more expressive than using comment blocks!

Jupyter Magics

Sometimes I just want to do a quick check on how long one training or evaluation cycle is taking to execute. For an easy way to time your code, start a cell with %%time and once the cell finishes executing it will print out how long it took to run that cell. It’s not precision atomic timing, but it’s a great way to get some solid first impressions with very minimal effort.
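The %%time magic only works inside a notebook cell, so here’s a plain-Python sketch of roughly the same idea for comparison. The function train_one_cycle is a hypothetical stand-in for whatever you actually want to time.

```python
import time


def train_one_cycle(n=200_000):
    """Hypothetical stand-in for one training or evaluation cycle."""
    return sum(i * i for i in range(n))


# In a notebook, you would instead put %%time on the first line of the cell.
start = time.perf_counter()
result = train_one_cycle()
elapsed = time.perf_counter() - start

print(f"Wall time: {elapsed:.3f} s")
```

The magic is less typing and also reports CPU time, which is why it’s my go-to inside a notebook.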

If you want to run a command line command in a notebook, the easiest way is to put an exclamation point in front of the command. This is most useful for a one-off command.

If you want to run a bunch of commands, start a cell with ‘%%bash’ to cause the entire cell to be interpreted as a bash script.
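Both the exclamation point and %%bash are notebook-only syntax. If you ever move that code into a regular Python script, the standard library’s subprocess module is a rough equivalent. Here’s a minimal sketch that runs a tiny command and captures its output. It calls the current Python interpreter so that it works on any operating system.

```python
import subprocess
import sys

# Notebook one-liner this imitates:  !python -c "print('hello from the shell')"
completed = subprocess.run(
    [sys.executable, "-c", "print('hello from the shell')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the output as str rather than bytes
    check=True,           # raise CalledProcessError on a non-zero exit code
)

print(completed.stdout.strip())
```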

A great use of this is to kick off TensorBoard. Typically, running TensorBoard involves opening a new terminal window and starting it from the command line, which is what you’d want if you plan to keep it running for a while. But if you just want to spin it up, take a peek, and close it down, having it in a Jupyter notebook cell isn’t such a bad idea.

Plus, you’ll never forget to run it, since it’s embedded into your workflow of notebook cells! Notice that it will occupy your notebook, so you won’t be able to run anything else while TensorBoard is running. To stop it, click interrupt kernel. The asterisk will go away, and you’ll get control of the notebook back.

So there you have it, some of my favorite and most frequently used Jupyter features and capabilities. This is of course not a comprehensive discussion of everything Jupyter can do. There are many, many more features waiting for you to explore.


Thanks for reading this episode of Cloud AI Adventures. If you’re enjoying the series, please let me know by clapping for the article. If you want more machine learning action, be sure to follow me on Medium or subscribe to the YouTube channel to catch future episodes as they come out. More episodes coming at you soon!

P.S. if you’re still reading: How were the gifs in this post? Were they helpful? Distracting? Nifty? Confusing? Let me know!