Magic Commands for Profiling in Jupyter Notebook

Assess the time and memory complexity of your code within your notebook

Remi Perrier
Towards Data Science
6 min read · Jan 26, 2021

Photo by Agê Barros on Unsplash

Jupyter Notebook offers dynamic interaction with Python and lets us create documents mixing code, text, images, and much more. Notebooks are powered by IPython, which provides interactive computing with Python and extends its capabilities in many ways. One of them is the addition of Magic Commands.

Introduction to Magic Commands

Magic Commands are succinct solutions to common obstacles. You have probably encountered them already, maybe without knowing it. They can be identified by their prefix: % or %%.

Line Magics VS Cell Magics

There are two kinds of magic commands: Line Magics (% prefix) and Cell Magics (%% prefix). Line magics operate only on their own line, whereas cell magics operate on their whole cell. To work, a cell magic must be on the first line of the cell, even before comments!

Let’s take an example to see magic commands at work. One of the easiest is %time, which measures the execution time of a statement.

By prefixing a statement with %time, we tell IPython that we want to know the execution time of this line; the result is printed in the output. As mentioned above, a line magic operates only on its own line, so if that line runs several times (inside a loop, for instance), the magic runs each time and prints one measurement per run.
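As a rough sketch of what this looks like, the snippet below shows the notebook form in a comment and a plain-Python approximation underneath. The function slow_square and the input sizes are hypothetical and not taken from the original example, and time.perf_counter only approximates the wall time that %time reports.

```python
import time


def slow_square(n):
    """Hypothetical stand-in for the statement being timed."""
    return sum(i * i for i in range(n))


# Notebook form (IPython syntax, kept as a comment so this block is plain Python):
#     for n in (10_000, 100_000):
#         %time slow_square(n)
#
# Plain-Python approximation: one measurement per loop iteration,
# just as the line magic prints one result each time its line runs.
timings = []
for n in (10_000, 100_000):
    start = time.perf_counter()
    slow_square(n)
    elapsed = time.perf_counter() - start
    timings.append(elapsed)
    print(f"n={n}: wall time {elapsed * 1e3:.2f} ms")
```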

That is not ideal when we want the total execution time. However, there is an easy solution: %time is also available as a cell magic. Let’s try again using %%time.

By prefixing the cell with a cell magic, IPython knows that it must consider everything inside the cell.
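A hedged sketch of the cell-level version follows. Again, slow_square is a hypothetical workload; the plain-Python part measures the whole block once, reporting both wall time and CPU time, which are the two kinds of figures %%time prints.

```python
import time


def slow_square(n):
    """Hypothetical workload, used only for illustration."""
    return sum(i * i for i in range(n))


# Notebook form (the cell magic must be the very first line of the cell):
#     %%time
#     for n in (10_000, 100_000):
#         slow_square(n)
#
# Plain-Python approximation: a single measurement around the whole block.
wall_start = time.perf_counter()
cpu_start = time.process_time()

for n in (10_000, 100_000):
    slow_square(n)

wall_total = time.perf_counter() - wall_start
cpu_total = time.process_time() - cpu_start
print(f"CPU time: {cpu_total * 1e3:.2f} ms, wall time: {wall_total * 1e3:.2f} ms")
```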

To learn more about a specific magic command, there is no need to search the web: everything is available within the Jupyter notebook. You can use %your_magic_command? to access the documentation of a command.

Magic Commands for Profiling

There are plenty of magic commands, each useful in different scenarios. Let’s explore some of those dedicated to profiling.

%time

The first one is %time. As shown previously, it measures execution time, which is great for knowing how long a cell needed to run. However, the measure from %time shouldn’t be taken as the average run time: it is based on a single run and can vary from one run to another.

%timeit

For more accuracy, %timeit is the solution. It runs the code multiple times to compute the mean execution time and its standard deviation.

In the output you can see 7 runs and 100000 loops each. Since our statement needs very little time to execute, IPython executes it 100000 times and divides the total time by the number of loops to get a more precise measure. This process is then repeated 7 times to compute the mean time and its standard deviation.
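The arithmetic described above can be sketched with the standard-library timeit module. This is an approximation of what %timeit does, not its actual implementation: the statement is a hypothetical fast one-liner, and the loop count is fixed at 100000 by hand here, whereas IPython chooses it automatically.

```python
import statistics
import timeit

STATEMENT = "sum(range(100))"  # hypothetical fast statement
LOOPS = 100_000                # loops per run (chosen by hand in this sketch)
RUNS = 7                       # number of runs

# Each entry is the total time of one run, i.e. LOOPS executions of the statement.
run_totals = timeit.repeat(STATEMENT, repeat=RUNS, number=LOOPS)

# Dividing by the loop count gives the time of a single execution within that run.
per_loop = [total / LOOPS for total in run_totals]

mean = statistics.mean(per_loop)
std_dev = statistics.stdev(per_loop)
print(f"{mean * 1e9:.0f} ns ± {std_dev * 1e9:.0f} ns per loop "
      f"(mean ± std. dev. of {RUNS} runs, {LOOPS} loops each)")
```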

IPython automatically adjusts these values based on execution time: fast code is run many times, slower code fewer times. However, you can also tell %timeit how many runs you want. Indeed, you can pass arguments to magic commands!

Here, using -r and -n, we specify that we want 20 runs executing the statement 100 times each.
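For comparison, here is a small sketch of both modes using the standard-library timeit module. The statement is hypothetical, and Timer.autorange is only a stand-in for IPython's own automatic choice, which may pick a different loop count.

```python
import timeit

timer = timeit.Timer("sum(range(100))")  # hypothetical statement

# Automatic choice: autorange() increases the loop count until
# a single run takes at least 0.2 seconds.
auto_loops, auto_total = timer.autorange()
print(f"automatic: {auto_loops} loops took {auto_total:.3f} s")

# Explicit choice, the counterpart of:
#     %timeit -r 20 -n 100 sum(range(100))
run_totals = timer.repeat(repeat=20, number=100)
best_per_loop = min(run_totals) / 100
print(f"explicit: 20 runs, 100 loops each, best {best_per_loop * 1e9:.0f} ns per loop")
```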

What’s more, %timeit can also be used as a cell magic. It then measures the execution time of the whole cell instead of a single statement.

%prun

A program is composed of numerous statements interacting with each other. You can use %timeit to check that each of your elementary code blocks is fast, but when running the full script one of them may call another a tremendous number of times, leading to a long execution time. To assess time performance at the program scale, you need %prun.

To get a better insight into %prun, let’s define a small function.
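The original function is not reproduced here, so the sketch below is a hypothetical reconstruction: the name create_and_sum_matrix appears later in this article, but its body and the 100 x 100 size are assumptions. It is profiled with the standard-library cProfile module, which produces the same kind of table as %prun.

```python
import cProfile
import io
import pstats
from random import random


def create_and_sum_matrix(n=100):
    """Hypothetical example: build an n x n matrix of random floats and sum it."""
    matrix = [[random() for _ in range(n)] for _ in range(n)]
    return sum(sum(row) for row in matrix)


# Run the function under the profiler; runcall returns the function's result.
profiler = cProfile.Profile()
total = profiler.runcall(create_and_sum_matrix)

# Render the statistics table (ncalls, tottime, percall, cumtime) into a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).print_stats()
report = buffer.getvalue()
print(report)
```

In a notebook, the corresponding call would be %prun create_and_sum_matrix().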

We can see that %prun breaks our program down into its sub-components. For each function we have:

  • ncalls: Number of times the function was called.
  • tottime: Total time spent in the function itself, excluding the sub-functions it calls.
  • percall: Average time needed per function call (tottime/ncalls).
  • cumtime: Total time spent inside the function and the sub-functions it calls.

In this script, random is called 10000 times for a total execution time of 0.095s. Since random is called within a list comprehension we can see that list comprehension has a higher cumtime than tottime.

With a more complex program using numerous functions, the %prun output becomes difficult to read. To make it easier to understand, the output can be sorted by a key passed through -s, for example -s cumulative to sort by cumtime.
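Sorting can be sketched outside the notebook with pstats, the standard-library module that formats profiler statistics. The helper functions below are hypothetical and exist only to give the profiler several entries to rank.

```python
import cProfile
import io
import pstats
from random import random


def build_row(n):
    """Hypothetical helper: one row of random floats."""
    return [random() for _ in range(n)]


def build_matrix(n):
    """Hypothetical helper: n rows of n random floats."""
    return [build_row(n) for _ in range(n)]


def create_and_sum_matrix(n=100):
    matrix = build_matrix(n)
    return sum(sum(row) for row in matrix)


profiler = cProfile.Profile()
profiler.runcall(create_and_sum_matrix)

# Sort by cumulative time and keep only the five most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In a notebook, a comparable result would come from %prun -s cumulative create_and_sum_matrix().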

As before, %prun can also be used as a cell magic when prefixed by %%.

%lprun

Like %prun, %lprun lets us analyse the time consumption of a whole program. However, it is even more precise, since it assesses the execution time of every line.

This magic command is not present by default in IPython and needs a bit of installation.

Depending on your installation you may need to use pip instead of pip3

The first line downloads the module containing %lprun. The second line loads it. Since magic commands are special commands, they are loaded with a special command instead of import: %load_ext is the magic command used to load new magic commands!

Note: ! runs a terminal command from the notebook, much like the %system magic command (whose own shorthand is !!). Remember when I said that you’ve probably already used magic commands without knowing it?

Since a program is composed of numerous lines, %lprun will not track them all. We need to specify with -f which functions it must assess. Multiple functions can be tracked by using multiple -f flags.
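The sketch below shows the notebook cells as comments and, underneath, the Python API of the third-party line_profiler package that %lprun comes from. The function body is a hypothetical reconstruction, and the block falls back to a plain call when line_profiler is not installed, so treat it as an illustration rather than the article's original code.

```python
from random import random

try:
    from line_profiler import LineProfiler  # third-party package behind %lprun
except ImportError:
    LineProfiler = None  # not installed: the function is simply called below


def create_and_sum_matrix(n=100):
    """Hypothetical example function to profile line by line."""
    matrix = [[random() for _ in range(n)] for _ in range(n)]
    total = sum(sum(row) for row in matrix)
    return total


# Notebook form (IPython syntax, kept as comments):
#     !pip3 install line_profiler
#     %load_ext line_profiler
#     %lprun -f create_and_sum_matrix create_and_sum_matrix()

if LineProfiler is not None:
    lp = LineProfiler()
    lp.add_function(create_and_sum_matrix)  # plays the role of the -f flag
    total = lp.runcall(create_and_sum_matrix)
    lp.print_stats()  # Hits, Time, Per Hit, % Time for each tracked line
else:
    total = create_and_sum_matrix()

print(f"sum of the matrix: {total:.2f}")
```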

Here we can see the result of the analysis. For each line of code we have:

  • Hits: Number of times the line was executed.
  • Time: Total time used by the line.
  • Per Hit: Average time needed by the line (Time/Hits).
  • % Time: Percentage of the total tracked time used by this line.

Here the function is called once and doesn’t contain any loop, so each line is hit once. As we saw previously, the matrix creation takes most of the time.

%memit

Here come the two commands for memory profiling: %memit and %mprun. They are the memory counterparts of %timeit and %lprun. These commands are not present by default in IPython; we need to install and load the memory_profiler package to use them.

Depending on your installation you may need to use pip instead of pip3

Then, we can use %memit like %time and %timeit, as a line magic or a cell magic.

Here, we can see that create_and_sum_matrix increased the memory used by the process by 35 MiB, up to a peak of 88 MiB.
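Outside a notebook, a similar peak figure can be approximated with the standard-library tracemalloc module. This is a swapped-in technique, not what %memit does internally: tracemalloc only counts memory allocated by Python objects while tracing is active, so its numbers will not match the process-level figures quoted above. The function body is a hypothetical reconstruction.

```python
import tracemalloc
from random import random


def create_and_sum_matrix(n=100):
    """Hypothetical example: build an n x n matrix of random floats and sum it."""
    matrix = [[random() for _ in range(n)] for _ in range(n)]
    return sum(sum(row) for row in matrix)


# Notebook form (IPython syntax, kept as comments):
#     !pip3 install memory_profiler
#     %load_ext memory_profiler
#     %memit create_and_sum_matrix()

tracemalloc.start()
total = create_and_sum_matrix()
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
tracemalloc.stop()

print(f"peak traced memory: {peak / 2**20:.2f} MiB "
      f"(still allocated afterwards: {current / 2**20:.2f} MiB)")
```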

%mprun

%mprun is the memory version of %lprun. It allows us to track memory consumption line by line within given functions. However, %mprun has a constraint: it can’t work with functions defined in the notebook.

To use it, you need to store your function in a Python file and import it. Fortunately, you can create a Python file directly from your notebook using the cell magic %%file. Magic commands truly have a solution for everything.

By prefixing the cell like this, all its code will be saved in my_file.py. Now, we can import it and use %mprun.
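The workflow can be sketched as follows. The notebook cells appear as comments, and the plain-Python part mimics what %%file followed by an import achieves: the function ends up living in a real file on disk. The function body is a hypothetical reconstruction, and a temporary directory is used here only to avoid writing into the working directory.

```python
import importlib.util
import tempfile
from pathlib import Path

# Notebook form (IPython syntax, kept as comments):
#     %%file my_file.py
#     def create_and_sum_matrix(): ...
# and then, in another cell:
#     %load_ext memory_profiler
#     from my_file import create_and_sum_matrix
#     %mprun -f create_and_sum_matrix create_and_sum_matrix()

SOURCE = '''
from random import random


def create_and_sum_matrix(n=100):
    matrix = [[random() for _ in range(n)] for _ in range(n)]
    total = sum(sum(row) for row in matrix)
    del matrix  # release the matrix before returning
    return total
'''

# Plain-Python counterpart of %%file: write the cell content to my_file.py.
module_path = Path(tempfile.mkdtemp()) / "my_file.py"
module_path.write_text(SOURCE)

# Counterpart of "from my_file import create_and_sum_matrix".
spec = importlib.util.spec_from_file_location("my_file", module_path)
my_file = importlib.util.module_from_spec(spec)
spec.loader.exec_module(my_file)

total = my_file.create_and_sum_matrix()
print(f"function loaded from {module_path.name}, result: {total:.2f}")
```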

We can see how memory is allocated and freed during the function. In the beginning, 55.4MiB is already used by the environment. Creating the matrix required 4MiB. However, deleting it only freed 3.2MiB! This is due to Python memory manager policies. If you want to learn more about it, check this.

This shows why it’s so important to check memory complexity too. Memory used can be higher than what we think.

Conclusion

Assessing time and memory complexity is essential to forecast the resource consumption of an application. This can be done inside a notebook using magic commands. Magic commands implement solutions to common problems. In this article, we introduced six of them, to measure time and memory consumption at different scales.

To go further

To learn more about Magic Commands:

  • %magic: The magic command to learn more about magic commands.
  • Documentation: Online documentation for magic commands.

To learn more about IPython:

About me

Hey! I’m Rémi, a final year student in Computer Science. I became passionate about Data Science 2 years ago, and since then I have spent my time learning and practicing it. Feel free to connect with me on LinkedIn.
