
How to Skyrocket Your Python Speed with Numba 🚀

Learn how to use Numba Decorators to make your code faster

Quick Introduction

Photo by Marc-Olivier Jodoin on Unsplash

Overview of Your Journey

  1. Setting the Stage
  2. Understanding Numba and Installation
  3. Using the Jit-Decorator
  4. Three Pitfalls
  5. Where Numba Shines
  6. Wrapping Up

1 – Setting the Stage

In data science and data engineering, practitioners write code on a daily basis. Producing a working solution to a problem is, of course, the most important thing. Sometimes, however, the execution speed of your code also matters.

This is especially true in real-time analytics and prediction. If the code is too slow, then this might create a bottleneck for the whole system. Often systems get slower as time goes on. This is especially true in data disciplines due to an increase of data to process. At worst, real-time systems you build can be too slow to be useful 😮

Many compiled programming languages, such as C++, are generally faster than Python. Does this mean that you should uproot your whole Python pipeline? No. This is generally not worth the enormous effort it requires.

A different approach is to make your Python code faster. This is where Numba steps in:

Numba is a Python library that aims to speed up your Python code. At runtime, Numba looks through your code and checks whether parts of it can be translated into fast machine code.

Sounds intricate, right? It is. However, for the end-user (namely you) using Numba is ridiculously easy. With a few additional lines of Python code, you can get a significant speedup in major parts of your codebase. You don’t really need to understand how Numba works under the hood to be able to see results.

In this blog post, I will show you the basics of Numba to get you started. If you need to learn more, then I recommend the Numba Documentation. If you are more of a visual learner, then I have also made a video on the topic:


2 – Understanding Numba and Installation

Let me give you a high-level overview of Numba first 👍

Numba describes itself in the following way:

Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. – Numba Documentation

Let’s unpack the above statement. Numba is an open-source and lightweight Python library that tries to make your code faster. It does this by using the industry-standard LLVM compiler library. You do not need to understand the LLVM compiler library to use Numba.

In practice, you will add certain Python decorators to tell Numba that the decorated function in question should be optimized. Then, during runtime, Numba goes through your function and tries to compile parts of it into fast machine code.

The term JIT compilation is an abbreviation for Just-in-time compilation. So rather than compiling the code beforehand (like with e.g. C++), the compilation step happens during the execution of the code. The practical difference? Rather than generating binary files that are cumbersome to share, you are left with only Python files!

Let me show you a code example from Numba’s homepage to demonstrate how easy Numba is to use. The following code is a Monte Carlo method for approximating the value of pi.

import random

def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples
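Before adding Numba, it may help to see what the function actually returns. A quick sanity check (pure Python, with a fixed seed so the run is reproducible):

```python
import random

def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

random.seed(42)
estimate = monte_carlo_pi(100_000)
# with this many samples, the estimate lands close to 3.14159...
```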

Numba has not been applied to the above code. If the variable nsamples is large, then the function monte_carlo_pi is pretty slow. However, adding the following two lines of code makes it a lot faster:

from numba import jit # <-- importing jit from numba
import random

@jit(nopython=True) # <-- The only difference
def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

That was not that bad, right? 😃

If you are working in Jupyter Notebooks through Anaconda, then run the following command in the Anaconda Prompt to install Numba:

conda install numba

If you are writing your code in an IDE like Visual Studio Code or PyCharm, then you might prefer to install Numba through pip:

$ pip install numba

More advanced options, like compiling Numba from source, can be found in the Installation Pages.


3 – Using the Jit-Decorator

Now that Numba is installed you can try it out. I’ve made a Python function that does a few NumPy operations:

import numpy as np
from numba import jit

def numpy_features(matrix: np.ndarray) -> None:
    """Illustrates some common features of NumPy."""
    cosine_trace = 0.0
    for i in range(matrix.shape[0]):
        cosine_trace += np.cos(matrix[i, i])
    matrix = matrix + cosine_trace

Don’t think too much about the above code. The only aim of the function is to use several different features in NumPy like universal functions and broadcasting. Let me time the code above with the following magic command in Jupyter Notebooks:

x = np.arange(1000000).reshape(1000, 1000)
%time numpy_features(x)
Output:
Wall time: 32.3 ms

If you run the code above, you will get slightly different speeds depending on your hardware and other factors. It’s probably not the slowest Python code you have ever seen. However, code like this throughout the codebase really slows down the whole application.

Now let us add the @jit(nopython=True) decorator to the function and see what happens. The code should now look like this:

import numpy as np
from numba import jit

@jit(nopython=True)
def numpy_features(matrix: np.ndarray) -> None:
    """Illustrates some common features of NumPy."""
    cosine_trace = 0.0
    for i in range(matrix.shape[0]):
        cosine_trace += np.cos(matrix[i, i])
    matrix = matrix + cosine_trace

Not much has changed in how you write the code, but the speed is different. If you again time the code you get the following result:

x = np.arange(1000000).reshape(1000, 1000)
%time numpy_features(x)
Output:
Wall time: 543 ms

What? The code became over 10 times slower than the original code 😧

Don’t be discouraged. Try to run the code again:

x = np.arange(1000000).reshape(1000, 1000)
%time numpy_features(x)
Output:
Wall time: 3.32 ms

Now the code is almost 10 times faster than the original code.

What is going on? 😵


4 – Three Pitfalls

The Pitfall of Compilation

The weird thing I showed you is not a bug, it’s a feature. When you run the function with the @jit(nopython=True) decorator for the first time the code slows down. Why?

The first time, Numba has to go through the code in the function and figure out what code to optimize. This adds extra overhead, and thus the function runs slowly. However, every subsequent time the function will be much quicker.

At first this seems like a bad tradeoff, but it rarely is. In data analysis and data engineering, functions are run a great number of times.

As an example, consider a function that normalizes new data that enters a data pipeline before prediction. In real-time systems, new data is arriving all the time, and the function is used up to hundreds or even thousands of times a minute. The initial slower run is saved within seconds in such systems.

Not Setting the Argument nopython to True

The @jit decorator can be used without the nopython argument. If you do this, then by default Numba will set nopython=False.

This is not a good idea!

If nopython=False, then Numba will not alert you when it cannot optimize your code. In practice, you then just add the Numba overhead without any optimization. This slows down your code 😠

If Numba does not manage to optimize your code, then you want to be told; in that case, it is better to remove the Numba decorator completely. Hence you should always pass the argument nopython=True.

Pro Tip: The decorator @njit is shorthand for @jit(nopython=True) and many people use this instead.

Don’t Over-Optimize Your Code

Over-optimizing your code means spending a lot of time on getting optimal performance when it is not needed.

Don’t over-optimize your code!

In many instances, code speed is not that important (e.g. batch handling of moderate amounts of data). Optimizing for speed almost always increases development time. Weigh your options carefully on whether you should incorporate Numba into your codebase.


5 – Where Numba Shines

Photo by Mohamed Nohassi on Unsplash

Pretending that Numba is good at optimizing any type of Python code helps no one. Numba is great at optimizing only certain kinds of Python code.

Numba is great at optimizing anything that involves loops or NumPy code. Since many machine learning libraries (like Scikit-Learn) heavily use NumPy, this is a great place to use Numba 🔥

However, Numba does not understand e.g. Pandas. Adding @jit(nopython=True) to a function that purely deals with Pandas dataframes will probably not give a great performance boost. See the Numba Documentation for examples.

My advice is the following:

Always check whether the added Numba decorator adds value by timing the code (after the first compilation run). Don’t sprinkle Numba decorators around just for fun; use them to speed up your code when necessary.


6 – Wrapping Up

If you need to learn more about Numba, then check out the Numba Documentation or my YouTube video on Numba.

Like my writing? Check out some of my other posts for more Python content:

If you are interested in data science, Programming, or anything in between, then feel free to add me on LinkedIn and say hi ✋

