
Making Numba Accessible for Faster Processing with Python

Overcome the limitations of Numba and use it in an intelligent way

Photo by Chris Ried on Unsplash

Numba is a just-in-time (JIT) compiler for Python that speeds up computationally intensive code, such as functions that loop over NumPy arrays.

Numba-compiled algorithms can make Python code run up to a million times faster and thus approach the speed of C. In addition, as the number of operations grows, Numba's computation time is usually significantly lower than that of Cython, another compiler used for faster processing. Here is a reference article¹ with a precise comparison of Python, Cython, and Numba processing times. Numba is also incredibly easy to use: one simply applies a Numba decorator (@njit) to a Python function.
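For illustration, here is a minimal sketch of what applying the decorator looks like (sum_of_squares is a hypothetical toy function, not part of the pipeline below):

import numpy as np
import numba as nb

@nb.njit
def sum_of_squares(arr):
    # Plain loops over NumPy arrays are exactly what Numba compiles well
    total = 0.0
    for x in arr:
        total += x * x
    return total

sum_of_squares(np.arange(1_000_000, dtype=np.float64))  # first call compiles; later calls run at C-like speed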

Limitations of Numba and Ways to Overcome Them

However, Numba has certain limitations. While it works best on NumPy arrays, it cannot handle pandas objects, list-of-lists arguments, some NumPy functions such as numpy.concatenate or numpy.diff (at least for some argument types), and, most importantly, some popular Python libraries, for example Scipy. In those cases, intelligently refactoring the code to combine Numba with Cython/Python functions can help a great deal with faster processing.

In this article, I will show an example of such refactoring with Numba and Cython/Python, aimed at generating a peak mask from any 1D signal with the help of Scipy functions. I have used mass spectrum data as an example dataset, which shows intensity vs. mass/charge of peptides. If you want more insight into such data, the review article by Zhang et al. (2009)² may help. Using a Cython/Python function for the peak detection with the Scipy library, and then Numba for the rest of the calculations, turns out to be a highly optimised and time-efficient approach for this task.

There are Numba-compatible alternatives to some of the Scipy functions, such as NumbaQuadpack for scipy.integrate.quad and NumbaMinpack for scipy.optimize.root. Still, Numba cannot deal with most Scipy functions. For those, splitting the total processing between Cython and Numba is recommended (illustrated in the snippet after the list below), since:

· Both are shown to significantly speed up Python code.

· Scipy code can be compiled with Cython.

· Numba speeds up code more than Cython as the number of operations increases.
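To see why the split is necessary, here is a quick (hypothetical) illustration of the failure mode: calling a Scipy routine inside an @njit function fails when Numba tries to compile it.

import numba as nb
from scipy.signal import find_peaks

@nb.njit
def broken_peak_finder(signal):
    peaks, _ = find_peaks(signal)  # Numba cannot type or compile this Scipy call
    return peaks

# broken_peak_finder(np.random.rand(100)) raises a numba TypingError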

Here is an example of a simple piece of code, written in a Jupyter notebook, which detects the peaks in a signal (_data_slice) using the find_peaks and peak_prominences functions of scipy.signal. Inside a Cythonized cell, types for each variable are added manually. Finally, the NumPy version of the intensity values for the whole length of the signal (int_array), the peak points (peaks_array), and the left and right base points of the detected peaks (left_bases_array and right_bases_array, respectively) are passed on to the Numba function for the remaining calculations.

Cython function

%load_ext Cython

%%cython
import numpy as np
cimport numpy as cnp
from scipy.signal import find_peaks, peak_prominences
cnp.import_array()

cpdef peak_finder_cy(cnp.ndarray _data_slice, _scan_no, int int_index=0):
    # Manually typed variables; static typing is where Cython gains its speed
    cdef cnp.ndarray _int
    cdef cnp.ndarray _peaks
    cdef cnp.ndarray int_array
    cdef cnp.ndarray peaks_array
    cdef cnp.ndarray left_bases_array
    cdef cnp.ndarray right_bases_array

    # Intensity values of the requested scan (column 4 holds the scan number)
    _int = _data_slice[_data_slice[:, 4] == _scan_no, int_index]

    # Peak positions and their left/right base points
    _peaks, _ = find_peaks(_int)
    prominences, left_bases, right_bases = peak_prominences(_int, _peaks, wlen=20)

    int_array = np.array(_int)
    peaks_array = np.array(_peaks)
    left_bases_array = np.array(left_bases)
    right_bases_array = np.array(right_bases)
    return int_array, peaks_array, left_bases_array, right_bases_array

Numba function

import numpy as np
import numba as nb

@nb.njit()
def peak_mask_finder_cy(_int_list, _peaks_list, _left_bases_list, _right_bases_list):
    peak_id = 1
    peak_mask = np.zeros(len(_int_list))
    # Sequential loop: each iteration may depend on the previous peak's bases
    for j in range(len(_peaks_list)):
        # If this peak's region overlaps the previous one and this peak is
        # taller, move its left base to the previous peak's right base
        if j > 0 and _left_bases_list[j] < _right_bases_list[j-1] and _int_list[_peaks_list[j]] > _int_list[_peaks_list[j-1]]:
            _left_bases_list[j] = _right_bases_list[j-1]

        # Label all samples between the bases with this peak's id
        peak_mask[_left_bases_list[j] + 1 : _right_bases_list[j]] = peak_id
        peak_id += 1
    return peak_mask
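Note that Numba compiles the function on its first call, so a warm-up call on toy data (the values below are arbitrary) keeps the one-off compilation cost out of the timed cell that follows:

import numpy as np
# Warm-up call so the JIT compilation happens once, before timing
_ = peak_mask_finder_cy(np.zeros(10), np.array([5]), np.array([2]), np.array([8]))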

Call the respective functions

%%timeit
int_list, peaks_list, left_bases_list, right_bases_list = peak_finder_cy(signal_data, scan_no)
peak_mask_final = peak_mask_finder_cy(int_list, peaks_list, left_bases_list, right_bases_list)

Processing time with Cython and Numba

39.2 µs ± 120 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

With Cython only

101 µs ± 178 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Thus, even for a small chunk of data, the improvement in processing time with the refactored Cython-plus-Numba code is significant compared with the Cython-only code. As the number of detected peaks grows with larger datasets, the processing-time difference between the two is likely to become even more prominent.

It is worth mentioning that, when building a processing pipeline, the Cython code must live in a .pyx file, and a separate setup.py script is required to compile it: Cython translates the .pyx file into a .c file, and a C compiler then builds the .c file into a .so file. The command to use is: python setup.py build_ext --inplace. Afterwards, the Cython module can be imported in the usual way inside any Python script.
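A minimal setup.py for this step could look as follows (the file name peak_finder.pyx is an assumption; adjust it to your module):

import numpy as np
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("peak_finder.pyx"),
    include_dirs=[np.get_include()],  # needed because the module cimports numpy
)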

One may argue for using plain Python straight away instead of the Cythonized version of the first function (peak_finder_cy). For a library like Scipy, which calls compiled C code and is therefore already high-performance, that is a reasonable argument. However, Cython can prove highly effective when there is a large number of scans/data segments to loop over while finding the peaks. This article³ offers more details and examples in this context. As a reference, the processing time with Python and Numba for the above piece of code was 40.7 µs ± 91.6 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each).
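For completeness, the plain-Python counterpart of the first function could look like the sketch below; it keeps the same assumptions about _data_slice as the Cython cell above (column 4 holds the scan number, int_index selects the intensity column).

import numpy as np
from scipy.signal import find_peaks, peak_prominences

def peak_finder_py(_data_slice, _scan_no, int_index=0):
    # Intensities belonging to the requested scan
    _int = _data_slice[_data_slice[:, 4] == _scan_no, int_index]
    _peaks, _ = find_peaks(_int)
    prominences, left_bases, right_bases = peak_prominences(_int, _peaks, wlen=20)
    return (np.asarray(_int), np.asarray(_peaks),
            np.asarray(left_bases), np.asarray(right_bases))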

Conclusion

In conclusion, refactoring the code so that Numba handles the compatible, computation-heavy parts while Cython/Python handles the incompatible libraries and functions is the most viable way to make Numba accessible and, at the same time, to ensure the best runtime of the code.

References

  1. https://www.pickupbrain.com/python/speed-up-python-up-to-1-million-times-cython-vs-numba/
  2. Zhang, J., Gonzalez, E., Hestilow, T., Haskins, W., & Huang, Y. (2009). Review of peak detection algorithms in liquid-chromatography-mass spectrometry. Current Genomics, 10(6), 388.
  3. http://stephanhoyer.com/2015/04/09/numba-vs-cython-how-to-choose/

Thanks to Joseph Bloom for his very useful feedback on this article from a reader's point of view.

