
5 Awesome NumPy Functions That Can Save You in a Pinch

Avoid Getting Stuck with 5 Simple Functions

Photo by JESHOOTS.COM on Unsplash


Setting the Stage

When doing data science in Python, the NumPy package is omnipresent. Whether you are developing machine learning models with Scikit-Learn or plotting in Matplotlib, you’re sure to have a few NumPy arrays lying around in your code.

When I started with data science in Python, I had a poor grasp of what could be done with NumPy. Over the years, I have sharpened my NumPy skills and become a better data scientist because of it.

Being good at manipulating NumPy arrays can save your life… or at least an hour of frustrating searching. The five NumPy functions I give you here can help you when things get tough 🔥

Throughout the blog post, I assume you have installed NumPy and have already imported it with the alias np:

import numpy as np

I recommend having some prior exposure to NumPy before reading this blog post. If you are completely new to NumPy, then you can check out NumPy’s Beginners Guide or this YouTube video series on NumPy.


1 – Quick Filtering

You can use the where function to quickly filter an array based on a condition. Say you have an audio signal represented as a one-dimensional array:

# Audio Signal (in Hz)
signal = np.array([23, 50, 900, 12, 1100, 10, 2746, 9, 8])

Let’s say that you want to remove every value in signal that is below 20 Hz. To do this efficiently in NumPy, you can write:

# Filter the signal
filtered_signal = np.where(signal >= 20, signal, 0)
# Print out the result
print(filtered_signal)
>>> [  23   50  900    0 1100    0 2746    0    0]

The where function takes three arguments:

  • The first argument (in our example signal >= 20) is the condition you want to use for the filtering.
  • The second argument (in our example signal) gives the values to use wherever the condition is satisfied.
  • The third argument (in our example 0) gives the values to use wherever the condition is not satisfied.
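
To make the roles of the three arguments concrete, here is a minimal sketch that contrasts where with plain boolean indexing. The variable names mask, replaced, and kept are my own choices for illustration.

```python
import numpy as np

# Audio signal (in Hz), same values as in the example above
signal = np.array([23, 50, 900, 12, 1100, 10, 2746, 9, 8])

# The condition is just a boolean array with the same shape as signal
mask = signal >= 20

# np.where keeps the shape: values failing the condition are replaced by 0
replaced = np.where(mask, signal, 0)

# Boolean indexing drops the failing values instead, so the result is shorter
kept = signal[mask]

print(replaced.shape)  # (9,)
print(kept.tolist())   # [23, 50, 900, 1100, 2746]
```

Use where when the positions of the values matter (as with a signal over time), and boolean indexing when you only care about the values that survive.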

As a second example, assume you have an array high_pitch indicating whether the pitch of each sound should be raised:

# Audio Signal (in Hz)
signal = np.array([23, 50, 900, 760, 12])
# Raising pitch
high_pitch = np.array([True, False, True, True, False])

To raise the pitch of signal wherever the corresponding entry of high_pitch is True, you can simply write:

# Creating a high-pitch signal
high_pitch_signal = np.where(high_pitch, signal + 1000, signal)
# Printing out the result
print(high_pitch_signal)
>>> [1023   50 1900 1760   12]

That was easy 😃


2 – Reshaping Yourself Out of Trouble

Often you have an array with the correct elements but the wrong shape. More specifically, assume you have the following one-dimensional array:

my_array = np.array([5, 3, 17, 4, 3])
print(my_array.shape)
>>> (5,)

Here you can see that the array is one-dimensional. Suppose you want to feed my_array into a function that expects a two-dimensional input. This happens surprisingly often with libraries like Scikit-Learn! To do this, you can use the reshape function:

my_array = np.array([5, 3, 17, 4, 3]).reshape(5, 1)
print(my_array.shape)
>>> (5, 1)

Now my_array is properly two-dimensional. You can think of my_array as a matrix with five rows and a single column.
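
If you do not want to hard-code the length of the array, reshape also accepts -1 for one dimension, which NumPy then infers from the array's size. A small sketch (the names column and row are just for illustration):

```python
import numpy as np

my_array = np.array([5, 3, 17, 4, 3])

# -1 asks NumPy to infer that dimension from the number of elements
column = my_array.reshape(-1, 1)  # five rows, one column
row = my_array.reshape(1, -1)     # one row, five columns

print(column.shape)  # (5, 1)
print(row.shape)     # (1, 5)
```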

If you want to go back to my_array being one-dimensional, then you can write:

my_array = my_array.reshape(5)
print(my_array.shape)
>>> (5,)

Pro Tip: As a shorthand, you can use the NumPy function squeeze to remove all dimensions that have length one. Hence you could have used the squeeze function instead of the reshape function above.
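
Here is a short sketch of squeeze in action. The arrays column and batch are made up for this illustration.

```python
import numpy as np

column = np.array([5, 3, 17, 4, 3]).reshape(5, 1)

# squeeze drops every axis of length one
flat = np.squeeze(column)
print(flat.shape)  # (5,)

# Careful: squeeze removes *all* length-one axes unless you name one
batch = np.zeros((1, 5, 1))
print(np.squeeze(batch).shape)          # (5,)
print(np.squeeze(batch, axis=0).shape)  # (5, 1)
```

Passing axis explicitly is the safer choice when only one specific dimension should disappear.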


3 – Restructuring Your Shape

You will sometimes need to reshuffle the dimensions you already have. An example will make this clear:

Say you have represented an RGB image of size 1280×720 (this is the size of YouTube thumbnails) as a NumPy array called my_image. Your image has the shape (720, 1280, 3). The number 3 comes from the fact that there are 3 colour channels: red, green, and blue.

How do you rearrange my_image so that the RGB channels populate the first dimension? You can do that easily with the moveaxis function:

restructured = np.moveaxis(my_image, [0, 1, 2], [1, 2, 0])
print(restructured.shape)
>>> (3, 720, 1280)

With this simple command you have restructured the image. The two lists in moveaxis specify the source and destination positions of the axes: the height axis moves from position 0 to 1, the width axis from 1 to 2, and the colour-channel axis from 2 to 0.

Pro Tip: NumPy has other functions such as swapaxes and transpose that also deal with restructuring arrays. The transpose function can express any reordering of the axes as well, but moveaxis is the one I use most of the time.
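
To compare the three functions, here is a minimal sketch on an image-shaped array. The array my_image below is a placeholder filled with zeros (only its shape matters), and the names a, b, and c are purely illustrative.

```python
import numpy as np

# Placeholder image filled with zeros; only the shape matters here
my_image = np.zeros((720, 1280, 3), dtype=np.uint8)

# moveaxis: move the channel axis (position 2) to the front
a = np.moveaxis(my_image, 2, 0)

# transpose: list, for each output axis, which input axis it comes from
b = np.transpose(my_image, (2, 0, 1))

# swapaxes: only exchanges two axes, so height and width trade places too
c = np.swapaxes(my_image, 0, 2)

print(a.shape)  # (3, 720, 1280)
print(b.shape)  # (3, 720, 1280)
print(c.shape)  # (3, 1280, 720)
```

Note that swapaxes gives a different shape here, because exchanging the first and last axes also changes the order of height and width.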

Why Are Reshaping and Restructuring Different?

Photo by Priscilla Du Preez on Unsplash

Many people think that reshaping with the reshape function and restructuring with the moveaxis function are the same. Yet, they work in different ways 😦

The best way to see this is with an example: Say that you have the matrix:

matrix = np.array([[1, 2], [3, 4], [5, 6]])
# The matrix looks like this:
# 1 2
# 3 4
# 5 6

If you use the moveaxis function to switch the two axes, then you get:

restructured_matrix = np.moveaxis(matrix, [0, 1], [1, 0])
# The restructured matrix looks like this:
# 1 3 5
# 2 4 6

However, if you use the reshape function, then you get:

reshaped_matrix = matrix.reshape(2, 3)
# The reshaped matrix looks like this:
# 1 2 3
# 4 5 6

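The reshape function keeps the elements in their original row-by-row order and simply starts a new row whenever the current one is full. The moveaxis function, in contrast, swaps rows and columns, so it returns the transpose of the matrix.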
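
One way to check the difference yourself is to flatten both results and compare the order of the elements. This is a small sketch; the variable names are my own.

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4], [5, 6]])

reshaped = matrix.reshape(2, 3)
restructured = np.moveaxis(matrix, [0, 1], [1, 0])

# reshape keeps the row-by-row reading order of the elements
print(reshaped.ravel().tolist())      # [1, 2, 3, 4, 5, 6]

# moveaxis changes the reading order: the result is the transpose
print(restructured.ravel().tolist())  # [1, 3, 5, 2, 4, 6]
print(np.array_equal(restructured, matrix.T))  # True
```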


4 – Find Unique Values

The unique function is a sweet utility function for finding the unique elements of an array. Say that you have an array representing the favourite cities of people sampled from a poll:

# Favorite cities
cities = np.array(["Paris", "London", "Vienna", "Paris", "Oslo", "London", "Paris"])

Then you can use the unique function to get the unique values in the array cities:

unique_cities = np.unique(cities)
print(unique_cities)
>>> ['London' 'Oslo' 'Paris' 'Vienna']

Notice that unique returns the cities in sorted (here alphabetical) order, not in the order they first appeared in (e.g. Oslo comes before Paris).

With polls, it is really common to draw bar charts. In those charts, the categories are the poll options while the height of each bar represents the number of votes that option got. To get that information, you can use the optional argument return_counts as follows:

unique_cities, counts = np.unique(cities, return_counts=True)
print(unique_cities)
>>> ['London' 'Oslo' 'Paris' 'Vienna']
print(counts)
>>> [2 1 3 1]

The unique function saves you from writing a lot of annoying loops 😍
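
As a small follow-up sketch, you can pair the two returned arrays to get a vote tally and the most popular city. The names votes and winner are my own, and argmax picks the first city in case of a tie.

```python
import numpy as np

cities = np.array(["Paris", "London", "Vienna", "Paris", "Oslo", "London", "Paris"])
unique_cities, counts = np.unique(cities, return_counts=True)

# Pair each city with its vote count
votes = {str(city): int(count) for city, count in zip(unique_cities, counts)}
print(votes)  # {'London': 2, 'Oslo': 1, 'Paris': 3, 'Vienna': 1}

# The winner sits at the position of the largest count
winner = str(unique_cities[np.argmax(counts)])
print(winner)  # Paris
```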

5 – Combine Arrays

Sometimes, you will be working with many arrays at the same time. Then it is often convenient to combine the arrays into a single "master" array. Doing this in NumPy is easy with the concatenate function.

Let’s say that you have two one-dimensional arrays:

array1 = np.arange(10)
array2 = np.arange(10, 20)

Then you can combine them into a longer one-dimensional array with concatenate:

# Need to put the arrays into a tuple
long_array = np.concatenate((array1, array2))
print(long_array)
>>> [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

Combining Our Tools

What if you wanted to stack array1 and array2 on top of each other? You are then looking to create a two-dimensional array that looks like this:

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]]

You can first reshape array1 and array2 into two-dimensional arrays with the reshape function:

array1 = array1.reshape(10, 1)
array2 = array2.reshape(10, 1)

Now you can use the optional axis parameter in the concatenate function to combine them correctly:

stacked_array = np.concatenate((array1, array2), axis=1)
print(stacked_array)
>>> 
[[ 0 10]
 [ 1 11]
 [ 2 12]
 [ 3 13]
 [ 4 14]
 [ 5 15]
 [ 6 16]
 [ 7 17]
 [ 8 18]
 [ 9 19]]

Almost there… You can now use the moveaxis function to finish the job:

stacked_array = np.moveaxis(stacked_array, [0, 1], [1, 0])
print(stacked_array)
>>> 
[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]]
Photo by Japheth Mast on Unsplash

Awesome! I hope this example showed you how some of the different tools you have just learned can come together.
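
The walkthrough above is meant to combine the tools from this post. For everyday use, NumPy also ships the stack and vstack functions, which give the same result in a single call. A minimal sketch, with variable names of my own:

```python
import numpy as np

array1 = np.arange(10)
array2 = np.arange(10, 20)

# stack joins the arrays along a new axis (axis 0 by default)
stacked = np.stack((array1, array2))

# vstack treats each one-dimensional array as a row
also_stacked = np.vstack((array1, array2))

print(stacked.shape)  # (2, 10)
print(np.array_equal(stacked, also_stacked))  # True
```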


Wrapping Up

You should now feel comfortable using NumPy for a few tricky situations. If you need to learn more about NumPy, then check out the NumPy documentation.

Like my writing? Check out my blog posts Type Hints, Formatting with Black, Underscores in Python, and 5 Dictionary Tips for more Python content. If you are interested in data science, programming, or anything in between, then feel free to add me on LinkedIn and say hi ✋

