
How To Compute Euclidean Distance in NumPy

Showcasing how to calculate Euclidean distances between NumPy arrays

Photo by Markus Spiske on Unsplash

Introduction

The Euclidean distance between two points is the length of the line segment connecting them. Given two points A (x₁, y₁) and B (x₂, y₂), the Euclidean distance between them is illustrated in the diagram below.

Euclidean Distance between two points - Source: Author

The mathematical formula used to compute the Euclidean distance between two points is given below.

d = √((x₂ – x₁)² + (y₂ – y₁)²)
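As a quick sanity check, the formula can be evaluated by hand in plain Python. This is only a minimal sketch: the coordinates of A and B below are hypothetical values picked for illustration and do not come from the diagram.

```python
import math

# Hypothetical points chosen for illustration (not taken from the article)
x1, y1 = 1, 2  # point A
x2, y2 = 4, 6  # point B

# d = sqrt((x2 - x1)^2 + (y2 - y1)^2)
d = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
print(d)  # 5.0, since 3^2 + 4^2 = 25
```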

In today’s short tutorial we will explore a few different ways in which you can compute the Euclidean Distance when working with NumPy arrays. More specifically, we will showcase how to do so using

  • the linalg.norm() function
  • the scipy package
  • and a combination of the sqrt() and sum() functions

First, let’s create two example NumPy arrays that we will be referencing in the following sections in order to demonstrate a few different ways of computing the Euclidean Distance.

import numpy as np
a = np.array((1, 2, 3))
b = np.array((4, 5, 6))
print(a)
[1 2 3]
print(b)
[4 5 6]

Computing Euclidean Distance using linalg.norm()

The first option we have when it comes to computing Euclidean distance is the [numpy.linalg.norm()](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) function, which can return one of eight different matrix norms or one of an infinite number of vector norms, depending on the value of its ord argument.

The Euclidean Distance is the l2 norm of the difference between the two points and, by default (ord=None), numpy.linalg.norm() computes the 2-norm for 1-D arrays.

Therefore, in order to compute the Euclidean Distance we can simply pass the difference of the two NumPy arrays to this function:

euclidean_distance = np.linalg.norm(a - b)
print(euclidean_distance)
5.196152422706632
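To make the role of the default explicit, the sketch below passes ord=2 and compares it with the default call. It also uses the axis argument of np.linalg.norm() to compute one distance per row when several points are stacked in 2-D arrays. The arrays points_p and points_q are hypothetical names and values introduced only for this illustration.

```python
import numpy as np

a = np.array((1, 2, 3))
b = np.array((4, 5, 6))

# For 1-D input, the default (ord=None) and ord=2 give the same result
default_norm = np.linalg.norm(a - b)
explicit_l2 = np.linalg.norm(a - b, ord=2)
print(np.isclose(default_norm, explicit_l2))  # True

# Hypothetical batches of points, one point per row
points_p = np.array([[1, 2, 3], [0, 0, 0]])
points_q = np.array([[4, 5, 6], [3, 4, 0]])

# axis=1 takes the norm along each row, giving one distance per pair of points
row_distances = np.linalg.norm(points_p - points_q, axis=1)
print(row_distances)
```

The second row is the familiar 3-4-5 triangle, so its distance is 5.0, while the first row reproduces the distance between a and b.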

Computing Euclidean Distance using SciPy

The SciPy package offers a module with numerous functions that compute various types of distance metrics, including Euclidean Distance. More specifically, the scipy.spatial.distance.euclidean function computes the Euclidean Distance between two 1-D arrays.

from scipy.spatial.distance import euclidean
euclidean_distance = euclidean(a, b)
print(euclidean_distance)
5.196152422706632
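Since euclidean() only accepts two 1-D arrays, the same module also provides cdist() for computing distances between every pair of rows in two 2-D arrays. The following is a minimal sketch under the assumption that you have several points to compare at once; the arrays queries and targets are hypothetical and exist only for this example.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical collections of points, one point per row
queries = np.array([[1, 2, 3]])
targets = np.array([[4, 5, 6], [1, 2, 3]])

# pairwise[i, j] is the Euclidean distance between queries[i] and targets[j]
pairwise = cdist(queries, targets, metric="euclidean")
print(pairwise.shape)  # (1, 2)
print(pairwise)
```

The first entry matches the distance between a and b computed above, and the second entry is 0 because the two points are identical.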

Writing our own function

Finally, another alternative (a quite obvious one I’d say) is to simply write our very own function that would be able to compute the Euclidean Distance between two input arrays.

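Now let’s revisit the mathematical definition of Euclidean Distance that we discussed at the beginning of the article. The same formula extends to any number of dimensions: sum the squared differences of all coordinates and take the square root.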

d = √((x₂ – x₁)² + (y₂ – y₁)²)

The function shared below will do exactly this:

def compute_euclidean(x, y):
    # Square the element-wise differences, sum them, and take the square root
    return np.sqrt(np.sum((x - y) ** 2))

And finally, let’s ensure that the result is identical to the two aforementioned approaches:

euclidean_distance = compute_euclidean(a, b)
print(euclidean_distance)
5.196152422706632
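A variant of the same hand-written computation replaces the square-and-sum step with np.einsum(). This is only a sketch of one possible way to write it: the function name compute_euclidean_einsum is hypothetical, and the subscript string 'i,i->' simply expresses the dot product of the difference vector with itself.

```python
import numpy as np

a = np.array((1, 2, 3))
b = np.array((4, 5, 6))

def compute_euclidean_einsum(x, y):
    # Hypothetical helper: 1-D inputs of equal length are assumed
    diff = x - y
    # 'i,i->' multiplies diff by itself element-wise and sums the products
    return np.sqrt(np.einsum("i,i->", diff, diff))

distance = compute_euclidean_einsum(a, b)
print(distance)  # approximately 5.196

# Agrees with the linalg.norm() approach
print(np.isclose(distance, np.linalg.norm(a - b)))  # True
```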

Final Thoughts

In today’s article we discussed the Euclidean Distance and how it can be computed when working with NumPy arrays in Python. More specifically, we showcased how to calculate it using three different approaches: the linalg.norm() function, the scipy package, and our own function that combines sqrt() and sum().

