
Measuring similarity in two images using Python

Learn how to implement various similarity metrics in Python in just a few lines of code.

Photo by Jørgen Håland on Unsplash

To the human eye, it is easy to tell how similar two given images are. For example, for the various types of spatial noise shown in the grid below, it is easy for us to compare each image with the original and point out the perturbations and irregularities. However, to quantify this difference we need mathematical expressions.

Different types of simple noise in an image. Image by author.
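As an aside, noisy variants like the ones in the grid can be generated in a few lines. This is a sketch using scikit-image's `random_noise` utility (an assumption — the article does not say how its grid was produced), with a synthetic NumPy array standing in for the original photograph:

```python
import numpy as np
from skimage.util import random_noise

# Synthetic 8-bit grayscale image standing in for the original photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# random_noise returns a float image scaled to [0, 1].
gaussian = random_noise(image, mode="gaussian")  # additive Gaussian noise
salt_pepper = random_noise(image, mode="s&p")     # salt-and-pepper noise
poisson = random_noise(image, mode="poisson")     # Poisson (shot) noise
speckle = random_noise(image, mode="speckle")     # multiplicative noise
```

Each mode corresponds to one of the perturbation types shown in the grid.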

In this article we’ll see how to implement the following similarity metrics each using a single line of code:

  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)
  • Peak Signal-to-Noise Ratio (PSNR)
  • Structural Similarity Index (SSIM)
  • Universal Image Quality Index (UQI)
  • Multi-scale Structural Similarity Index (MS-SSIM)
  • Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS)
  • Spatial Correlation Coefficient (SCC)
  • Relative Average Spectral Error (RASE)
  • Spectral Angle Mapper (SAM)
  • Visual Information Fidelity (VIF)

The sewar library can be used to implement all of these metrics (and a few more).

Start by installing sewar:

pip install sewar

and then import the necessary modules.

These modules are easy to use and can be directly called as shown below.

For each of the noising methods, the similarity results are shown below. The "Original" column shows the score obtained by comparing the original image with itself, i.e., the ideal score for each metric.

Scores from the similarity metrics for different types of noising methods

The values for each noising method correspond with the intuition gained visually from the image grid above. For instance, the noise added by the S&P (salt and pepper) and Poisson methods is not easily visible to the naked eye, though we can spot it on close observation. Accordingly, in the similarity scores, S&P and Poisson show values closer to the ideal than the other noising methods. Similar observations can be made for the other noising methods and metrics as well.

From the results, ERGAS, MSE, SAM, and VIFP appear sensitive enough to capture the added noise and return a clearly amplified score.

But where can this simple quantification be useful?

The most common application is comparing a regenerated or reconstructed image with its original, clean version. GANs have recently become remarkably good at denoising and cleaning images, and these metrics can measure how well a model has actually reconstructed an image, beyond mere visual inspection. Applying them to a large batch of generated images reduces the manual work of evaluating a model by eye.

Moreover, it has been observed that similarity metrics can also highlight the presence of an adversarial attack in an image when it is compared with its benign counterpart. These scores can thus quantify the amount of perturbation introduced by such attacks.

Let’s discuss in the comments more interesting ways in which these image similarity metrics can be used in real life!

Please follow and support a fellow AI enthusiast!

Thank you for reading all the way through! You can reach out to me on LinkedIn for any messages, thoughts, or suggestions.
