
Fast Anomaly Detection in Images With Python

How I improved speed by 10x

Photo by Saffu on Unsplash

Recently, I’ve been working on an anomaly detection model for a project at work. I needed to improve the speed of my code by 10x.

In this post, I’ll describe how I achieved this goal and some of the challenges I encountered along the way.

Roughly speaking, anomaly detection techniques try to identify patterns in data that do not conform to typical behavior. This can be used to identify problems in a system, fraudulent behavior, or other unusual activities. In my case, we needed to identify anomalies in real-time from video images.

A typical approach would be to build a reference probability distribution out of anomaly-free images and compute the distance between new images and the reference distribution to find out whether or not we are facing an outlier.

The problem with this method is that it assumes the reference distribution doesn’t change over time. In addition, we need to collect data that covers the entire range of possible anomaly-free outcomes.

Fortunately, there is a method to tackle this problem: create a sliding window and use unsupervised anomaly detection methods.

In our case, we could create a reference distribution out of the last 200 photos taken by the camera and compare this distribution with the new incoming pictures.

But this method has one major issue:

We need to train the algorithm every time we update the sliding window.

This means that the algorithm must be fast and that the feature vector we extract from the pictures must be small (around 10 dimensions in our case) if we want to train a model in no more than a couple of milliseconds on an edge device.
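To make the sliding-window idea concrete, here is a minimal sketch (not the production code, and using a simple z-score distance purely for illustration) of keeping the last 200 feature vectors as a reference and scoring each new vector against them:

```python
from collections import deque
import numpy as np

WINDOW = 200  # number of recent anomaly-free feature vectors kept as reference

window = deque(maxlen=WINDOW)

def score(features):
    """Distance (in standard deviations) of a new feature vector
    from the sliding-window reference distribution."""
    ref = np.asarray(window)
    mean, std = ref.mean(axis=0), ref.std(axis=0) + 1e-9
    return float(np.max(np.abs((features - mean) / std)))

rng = np.random.default_rng(0)
for _ in range(WINDOW):                    # fill the window with "normal" vectors
    window.append(rng.normal(0.0, 1.0, size=10))

normal_score = score(rng.normal(0.0, 1.0, size=10))
outlier_score = score(np.full(10, 8.0))    # far outside the reference distribution
```

In practice the z-score would be replaced by a proper unsupervised detector retrained on each window update, which is exactly why training speed matters so much here.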

Wavelet decomposition

Wavelet is a signal processing technique that can be used to decompose a signal into different frequency components. It can be used for image processing by decomposing an image into different frequency bands.

To get started with wavelet transforms in Python, we can use a library called PyWavelets.
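As a quick illustration (assuming a grayscale frame stored as a NumPy array), a single-level 2D discrete wavelet transform with PyWavelets looks like this:

```python
import numpy as np
import pywt

# Stand-in for a grayscale camera frame
image = np.random.default_rng(0).random((64, 64))

# Single-level 2D discrete wavelet transform: one approximation band
# plus horizontal, vertical, and diagonal detail bands, each half-size.
cA, (cH, cV, cD) = pywt.dwt2(image, "haar")
```

The four bands correspond to the four quadrants shown in the figures below: the low-resolution approximation plus the horizontal, vertical, and diagonal details.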

Wavelets can also be used to compress images, by identifying and removing unnecessary details. By doing so, wavelet compression can result in much smaller file sizes without compromising on quality. In addition, wavelets can be used for a variety of other tasks such as deblurring, denoising, and edge detection.

Strictly speaking, we don’t need wavelet decomposition to find anomalies in images, but the technique is used in industry for two main reasons:

  • compress images to save space in the database and speed up the transfer of files over a network, and
  • isolate key features in the image based on their frequency and orientation in the image.

Lesson learned number 1: you can save some precious milliseconds by extracting features from the wavelet decomposition itself. This way, we don’t need to recompose the picture back to its initial form.

For instance, the picture below shows the first level decomposition:

  • with the low-resolution image on the top left,
  • the vertical details on the top right,
  • the horizontal details on the bottom left, and
  • the diagonal details on the bottom right.
wavelet transform example with one decomposition level (decomposition of the original photo by Saffu on Unsplash)

And we can have several decomposition steps. For illustration, the second decomposition step would look like the following figure:

wavelet transform example with two decomposition levels (decomposition of the original photo by Saffu on Unsplash)

The edges are typically a good feature to extract when looking for anomalies. This information can mostly be taken from the horizontal and vertical details.

Lesson learned number 2: for edge detection, the diagonal details are mostly noise.

To extract edges, we followed the method presented in "A Low Redundancy Wavelet Entropy Edge Detection Algorithm" [1], as follows:

  1. Combine the horizontal and vertical components for each decomposition level. For example, with the first-level decomposition, we would create a new picture by summing the top-right image and the bottom-left image.
  2. Normalize the newly created picture from 0 to 1 with min-max normalization.
  3. Select the decomposition level with the most structure. This is assumed to be the level with the lowest Shannon entropy (scikit-image even provides a shannon_entropy function to compute it). This way, we select the level that carries the most relevant information for our purpose and avoid wasting time on redundant information from other levels.
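The three steps above can be sketched as follows. This is an illustrative reimplementation, not the paper’s reference code; the bin count and the NumPy-based entropy are assumptions on my part:

```python
import numpy as np
import pywt

def edge_level(image, levels=3):
    """Pick the decomposition level whose combined horizontal + vertical
    detail map has the lowest Shannon entropy (i.e. the most structure)."""
    coeffs = pywt.wavedec2(image, "haar", level=levels)
    best_level, best_entropy, best_map = None, np.inf, None
    # coeffs[0] is the approximation; coeffs[1:] are (cH, cV, cD) per level,
    # ordered from coarsest to finest.
    for level, (cH, cV, _cD) in enumerate(coeffs[1:], start=1):
        combined = np.abs(cH) + np.abs(cV)           # step 1: combine H and V details
        span = combined.max() - combined.min()
        combined = (combined - combined.min()) / (span + 1e-12)  # step 2: min-max normalize
        hist, _ = np.histogram(combined, bins=64, range=(0.0, 1.0))
        p = hist / hist.sum()
        p = p[p > 0]
        entropy = -np.sum(p * np.log2(p))            # step 3: Shannon entropy
        if entropy < best_entropy:
            best_level, best_entropy, best_map = level, entropy, combined
    return best_level, best_map
```

Note that the diagonal details are deliberately ignored, following lesson number 2.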

Lesson learned number 3: there is redundant information between the decomposition steps so we only need to extract features from one of them.

Feature descriptor

A feature descriptor is a representation of an image or video that can be used for tasks such as object detection and classification. There are many different types of feature descriptors, but they all aim to capture the important characteristics of an image or video in a compact way.

Commonly used descriptors include histograms of oriented gradients (HOG), histograms of intensity (HIST), and scale-invariant feature transform (SIFT).

Each type of descriptor has its own strengths and weaknesses, and choosing the right descriptor for a particular application is critical. In general, however, the goal of all feature descriptors is to represent an image or video in a way that is beneficial for machine learning.

In our case, we were using the HOG descriptor, but it turned out to have two downsides to detecting anomalies in real-time:

  • the HOG descriptor was too high-dimensional, and
  • the computation was too slow.

For real-time processing, we needed something:

  • fast to extract,
  • low dimensional, and
  • that varies significantly in case of an anomaly.

For our specific project, the position and the shape of the object were altered in case of an anomaly, which meant we just needed a feature vector that reflects the shape of the object.

The image moments were enough to reflect those changes.

We found out that a simple feature vector with the area (the sum of grey levels), the centroid coordinates (the center of mass of the object), and the second central moments (the variance along each axis) was already enough to detect most of the possible anomalies.
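Such a moment-based descriptor is a few lines of NumPy. This is a sketch of the general idea, not our exact production feature set:

```python
import numpy as np

def moment_features(image):
    """Compact shape descriptor: area (sum of grey levels),
    centroid coordinates, and second central moments (variances)."""
    img = np.asarray(image, dtype=float)
    y, x = np.mgrid[: img.shape[0], : img.shape[1]]
    area = img.sum()                                  # zeroth moment
    cx = (x * img).sum() / area                       # centroid x
    cy = (y * img).sum() / area                       # centroid y
    var_x = (((x - cx) ** 2) * img).sum() / area      # second central moment in x
    var_y = (((y - cy) ** 2) * img).sum() / area      # second central moment in y
    return np.array([area, cx, cy, var_x, var_y])
```

A five-dimensional vector like this is orders of magnitude smaller than a HOG descriptor, and it shifts immediately when the object moves or changes shape.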

Lesson learned number 4: a simple feature extractor is more efficient than a fancy one when built specifically for a project.

Real-time anomaly detection in Python

Anomaly Detection is the process of identifying unusual behavior or events in data. It is a crucial part of many systems, from security and fraud detection to healthcare and manufacturing.

Real-time anomaly detection is a particularly difficult problem because it requires near-instantaneous identification of anomalies, which is even more challenging when dealing with high-dimensional data such as images.

There are two libraries that I like for anomaly detection:

  • The first one is called PyOD. It’s a Python toolkit to implement unsupervised anomaly detection algorithms, and
  • the second is called PySAD–which can be combined with PyOD–to detect anomalies in streaming data.

Both of these libraries are open-source, lightweight, and easy to install.

The tricky part is not to implement the algorithm–as PyOD does it for us–but to select the right algorithm.

In our case, we were first using an algorithm called isolation forest (iForest), but switching to another method called Histogram-based Outlier Score (HBOS) significantly improved the speed while keeping almost the same accuracy.

We were also lucky to find out that the PyOD team had just released a paper called "ADBench: Anomaly Detection Benchmark" [2], which benchmarks all their anomaly detection algorithms in a very comprehensive way.

Lesson learned number 5: you don’t necessarily lose a lot of accuracy when choosing a simpler algorithm, but the speed gain can be significant.

Conclusion

So, if you’re looking for a quick and efficient way to extract features from an image, wavelet decomposition combined with simple metrics such as the image moments might be a great option. And while there are more sophisticated feature extraction algorithms out there, sometimes a simple approach is all you need.

We’ve seen that extracting features from a wavelet decomposition can be beneficial for edge detection, but that the diagonal details are mostly noise. Additionally, we’ve found that there is redundant information between the decomposition steps, so the feature extractor can be based on only one of them, preferably the level with the lowest entropy, as it carries the most edge structure in the image.

In the end, we found that a simple feature extractor is more efficient than a fancy one when built specifically for our project. We also found that you don’t necessarily lose accuracy when choosing a simpler algorithm, but the speed gain can be significant.


Curious to learn more about Anthony’s work and projects? Follow him on Medium, LinkedIn, and Twitter.

Need a technical writer? Send your request to https://amigocci.io.


[1]: Tao, Yiting, et al. "A Low Redundancy Wavelet Entropy Edge Detection Algorithm." Journal of Imaging 7.9 (2021): 188.

[2]: Han, Songqiao, et al. "ADBench: Anomaly Detection Benchmark." Advances in Neural Information Processing Systems 35 (2022).

