
Hands-on Time Series Anomaly Detection using Autoencoders, with Python

Here's how to use Autoencoders to detect anomalous signals in a few lines of code

Photo by davisuko on Unsplash

Anomalous time series are a very serious business.

If you think about earthquakes, anomalies are the irregular seismic signals, the sudden spikes or drops in the data, that hint that something bad is going on.

In financial data, everyone remembers the Wall Street Crash of 1929, which was a clear example of an anomalous signal in the financial domain. In engineering, signals with spikes can represent the reflection of an ultrasound pulse off a wall or a person.

All these stories stem from a very well-defined problem:

If I have a bank of normal signals, and a new signal comes in, how can I detect if that signal is anomalous or not?

Note that this problem is slightly different from the problem of detecting an anomaly within a given signal (which is also a well-known problem to solve). In this case, we assume that we get a whole new signal, and we want to know if that signal is sufficiently different from the ones that are considered "normal" in our dataset.

So how would you approach a problem like that?

Neural Networks give us a powerful solution to this problem, and this solution has been around since 2016. Implementing a Neural Network per se is now a fairly easy game, but understanding how to use NNs for Anomaly Detection can get a little tricky.

The scope of this blog post is to guide the reader towards the idea of anomaly detection with Neural Networks, combining the two subjects in one unique piece of code, from A to Z. We will also do our own anomaly detection case study for Time Series on a synthetic dataset.

Hopefully the intro was interesting enough 🤞. Now let’s get started!

0. The idea

Generally speaking, a good way to approach a Machine Learning problem is to "pretend to be the computer". The question is: "What would you do if you (a human) were to do this task?" Well, we do have our bank of signals, right? So it is fair to say that we need to process the signals to find some sort of relevant features and use them as references.

Image made by author

And that is the first step. We now have a set of reference values. These reference values will be the ones that will determine if a new signal is anomalous or not by a simple comparison.

Image made by author

So it works like this:

  • A new signal comes in
  • We extract the features of this new signal
  • We compare these features with the ones we previously extracted from the bank of data and check whether the new signal is an anomaly or not. As simple as that.

Now, the real deal is the following:

"How do we process the bank of data (and the new signal) to extract the features?"

That is really what distinguishes one anomaly detection algorithm from another. With not a lot of imagination, the model that extracts the features is, in Machine Learning, usually known as the feature extractor.

1. The feature extractor

There are countless feature extractors out there. The one I’d like to talk about today involves Deep Learning.

Deep Learning is (and has been) the hot thing in tech: everybody talks about it. And for a reason, I’d say. All the magnificent things that we see in our daily news, like Meta AI’s LLaMA-powered chatbot, Google’s Circle to Search feature, and Midjourney’s outstanding "imagine" feature, are examples of (very successful) Deep Learning architectures.

But Deep Learning can be much simpler than those examples. Even a very simple cat/dog image classifier can be a Deep Learning algorithm. In general, we talk about a Deep Learning algorithm when the algorithm learns in a "layered" fashion (that’s why it’s deep) and extracts features by itself, bypassing the manual feature engineering step. For example, if we distinguish cat and dog images, maybe the first layer extracts very simple features (like the main color of the image), and as we get deeper into the rabbit hole, the last layers capture very fine details (like the paws in the image).

I don’t want to talk philosophy too much, so let’s talk about our specific Deep Learning algorithm, which is known as encoder-decoder or more specifically, autoencoder.

1.1 Autoencoders

Our Deep Learning algorithm is a very special one as it aims to reconstruct the same object that you put as an input. Just like this:

Image made by author, doggo’s photo by Caleb Fisher on Unsplash

Now, this looks a little bit silly, right? What is the need for a Machine Learning algorithm that is only trained to replicate what it sees? Think of ChatGPT: if the model just replicated exactly what you write, how would that even be helpful?

Well, the trick of Autoencoders is in their architecture. The input layer of the autoencoder has as many units (which are nothing but values, real numbers) as the input size. For example, if the doggo image is 10×10=100 pixels, the input layer will be 100 units long. The output has the same number of units as the input. Nonetheless, in the middle, the number of units gets smaller. This smaller set of units is what is called the "latent space", and it is the space of the meaningful features: if you only need k<100 units (those of the latent space) to reconstruct the full image, they must be the key ones.

Image made by author using the amazing NN-SVG tool

In other words, the "hidden layer" or "latent space" contains the key information that we can use to replicate the signal (or image or text) that we get as an input.

I can hear you from the computer asking me "Why do we need this at all for anomaly detection?". Well, this is the logic:

If I train a Machine Learning model to replicate a cat given an image of a cat, that means that it will be successful in replicating cat images.

But the contrary is also true: if I give it another image and it’s not able to replicate it successfully, then that image is probably not the image of a cat: it’s an anomaly!

More technically: if the reconstruction error of the non-anomalous signals is, on average, let’s say 0.01, then when we find an anomalous signal we will hopefully get an error of around 0.10. As that is 10 times larger than it should be, we can confidently call the signal an anomaly.

I know it can be confusing, but hopefully it will become much clearer now, as we will do a hands-on example in Python. Let’s start with it!

2. Our hands-on example

In this section we will:

  • Define the Python libraries that we need
  • Build our "normal signals"
  • Build our "anomalous signals"
  • Train our Autoencoder
  • Classify anomalous vs non anomalous signals.

Alright let’s go.

2.1 Python environment

We used numpy for data manipulation (you can also use pandas), matplotlib and seaborn to plot images, tensorflow* for Deep Learning, and sklearn for error analysis.

*For this very simple exercise, we don’t need a whole lot of control over the Neural Network, as we will use a very simple 1D CNN (spoiler lol). For this reason, we can just use tensorflow. If for some reason you are a torch person, no problem at all: this approach will work just the same.
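For reference, a minimal set of imports covering everything below (assuming a standard tensorflow 2.x and scikit-learn installation) could be:

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from tensorflow import keras
from tensorflow.keras import layers
from sklearn.metrics import confusion_matrix  # one option for the error analysis
```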

2.2 Data Generator

Let’s first describe how we are going to set up our experiment. Our x range will go from -8π to 8π. A "non anomalous" signal is built as follows:

y(x) = A_1 sin(f_1 x) + A_2 sin(f_2 x) + A_3 sin(f_3 x)

Image made by author

where:

  • A_1, A_2, and A_3 are three random amplitudes that change from signal to signal and can be anything from -2 to 2
  • f_1, f_2, and f_3 are three random frequencies that change from signal to signal and can be anything from -2 to 2

An anomalous signal is built as follows:

y(x) = A_1 sin(f_1 x) + A_2 sin(f_2 x) + A_3 sin(f_3 x) + A_anomaly sin(f_anomaly x)

Image made by author

where everything stays the same as a normal signal but:

  • A_anomaly can be anything between -5 and -2 or between 2 and 5
  • f_anomaly can be anything between -10 and -5 or between 5 and 10

In other words:

  • A normal signal is a sum of three sine components with random amplitudes and random frequencies
  • An anomalous signal is a sum of three sine components with random amplitudes and random frequencies, plus a sine component with a larger absolute amplitude and a larger absolute frequency

This is how we implement the idea in Python:
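The original code was embedded as a gist; below is a minimal sketch that follows the recipe above (the generate_signals name and its defaults are my own):

```python
def generate_signals(num=1000, n_components=3, n_points=400, seed=0):
    """Generate `num` normal and `num` anomalous sine-mixture signals."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-8 * np.pi, 8 * np.pi, n_points)

    def mixture():
        # Sum of n_components sines with A, f drawn uniformly from [-2, 2]
        A = rng.uniform(-2, 2, n_components)
        f = rng.uniform(-2, 2, n_components)
        return sum(a * np.sin(fr * x) for a, fr in zip(A, f))

    normal = np.array([mixture() for _ in range(num)])
    # Anomalous: a normal mixture plus one sine with |A| in [2, 5] and |f| in [5, 10]
    anomalous = np.array([
        mixture()
        + rng.choice([-1, 1]) * rng.uniform(2, 5)
        * np.sin(rng.choice([-1, 1]) * rng.uniform(5, 10) * x)
        for _ in range(num)
    ])
    return x, normal, anomalous

x, normal_signals, anomalous_signals = generate_signals(num=1000)
```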

The code above generates "num" normal and "num" anomalous signals. We set num=1000, but feel free to increase or decrease it. You can also increase or decrease the number of sine components from the default of 3 to however many you want.

So if we plot them, we clearly see the difference in terms of amplitude and (mostly) in terms of frequency:
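A quick way to produce such a plot (a sketch using the variables from the generator above):

```python
fig, axes = plt.subplots(2, 1, figsize=(10, 6), sharex=True)
for i in range(3):
    axes[0].plot(x, normal_signals[i], alpha=0.7)
    axes[1].plot(x, anomalous_signals[i], alpha=0.7)
axes[0].set_title("Normal signals")
axes[1].set_title("Anomalous signals")
axes[1].set_xlabel("x")
plt.tight_layout()
plt.show()
```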

2.3 Autoencoders

The Autoencoder that we are using in this project is a 1D CNN. 1D CNNs are used a lot with signals, and the principle is quite simple: a small kernel (a vector) runs over the signal with a convolution operation. Just like in a 2D CNN, multiple filters can be used to extract the meaningful features of the signal. For example, one filter could look at the maxima, one at their widths, one at the minima, and so on and so forth.

This is how we build our autoencoder:
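The exact architecture of the original gist is not shown here; a minimal keras 1D-CNN autoencoder in the same spirit (the layer sizes and kernel widths are my own choices, and the signal length must be divisible by 4 for the shapes to match) might look like this:

```python
def build_autoencoder(n_points):
    """A small 1D-CNN autoencoder: two conv layers down, three transposed convs up."""
    model = keras.Sequential([
        layers.Input(shape=(n_points, 1)),
        # Encoder: compress the signal into a smaller latent representation
        layers.Conv1D(32, 7, strides=2, padding="same", activation="relu"),
        layers.Conv1D(16, 7, strides=2, padding="same", activation="relu"),
        # Decoder: reconstruct the signal from the latent representation
        layers.Conv1DTranspose(16, 7, strides=2, padding="same", activation="relu"),
        layers.Conv1DTranspose(32, 7, strides=2, padding="same", activation="relu"),
        layers.Conv1DTranspose(1, 7, padding="same"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

n_points = normal_signals.shape[1]
model = build_autoencoder(n_points)
model.summary()
```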

In order to train our model, we need to reshape our vectors. We can do it like this:
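Something along these lines, assuming the arrays from the generator above:

```python
# Conv1D layers expect inputs of shape (batch, timesteps, channels)
X_normal = normal_signals.reshape(-1, n_points, 1)
X_anomalous = anomalous_signals.reshape(-1, n_points, 1)
```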

So now we can train our model with this:
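A sketch of the training call (the number of epochs and the batch size are my own choices):

```python
history = model.fit(
    X_normal, X_normal,  # the target is the input itself: that is the autoencoder trick
    epochs=50,
    batch_size=32,
    validation_split=0.1,
)
```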

2.4 Anomaly Detector

Now that we have our model we can try and reconstruct the signals.

We will feed the normal signals as input and extract some statistics of our MSE (mean squared reconstruction error). For example, we can consider the p = 0.99 percentile value. Then we will reconstruct our anomalous signals and take a look at their MSE. If the MSE is larger than the 0.99 percentile value, we call the signal an anomaly; otherwise, we call it a normal signal.

Just like this:
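A minimal version of that logic, assuming the trained model and the reshaped arrays above:

```python
# Per-signal reconstruction error (MSE) on the normal signals
reconstructed_normal = model.predict(X_normal, verbose=0)
mse_normal = np.mean((X_normal - reconstructed_normal) ** 2, axis=(1, 2))

# Threshold: the 0.99 percentile of the normal reconstruction errors
threshold = np.percentile(mse_normal, 99)

# Per-signal reconstruction error on the anomalous signals
reconstructed_anomalous = model.predict(X_anomalous, verbose=0)
mse_anomalous = np.mean((X_anomalous - reconstructed_anomalous) ** 2, axis=(1, 2))

print(f"Normal signals flagged as anomalous: {(mse_normal > threshold).sum()}")
print(f"Anomalies correctly detected: {(mse_anomalous > threshold).sum()}")
```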

As we saw, all the anomalies have been correctly classified as anomalies, at a "cost" of only 3 normal signals (out of 1000) that are classified as anomalous.

2.5 All at once!

Now you might think that the normal and anomalous signals above are indeed too different, so we can, for example, make them much more similar. I created a function that allows you to regulate the ranges of frequencies and amplitudes of the normal and anomalous signals. You just have to put them in a dictionary like this:
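The key names and the exact default values here are illustrative (chosen so that, as noted below, the anomalous ranges sit close to the normal ones):

```python
setup_dict = {
    "num": 1000,                   # signals per class
    "n_components": 3,             # sine components in a normal signal
    "normal_amp_range": (-2, 2),   # range of A_1, A_2, A_3
    "normal_freq_range": (-2, 2),  # range of f_1, f_2, f_3
    "anomaly_amp_range": (2, 3),   # |A_anomaly| (sign drawn at random)
    "anomaly_freq_range": (2, 4),  # |f_anomaly| (sign drawn at random)
}
```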

And the whole function to do the same analysis with a custom setup_dict boundary is this one:
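The original function is not reproduced here; a compact sketch that stitches together the pieces above (run_anomaly_detection is my own name) could be:

```python
def run_anomaly_detection(setup_dict, n_points=400, epochs=50, seed=0):
    """Generate data from setup_dict, train the autoencoder, and count the errors."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-8 * np.pi, 8 * np.pi, n_points)

    def mixture():
        A = rng.uniform(*setup_dict["normal_amp_range"], setup_dict["n_components"])
        f = rng.uniform(*setup_dict["normal_freq_range"], setup_dict["n_components"])
        return sum(a * np.sin(fr * x) for a, fr in zip(A, f))

    num = setup_dict["num"]
    normal = np.array([mixture() for _ in range(num)])
    anomalous = np.array([
        mixture()
        + rng.choice([-1, 1]) * rng.uniform(*setup_dict["anomaly_amp_range"])
        * np.sin(rng.choice([-1, 1]) * rng.uniform(*setup_dict["anomaly_freq_range"]) * x)
        for _ in range(num)
    ])

    X_normal = normal.reshape(-1, n_points, 1)
    X_anomalous = anomalous.reshape(-1, n_points, 1)

    model = build_autoencoder(n_points)  # the builder from section 2.3
    model.fit(X_normal, X_normal, epochs=epochs, batch_size=32, verbose=0)

    def mse(X):
        return np.mean((X - model.predict(X, verbose=0)) ** 2, axis=(1, 2))

    threshold = np.percentile(mse(X_normal), 99)
    false_alarms = int((mse(X_normal) > threshold).sum())
    detected = int((mse(X_anomalous) > threshold).sum())
    return false_alarms, detected
```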

And you can just run it like this:
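(Using the sketch above:)

```python
false_alarms, detected = run_anomaly_detection(setup_dict)
print(f"Normal signals wrongly flagged as anomalous: {false_alarms} / {setup_dict['num']}")
print(f"Anomalies detected: {detected} / {setup_dict['num']}")
```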

In the default setup_dict above, the anomalous frequencies and amplitudes are very similar to the normal ones.

Nonetheless, the performance is very good:

Image made by author using code above

Only 44 anomalies are not detected and only 13 normal signals are wrongly detected as anomalous.

Of course, when the normal frequencies and the anomalous frequencies are the same or very similar, it becomes impossible to distinguish a normal signal from an anomalous one, but it is interesting to see how far we can push our Autoencoders in distinguishing normal and anomalous signals.

3. Conclusions

Thank you very much for spending time with me. In this article we did the following:

  • Talked about anomaly detection: We described the problem of anomaly detection, specifically in the case of Time Series in multiple scenarios like engineering, finance, and geology
  • Introduced autoencoders: We explained the idea of autoencoders for anomaly detection. We started with the description of a Deep Learning algorithm, we talked about autoencoders and we introduced the idea of using autoencoders for anomaly detection based on the reconstruction error
  • Generated Synthetic Data: We built our synthetic data of sine waves, both with and without anomalies.
  • Implemented the model: A 1D Convolutional Neural Network (CNN) was used to build the Autoencoder, which was trained to replicate normal signals.
  • Made it customizable: We showed that our method worked with our dataset and made a very simple function that allows users to modify the normal and anomalous datasets and play with them.

4. About me!

Thank you again for your time. It means a lot ❤

My name is Piero Paialunga and I’m this guy here:

Image made by author

I am a Ph.D. candidate at the University of Cincinnati Aerospace Engineering Department and a Machine Learning Engineer for Gen Nine. I talk about AI and Machine Learning in my blog posts and on LinkedIn. If you liked the article and want to know more about machine learning and follow my studies you can:

A. Follow me on LinkedIn, where I publish all my stories
B. Subscribe to my newsletter. It will keep you updated about new stories and give you the chance to text me to receive all the corrections or doubts you may have.
C. Become a referred member, so you won't have any "maximum number of stories for the month" and you can read whatever I (and thousands of other Machine Learning and Data Science top writers) write about the newest technology available.
D. Want to work with me? Check my rates and projects on Upwork!

If you want to ask me questions or start a collaboration, leave a message here or on LinkedIn:

[email protected]

