Anomaly detection in time series with Prophet library

Insaf Ashrapov
Towards Data Science
3 min readJun 4, 2019

--

First of all, let’s define what is an anomaly in time series. Anomaly detection problem for time series can be formulated as finding outlier data points relative to some standard or usual signal. While there are plenty of anomaly types, we’ll focus only on the most important ones from a business perspective, such as unexpected spikes, drops, trend changes, and level shifts. You can solve this problem in two ways: supervised and unsupervised. While the first approach needs some labeled data, second does not, you need the just raw data. In that article, we will focus on the second approach.

Check out my Machine & Deep Learning blog https://diyago.github.io/

Generally, unsupervised anomaly detection method works this way: you build some generalized simplified version of your data — everything which is outside some boundary by the threshold of this model is called outlier or anomaly. So first all of we will need to fit our data into some model, which hopefully will well describe our data. Here comes prophet library. For sure you can try more or less powerful/complicated models such arima, regression trees, rnn/lstm or even moving averages and etc — but almost all of them need to be tuned, unlikely that will work out of the box. Prophet is not like that, it is like auto arima but much, much better. It is really powerful and easy to start using the library which is focused on forecasting time series. It developed by Facebook’s Core Data Science team. You can read more about the library here.

To install Prophet library can be installed through the pypi:

pip install fbprophet

We will be working with some sample data, which has some interesting outliers. You download data here and it looks this way:

Some random time-series

To train the model we will define basic hyperparameters some interval_width and changepoint_range. They can be used to adjust the width of the boundary:

Fitting prophet model

Then we set as outliers everything higher than the top and lowers the bottom of the model boundary. In addition, the set the importance of outlier as based on how far the dot from the boundary:

Then you are ready to get the plot. I recommend to try altair library, you can easily get an interactive plot just by setting adding .interactive to your code.

Finally, we get the result. They seem to be very reasonable:

--

--