Time series forecasting sucks. It’s cumbersome and requires both subject matter and technical expertise. That is, until now.

In 2020, researchers at Stanford and Facebook retooled the Prophet algorithm to include a deep learning component. The main selling point: accuracy improvements of 55–92%. The deep learning portion of the model is built on top of PyTorch, so it's easily extendable. Run time increased by about 4x on average, but time series forecasts are rarely needed in real time, so run time isn't a major issue.

If you need an interpretable yet powerful time series forecast, NeuralProphet might be your best option.

Let’s dive in.

NeuralProphet is a deep learning extension of the original Prophet library. The GAM structure of the model remains unchanged; we simply add several deep learning terms: lagged covariates, future (forecasted) covariates, and autoregression. Three neural network configurations of increasing complexity are described below.

Ok, let’s slow down a bit. We’re going to start from square one and assume you don’t know anything about Facebook Prophet.

The initial Facebook Prophet algorithm (2017) is a very lightweight yet effective time series forecasting model. It was built to be easy to use and interpretable, descriptions that are rarely associated with time series modeling.

According to the original paper, the model succeeded because the researchers reframed time series forecasting as a **curve-fitting** problem instead of an **autoregressive** one. Many prior models, such as ARIMA, lagged and fit data instead of trying to find the functional form of our trend.

The model has three main components as shown in figure 2. *T(t)* corresponds to the trend of our time series after seasonality has been removed. *S(t)* corresponds to our seasonality, whether it be weekly, monthly, or yearly. And finally, *E(t)* corresponds to pre-specified events and holidays.

There is a fitting process for each of these components and, once fit, they often combine to produce a reliable forecast.
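As a rough illustration of how an additive model combines these pieces, here's a toy sketch in NumPy. The component functions are made up for illustration; they are not Prophet's fitted forms:

```python
import numpy as np

# Toy additive combination y(t) = T(t) + S(t) + E(t), mirroring the
# decomposition above with invented component functions.
t = np.arange(0, 365)

trend = 0.05 * t + 10.0                        # T(t): linear trend
seasonality = 2.0 * np.sin(2 * np.pi * t / 7)  # S(t): weekly cycle
events = np.where(t % 100 == 0, 5.0, 0.0)      # E(t): spikes on event days

forecast = trend + seasonality + events
print(forecast[:3])
```

Each term is fit separately, then summed, which is exactly what makes the final forecast so easy to decompose and inspect.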

For a more visual representation of these components, here's a decomposition plot from Prophet's documentation.

Now that we have some foundation on the predecessor of NeuralProphet, let’s move on.

NeuralProphet adds three components to our original framework, as shown on the second line of figure 4.

The first three terms remain mostly unchanged between the two models. The final three are deep learning terms that differentiate the new model from the old. Let’s take a look at each one in turn.

**2.1 — Trend T(t)**

Trend is unchanged from the prior Prophet model. In short, we look to model trend using either linear or logistic growth functions. Below we take a look at logistic growth (figure 5):

Using logistic growth is a traditional and well-accepted solution; where the original Prophet model innovates is that **it allows the parameters of the function to change**. These *change points* are determined dynamically by the model and give far more freedom than static growth-rate and offset parameters.
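To make the changepoint idea concrete, here's a simplified piecewise-linear trend in NumPy. In Prophet the changepoint locations and rate adjustments are learned; here they are hard-coded for illustration:

```python
import numpy as np

# Sketch of a piecewise-linear trend with changepoints (simplified from
# Prophet's formulation; changepoints and deltas are fixed, not learned).
def trend(t, k=0.5, m=1.0, changepoints=(30.0, 60.0), deltas=(0.2, -0.4)):
    """k: base growth rate, m: offset, deltas: rate adjustments at changepoints."""
    t = np.asarray(t, dtype=float)
    rate = np.full_like(t, k)
    offset = np.full_like(t, m)
    for cp, d in zip(changepoints, deltas):
        past = t >= cp
        rate[past] += d           # growth rate changes after the changepoint
        offset[past] -= d * cp    # adjust offset so the trend stays continuous
    return rate * t + offset

print(trend([0, 30, 90]))
```

The offset correction at each changepoint keeps the trend line continuous, so the rate can change without introducing jumps.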

**2.2 — Seasonality S(t)**

Seasonality is defined to be variations that occur at specific regular intervals. It’s notoriously difficult to account for because it can take so many forms.

The initial developers of the model came up with another great idea — instead of trying to model seasonality with autoregression i.e. lagging data, they tried to model seasonality’s curve. And that’s where Fourier Series come in.

A Fourier Series is just a bunch of sinusoids summed together, which can be used to fit any curve. Once we have the functional form of our data’s daily, weekly, monthly, etc. seasonality, we can simply add those terms to our model and accurately forecast future seasonality.
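Here's a minimal sketch of that idea: build Fourier features for a weekly period and fit their coefficients by least squares. Prophet estimates these terms jointly with the other components; this toy example stands alone:

```python
import numpy as np

# Fit weekly seasonality with a Fourier series of a given order via
# ordinary least squares (illustrative, not Prophet's internal fitting).
def fourier_features(t, period=7.0, order=3):
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * n * t / period))
        cols.append(np.cos(2 * np.pi * n * t / period))
    return np.column_stack(cols)

t = np.arange(0, 140)
y = 3.0 * np.sin(2 * np.pi * t / 7.0)         # toy weekly signal
X = fourier_features(t)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # fitted Fourier coefficients
y_hat = X @ beta
print(np.max(np.abs(y - y_hat)))              # near-zero residual on this toy
```

Because the fitted terms are pure sinusoids, evaluating them at future values of `t` extends the seasonality forward for free.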

**2.3 — Events E(t)**

The final term in the initial Prophet model is used to handle events.

Seasonality and events are handled nearly identically — with Fourier Series. However, instead of a smooth curve, we expect the fitted series for a specific holiday to be very spiky. And, because the underlying functions are sinusoidal, they are easily extended into the future.

Now let’s move on to the new model.

**2.4 — Regressors F(t), L(t)**

One of the powerful aspects of Prophet and NeuralProphet is that they allow for covariates. Most time series forecasting models are univariate, although some offer multivariate versions — ARIMA vs. VARIMA, for example.

When handling covariates in a time series forecast, we need to ensure those covariates will be available *n* time periods in advance, or else our model has nothing to forecast with. We can do this either by lagging current covariates by *n* time periods, which is modeled by the *L(t)* term, or by developing a forecast for those covariates, which is modeled by the *F(t)* term.
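The lagging step can be sketched in plain Python (illustrative only): shift a covariate by *n* periods so the model only ever sees values that would actually be known at forecast time.

```python
# Shift a covariate n periods forward so the value used at time t
# is the one observed n periods earlier (n >= 1).
def lag(series, n):
    """The first n entries have no known lagged value."""
    return [None] * n + series[:-n]

temperature = [21, 23, 22, 25, 24, 26]
lagged = lag(temperature, n=2)
print(lagged)  # [None, None, 21, 23, 22, 25]
```

Forecasting the covariate itself (the *F(t)* route) avoids the `None` gap at the start but introduces its own forecast error.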

Once we have our respective covariates, we then throw deep learning at it (section 3).

**2.5 — Auto-Regression A(t)**

Finally, autoregression is the practice of using prior values of a series as predictors for its future values. The original Prophet model was effective partly because it steered away from this philosophy; however, to leverage deep learning we must return to it.

The autoregression term uses lagged values of the target series to predict future values. In practice we rarely have usable covariates, so this is where most of NeuralProphet's power comes from.

With that structure, let’s zoom in on the deep learning models used by NeuralProphet.

NeuralProphet is built on top of PyTorch and AR-Net, so its modules are easily customizable and extendable.

There are a few configurations. The first is **Linear AR**, which is just a single layer neural network (NN) with no biases or activation functions. It’s very lightweight and regresses a particular lag onto a particular forecast step, which makes interpreting the model quite easy.
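A rough stand-in for what Linear AR learns: one weight per lag, here fit with ordinary least squares. NeuralProphet trains the equivalent linear layer with gradient descent; this sketch is just for intuition:

```python
import numpy as np

# Fit a linear AR(p) model: one weight per lag, no bias, no activation.
# Each weight directly tells you how much that lag drives the forecast.
def fit_linear_ar(y, p=3):
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])  # lag matrix
    target = y[p:]
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return w

y = [float(i % 4) for i in range(40)]  # toy periodic series 0,1,2,3,0,...
w = fit_linear_ar(y, p=4)
print(np.round(w, 3))
```

On this toy series the model puts essentially all its weight on the fourth lag back, which is exactly the kind of direct readout that makes Linear AR easy to interpret.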

**Deep AR** is a fully connected NN with a specified number of hidden layers and ReLU activation functions. With the added complexity over Linear AR comes longer training time and less interpretability, but often improved forecasting accuracy. It's also worth noting that you can approximate the lag information Linear AR exposes by summing the absolute weights of the first layer for each input position. It's not perfect, but it's better than nothing.
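That interpretability heuristic takes a couple of lines (the weight matrix here is invented for illustration):

```python
import numpy as np

# Approximate each lag's importance in a Deep AR network by summing the
# absolute first-layer weights per input position (a rough heuristic).
first_layer = np.array([[0.5, -0.1, 0.0],
                        [0.2,  0.3, 0.1],
                        [-0.4, 0.0, 0.05]])  # shape: (hidden units, lags)
lag_importance = np.abs(first_layer).sum(axis=0)
print(lag_importance)  # one score per lag
```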

**Sparse AR** is an extension of Deep AR. For the autoregressive piece, it's often best to use an AR of high order (more values at prior time steps) combined with a regularization term: by feeding in more lags and letting the fitting process automatically shrink the unimportant ones, we are more likely to find signal.
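AR-Net uses its own regularizer, but the effect of sparsity-inducing regularization can be illustrated with a generic lasso-style soft-threshold on a made-up set of AR weights (not AR-Net's exact penalty):

```python
import numpy as np

# Soft-thresholding: shrink weights toward zero; weights whose magnitude
# falls below lam are removed entirely, leaving a sparse model.
def soft_threshold(w, lam):
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

raw_weights = np.array([0.9, 0.02, -0.01, 0.4, 0.003])  # toy AR(5) weights
sparse = soft_threshold(raw_weights, lam=0.05)
print(sparse)  # only the strong lags survive
```

The result is a high-order AR where only the lags that actually carry signal keep nonzero weights.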

Any of the above three methods can be implemented with both covariates and auto-regressed values.

And there you have it, NeuralProphet in all its glory!

To hammer home the concepts, we’re going to quickly summarize.

NeuralProphet is a deep learning extension of Facebook Prophet. It adds on to the prior model by including deep learning terms on both covariates and data in the time series.

The initial model (Prophet) leverages curve-fitting, which was a novel approach to time series forecasting. It provided unparalleled out-of-the-box performance and interpretability, however we needed more modeling power. NeuralProphet adds deep learning terms to Prophet, which are governed by three neural net configurations. NeuralProphet significantly improves model fitting capacity, but slows down performance and reduces explainability.

If Facebook Prophet isn’t cutting it, try NeuralProphet.

*Thanks for reading! I’ll be writing 25 more posts that bring academic research to the DS industry. Check out my comment for links to the main source for this post and some useful resources.*

How to Develop Interpretable Time Series Forecasts with Deep Learning was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

Biomedical NER+L (named entity recognition and linking) is concerned with extracting concepts from free text found in Electronic Health Records (EHRs) and linking them to large biomedical databases like SNOMED-CT and UMLS.

In this post, we will focus on what to do once concepts are extracted from free text, in other words, how to make them useful to clinicians and other researchers.

**Prerequisites**: None, unless you want to replicate the results then you need to be able to use MedCAT and Neo4j.

Assume we got access to a hospital and annotated all the free text in its Electronic Health Records for SNOMED concepts. If we used MedCAT, the output is stored in a .json file. Our aim is to move the extracted concepts into a database that will allow anyone to write simple queries and make use of the annotations, e.g.:

- Return all patients that have *diabetes* and *leg pain*
- Return all diseases found for *patient X*
- Return all patients that have the symptom *vomiting* and are taking the drug *valpam*
- Return all male patients that do not have Alzheimer's but have had two or more mentions of seizure
- Return all patients that have Dementia, or any other disease that is a direct child of the concept Dementia in the SNOMED ontology

While a relational database could do the job (with some twists and turns), a graph database is better suited. Why? First, we want to import the SNOMED/UMLS ontology (basically a directed graph) so that we can easily fulfil the last query in the examples above. Second, our annotations can be naturally represented as a graph: *Patient-[HAS]->Document-[HAS]->Concept*. In words: a patient can have one or many documents, and a document can have one or many mentions of a concept (e.g. diseases).
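To see why the graph shape helps, here's a toy, in-memory stand-in for the Patient→Document→Concept structure and the first example query. The data is dummy and the code is plain Python; a real deployment would run Cypher against Neo4j:

```python
# Toy Patient-[HAS]->Document-[HAS]->Concept graph as plain dicts.
patient_docs = {'p1': ['d1', 'd2'], 'p2': ['d3']}
doc_concepts = {'d1': {'diabetes'}, 'd2': {'leg pain'}, 'd3': {'diabetes'}}

def patients_with_all(concepts):
    """Return patients whose documents jointly mention every given concept."""
    out = []
    for patient, docs in patient_docs.items():
        mentioned = set().union(*(doc_concepts[d] for d in docs))
        if set(concepts) <= mentioned:
            out.append(patient)
    return out

print(patients_with_all(['diabetes', 'leg pain']))  # ['p1']
```

In a graph database this traversal is a one-line pattern match rather than a hand-written loop, and it scales to ontology-aware queries like the Dementia example.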

The rest of the post will be, to some extent, a tutorial (accompanied by a Jupyter Notebook). It assumes that you have a Neo4j database installed and available on localhost:7474. As we cannot use real data, I’ve created two dummy files from mtsamples:

- patients.csv contains basic information about all patients
- documents.csv contains the document text + a document ID

Before we do anything, we need to generate annotations for the free text in documents.csv. If you already have documents processed with MedCAT or any other NER+L tool, you can skip this step.

```python
import json
import pandas as pd
from medcat.cat import CAT

df_docs = pd.read_csv('./documents.csv')

# This would be a generator if we had a lot of docs
data = [(k, v) for k, v in df_docs[['documentId', 'text']].values]

# You can find the model in the MedCAT repository
cat = CAT.load_model_pack('./medmen_wstatus_2021_oct.zip')
docs = cat.multiprocessing(data, nproc=10)

json.dump(docs, open('./annotations.json', 'w'))
```

Create indexes to speed up data ingestion and search (the dataset can be very large):

```python
import pandas as pd
from medcat.neo.data_preparation import *
from medcat.neo.neo_connector import NeoConnector

# Helper for sending requests to Neo4j
neo = NeoConnector('bolt://localhost:7687/', user='neo4j')

# Indexes are pre-defined in the data_preparation helper
for ind in get_index_queries():
    try:
        neo.execute(ind)
    except Exception as e:
        print(e)
```

Next, start the imports.

```python
# Import patients
df_pts = pd.read_csv('./patients.csv')

# If you cannot write to this output_dir, save somewhere else and copy over
q = create_patients_csv(df_pts, output_dir='/var/lib/neo4j/import/')

# Run the query for import
neo.execute(q)
```

In the Neo4j browser (localhost:7474) there will now be 182 new nodes with the label *Patient*. The same process is repeated for *Concepts*, *Documents* and finally *Annotations*. Details are in the accompanying Jupyter Notebook; here we skip ahead and assume everything is imported.

The following query should now work and return a sample of the data:

```cypher
MATCH (p:Patient)-->(d:Document)-->(c:Concept) RETURN p, d, c LIMIT 10
```

The output graph (differences are possible; the important thing is that all three node types come back):
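Building on that pattern, the first natural-language query from our list ("all patients with *diabetes* and *leg pain*") might look roughly like this in Cypher. The concept names are illustrative; real data would match on SNOMED codes rather than names:

```cypher
MATCH (p:Patient)-->(:Document)-->(c1:Concept {name: 'diabetes'})
MATCH (p)-->(:Document)-->(c2:Concept {name: 'leg pain'})
RETURN DISTINCT p
```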

We can now easily translate each one of the natural language queries we’ve shown above into Cypher (the neo4j query language). But, if we do not know Cypher, MedCAT has a small class that can be used to query Neo4j:

```python
# Get all patients that have the concepts (Fever and Sleep Apnea).
# We set the ignore_meta flag as the medmentions model does not
# have all the required meta annotations.
patients, q = neo.get_all_patients(concepts=['C0520679', 'C0015967'],
                                   limit=10, ignore_meta=True)
```

Or to get all diseases assigned to a patient or document:

```python
# Get all concepts for one patient
stream, q = neo.get_all_concepts_from(patient_id='32', bucket_size_seconds=10**10)
```

The results can be displayed as a DataFrame:

That's it. The medcat.neo library will be updated over time with new queries and functions for import/export.

Thank you for reading.

Exploring Electronic Health Records with MedCAT and Neo4j was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
