Image source: Pixabay

What the Law that Forecasted AIDS, Ebola, & SARS Has to Say About the Coronavirus

A Little-Known ‘Law of Epidemics’ from the 1840s

Andre Ye
Towards Data Science
5 min read · Mar 18, 2020


There’s a little-known statistical law from the 1840s. Even though its creator, William Farr, is long gone, his simple Law of Epidemics has proven applicable to many modern epidemics, and it has a lot to offer on the coronavirus.

Note from the editors: towardsdatascience.com is a Medium publication primarily based on the study of data science and machine learning. We aren’t health professionals or epidemiologists. To learn more about the coronavirus pandemic, you can click here.

What is Farr’s Law?

Farr’s Law states that epidemics tend to rise and fall along a roughly symmetric, bell-shaped curve that resembles a normal distribution.

A Bell Curve and statistical properties. Source. Image free to share and use commercially.

This means that in an epidemic, the number of cases starts out small and gradually picks up pace, then slows as it approaches a peak. After the peak, it declines at roughly the same rate at which it rose, eventually dying down to the point where only a few cases remain (perhaps a handful per year).

However, when forecasting the course of an epidemic from a limited amount of data, many normal-like curves with different means and standard deviations are possible. Take this example, which outlines several possible curves given some HIV data points (stars).

Nadia Abuelazam. Image free to use with credit.

Farr’s law says that the statistical normal distribution (with standard deviation of 1 and mean of 0) can be shifted (all values of x increased by some constant) or scaled (x and y values multiplied by some constant) to fit the data.

However, the curve should keep the shape of the standard bell curve, whose standard deviation is 1 before any scaling. In practice, the width is then adjusted to account for real-world context, such as the scale of dates.
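To make the shifting and scaling concrete, here is a minimal Python sketch of such a curve. The function name `farr_curve` and its parameters `peak`, `center`, and `width` are my own labels, not part of Farr’s original formulation, and the numbers are invented for illustration rather than fitted to any real data.

```python
import math

def farr_curve(t, peak, center, width):
    """Scaled and shifted bell curve: new cases at time t.

    peak:   height of the curve at its maximum (vertical scale)
    center: time at which the peak occurs (horizontal shift)
    width:  standard deviation in time units (horizontal scale)
    """
    z = (t - center) / width  # convert back to standard units (mean 0, sd 1)
    return peak * math.exp(-0.5 * z * z)

# Hypothetical parameters chosen only for illustration, not fitted to data.
curve = [farr_curve(day, peak=1000, center=50, width=10) for day in range(101)]

print(max(curve))            # height of the curve at its peak
print(curve[40], curve[60])  # equal values: the curve is symmetric around day 50
```

Changing `center` slides the curve along the time axis, while `peak` and `width` stretch it vertically and horizontally without altering its bell shape.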

Farr’s law has proven useful in several epidemics, including HIV/AIDS, Ebola, and SARS. Part of what makes it successful in epidemic forecasting is that it does not rely on complex polynomial regressions that fit the data in certain segments and then spin wildly out of control. More importantly, it agrees with what epidemiology says about the nature of epidemics and pandemics, namely:

  • The epidemic starts off small, with a few people infected.
  • The epidemic progressively gets worse as the number of people infected grows exponentially (the derivative goes up).
  • At some point, almost everyone within the region of infection who is susceptible has been infected, and the growth in infections begins to slow (the derivative decreases toward 0).
  • The epidemic/pandemic reaches a peak. This is usually when some solution or vaccine is introduced and implemented across a reasonable area.
  • People are still being infected, but progressively fewer, until new infections fall to near zero.
  • At the tail of the epidemic, a few people (perhaps those who have not been vaccinated against the virus) will still get the virus every year.

So what does Farr’s law have to say about the coronavirus?

Applying Farr’s Law to the Coronavirus

China’s situation with the coronavirus seems to be dying down. Just from the real-world data, one can identify the gradual slowing of new cases that is the signature of a peak. The number of coronavirus cases in China will probably begin to taper off in the next few weeks.

In the United States, there is much less data, as the epidemic has only really begun to accelerate. Here’s a very, very optimistic estimate.

This reveals one issue with Farr’s law — when there is little data, it is difficult to scale the normal distribution. The estimate could have just as easily been
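A small Python sketch shows how loosely early data pins down the curve. It assumes only two observations are available and that the width of the bell curve is a free choice. The helper names `cases` and `curve_through` are hypothetical, and the two data points are made-up numbers, not real case counts.

```python
import math

def cases(t, peak, center, width):
    """Bell curve value at time t."""
    return peak * math.exp(-((t - center) ** 2) / (2 * width**2))

def curve_through(p1, p2, width):
    """Return (peak, center) of the bell curve with the given width
    that passes exactly through two (day, cases) observations."""
    (t1, y1), (t2, y2) = p1, p2
    center = (t1 + t2) / 2 + width**2 * math.log(y2 / y1) / (t2 - t1)
    peak = y1 * math.exp((t1 - center) ** 2 / (2 * width**2))
    return peak, center

# Two invented early observations: (day, new cases).
early = [(10, 100), (15, 200)]

for width in (10, 20):
    peak, center = curve_through(early[0], early[1], width)
    print(f"width={width}: peak of about {peak:,.0f} cases around day {center:.0f}")
```

Both curves pass through the same two early points, yet they imply very different peaks and peak dates. That is why a forecast built on little data needs some outside estimate of either the time frame or the peak.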

However, Farr’s Law is helpful in that, given one of two variables, either the expected time frame (how wide the normal distribution will be) or the maximum expected number of confirmed cases (how high the peak is), it can solve for the other.

Given the White House’s recent estimate of an 18-month-long pandemic, here is an estimate for how the coronavirus might run through the US. It would peak at a little over 5 million people (out of a 327 million population).
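The trade-off between time frame and peak can be sketched in a few lines of Python. This is not the calculation behind the estimate above. It is a simplified illustration that assumes the whole outbreak spans six standard deviations (three on each side of the peak) and that a single pre-peak observation is known. The function names and the observation are hypothetical.

```python
import math

# Assumption: the outbreak runs from 3 widths before the peak to 3 widths after.
SPAN_IN_WIDTHS = 6.0

def peak_from_duration(duration, day, cases):
    """Peak height implied by a total duration and one (day, cases) observation."""
    width = duration / SPAN_IN_WIDTHS
    center = duration / 2
    return cases * math.exp((day - center) ** 2 / (2 * width**2))

def duration_from_peak(peak, day, cases):
    """Total duration implied by a peak height and one pre-peak observation."""
    k = math.sqrt(2 * math.log(peak / cases))  # distance to the peak, in widths
    if k >= SPAN_IN_WIDTHS / 2:
        raise ValueError("observation is too small for the assumed span")
    width = day / (SPAN_IN_WIDTHS / 2 - k)
    return SPAN_IN_WIDTHS * width

# Invented observation: 1,000 new cases on day 60 of a 540-day (about 18-month) outbreak.
peak = peak_from_duration(duration=540, day=60, cases=1000)
print(f"implied peak: {peak:,.0f} cases")
print(f"duration recovered from that peak: {duration_from_peak(peak, 60, 1000):.0f} days")
```

Fixing the duration determines the peak, and fixing the peak determines the duration. A different assumption about how many standard deviations the outbreak spans would change both numbers.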

Be aware that the dataset we are working with is confirmed cases. Health agencies have stated that demographics at low risk from the coronavirus do not need to be tested, which may mean that the data undercounts the true number of cases.

The number of deaths does not have the same issue. The current number of deaths is still very low, but according to Farr’s Law deaths will peak at about 3,000 (0.0009% of the US population, or 1 in every 109,000 people).
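The percentage and the "1 in every" figure follow directly from the two numbers quoted in this article, as this short Python check shows. The variable names are mine.

```python
us_population = 327_000_000  # population figure used in this article
peak_deaths = 3_000          # peak death count from the Farr's Law estimate

share_percent = peak_deaths / us_population * 100
one_in = us_population / peak_deaths

print(f"{share_percent:.4f}% of the population")  # 0.0009% of the population
print(f"1 in every {one_in:,.0f} people")         # 1 in every 109,000 people
```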

Note that Farr’s law, like many forecasting methods, has a weakness in that it is limited to numerical data. It is much like predicting the stock market without knowledge of what is happening politically or economically. The most trustworthy forecasts come from the CDC and other health organizations, which have access to more dimensions of data than Farr’s law does.

Conclusion

Farr’s Law is an incredibly simple tool for mapping out the general trend of epidemics and pandemics. While it succeeds at doing this, it has little access to higher dimensions of data and could be wrong in the finer details. The fact is that the coronavirus will get worse for the next few months, but when the death count begins to slow down, Farr’s Law tells you that the coronavirus has probably reached its peak.
