Understanding the 2020 US Stock Market Drop using Data Visualizations

Analyzing the severity of the COVID-19-related US stock market sell-off using data visualizations and historical context.

Harsh Rana
Towards Data Science


If you don’t know about the US stock market drop that occurred over the last 2 months, then you’re either:

  1. A college student, OR
  2. An entry-level worker with no investments and student loans worth a lifetime

Either way, whether or not you’re invested in the market, you should know about what’s happened over the last few weeks. Depending on where you stand, you’re either lucky or unlucky enough to be witnessing a Black Swan event. Quick level set: a black swan event is defined beautifully by Investopedia as:

“An unpredictable event that is beyond what is normally expected of a situation and has potentially severe consequences. Black swan events are characterized by their extreme rarity, their severe impact, and the widespread insistence they were obvious in hindsight.”

In this article, I’ll be quantifying the impacts of the 2020 US stock market drop and comparing it with other major drops from recent (and not-so-recent) history using data science. I’ll be gathering data to help build my cases, constructing comparative statistics to give context and building visualizations to communicate important ideas.

One last thing: I’m only using publicly available data, so all the code for the data visualizations and the raw data used in this article are available on my project GitHub. Now, let's get started!

Quantifying the market turmoil over the last few weeks

There are a few different datasets we can use to understand the absolute turmoil in the markets. Let's start by analyzing the standard statistics of the S&P 500 over the last ~3 months (01/02/20–03/20/20). This is what my dataset looks like:

Standard dataset showcasing open, high, low, close, change% and volume of the S&P 500

You probably see these descriptors every time you search for a stock on the web; now, let's do some analyses to understand them better. I’ll start by showing the simple price/timeline chart which already reveals a lot of information.

You can clearly see that sometime around the middle of February (~02/19), the market volatility started increasing aggressively. Note that the volatility increase is the focus, not necessarily whether the trend was up or down. Let’s take this one step further by adding a visualization of market volume over time, which showcases just how many transactions were taking place.

Now you can also see that market volatility and volume both increased, especially over the last few weeks. At least on the surface, the graph above also shows that the first time the volume really skyrocketed was a day or two after the market took its first few big drops.

Let’s use another visualization to see % change in price and market volume in conjunction. These descriptors use different scales, so we’ll use a derived statistic to spot abnormalities. A good way to do this is the Z-score. Simply put, the Z-score tells you how far a data point is from the mean, measured in standard deviations. Because our sample covers only the last 3 months and includes several extreme days, the mean is heavily distorted, so I’ll be using the modified Z-score, which relies on the median and is less sensitive to outliers and skew. Additionally, I’ll only be using the absolute value of the modified Z-score because I’m only interested in the distance from the median, not the direction. You can see the results below.
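For readers who want to try this themselves, here is a minimal sketch of the absolute modified Z-score. It assumes the common formulation based on the median absolute deviation (MAD) with the 0.6745 scaling constant; the exact variant behind the chart is not stated in this article, and the function name and sample numbers are made up for illustration.

```python
from statistics import median


def abs_modified_z_scores(values):
    """Return the absolute modified Z-score of each value (median/MAD based)."""
    med = median(values)
    # Median absolute deviation: the median distance of the points from the median.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero, so the modified Z-score is undefined")
    # 0.6745 rescales the MAD so that, for normally distributed data,
    # the score is roughly comparable to a number of standard deviations.
    return [abs(0.6745 * (v - med) / mad) for v in values]


# Illustrative daily % changes (made-up numbers, NOT the article's dataset).
sample_pct_change = [0.3, -0.2, 0.1, -0.4, -3.4]
sample_scores = abs_modified_z_scores(sample_pct_change)

for change, score in zip(sample_pct_change, sample_scores):
    print(f"{change:+.1f}% -> |modified z| = {score:.2f}")
```

Because both % change and volume are converted to the same unitless score, the two series can share one axis even though their raw scales differ by orders of magnitude.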

This view supports the earlier observation that the abnormal % change in the market preceded the abnormal volume. From the graph, you can see that something happened over the weekend of February 22nd which sent the % change shooting up from ~0.5 standard deviations from the median all the way to ~2 standard deviations. In a follow-up post, I’ll be exploring the Coronavirus cases over that weekend to further understand what happened.

While the Z-score view of this timeline is useful, one last visualization captures the panic in the market even more directly: below, I simply plot the daily % change in the S&P 500. Notice how the peaks and valleys explode over the last few weeks, and how many outliers show up in the box plot of the same data.
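As a rough sketch of the arithmetic behind this view, the snippet below derives daily % changes from closing prices and flags the points a box plot would typically draw as outliers. It assumes the usual 1.5 × IQR whisker rule; the plotting settings actually used for the chart are not given here, and the function names and closing prices are invented for illustration.

```python
from statistics import quantiles


def pct_changes(closes):
    """Daily % change between consecutive closing prices."""
    return [(curr - prev) / prev * 100 for prev, curr in zip(closes, closes[1:])]


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Illustrative closing prices (made-up numbers, NOT real S&P 500 data).
example_closes = [100.0, 101.0, 100.5, 101.5, 101.0, 91.0, 99.0]
example_changes = pct_changes(example_closes)
example_outliers = iqr_outliers(example_changes)

print([round(c, 2) for c in example_changes])
print([round(o, 2) for o in example_outliers])
```

A calm stretch of trading produces few or no flagged points, so a cluster of outliers in a short window is itself a sign of unusual volatility.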

We’ve spent the last few minutes analyzing the US market drop in isolation. In the next section, we’ll be comparing this drop with others in recent history to get a better context and really answer the following question:

Just how bad was this market drop?

Pretty. Bad.

One of the most useful strategies for understanding the market’s behavior at any point is to get some historical context. In this section, we’ll be comparing the 2020 US stock market drawdown with others from recent history to further analyze the market reaction.

To do this, I collected data about the ten most recent 10%+ drawdowns in the market. As you can see, this list captures some 10–20% drawdowns and also the financial crisis of 2007–2008 and the dot-com bubble of 2000, which were both much more severe. After this, I collected the daily S&P 500 returns during these time periods. My goal here was to compare, among other statistics, the rate of decline of the 2020 stock market drop with that of the drops from the last 20 years.
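To make the "10%+ drawdown" criterion concrete, here is one way to measure a drawdown from a series of closing prices: track the running peak and record the largest percentage fall below it. This is only a sketch of the standard definition, not necessarily how the list above was assembled, and the function name and prices are hypothetical.

```python
def max_drawdown_pct(closes):
    """Largest peak-to-trough decline in the series, as a positive percentage."""
    peak = closes[0]
    worst = 0.0
    for price in closes:
        # The peak only ever moves up; each price is compared with the highest so far.
        peak = max(peak, price)
        drawdown = (peak - price) / peak * 100
        worst = max(worst, drawdown)
    return worst


# Illustrative prices (made-up numbers): a fall from 120 to 90 is a 25% drawdown.
example_prices = [100.0, 120.0, 90.0, 110.0, 130.0, 125.0]
depth = max_drawdown_pct(example_prices)
print(f"Max drawdown: {depth:.1f}%  (10%+ drawdown: {depth >= 10})")
```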

After ideating a few different data visualizations, I decided to use a spaghetti plot to best showcase the severity of the most recent drop. The visualization can be seen below.

I built the graph to show the drop in the value of a $10,000 investment made at the start of each drawdown. This puts every drawdown on a level footing, which makes it possible to compare their severity directly. As the graph shows, the 2020 US market drop was not only much more severe in terms of the absolute drop in investment value, it was also much faster. This can be seen again in the following graph, where I’ve separated each of the drawdowns.
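The $10,000 framing is easy to reproduce from daily returns. The sketch below assumes the investment value is obtained by compounding each day's % return, which is the natural reading of the chart but is not spelled out in this article; the function name and the returns are made up for illustration.

```python
def investment_path(daily_returns_pct, start_value=10_000.0):
    """Value of an investment after each day, compounding daily % returns."""
    values = [start_value]
    for daily_return in daily_returns_pct:
        values.append(values[-1] * (1 + daily_return / 100))
    return values


# Illustrative daily returns in % (made-up numbers, NOT the article's data).
example_returns = [-3.0, -4.5, 1.2, -7.6, 4.9]
path = investment_path(example_returns)

for day, value in enumerate(path):
    print(f"Day {day}: ${value:,.2f}")
```

Starting every drawdown at the same value and at day 0 is what lets lines from different years be overlaid on a single spaghetti plot.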

Well, there you have it! The 2020 US market drop was by no means a minor event. Over the course of 30 days, it wiped out almost $8T from the overall market cap of the S&P 500. Written out, that’s close to $8,000,000,000,000. In the future, I’ll be looking at datasets associated with Coronavirus to further analyze the market behavior.

I hope you enjoyed reading this article as much as I enjoyed putting it together. If you have any questions, feedback or comments, feel free to get in touch with me through LinkedIn or my website. Additionally, all the code for the data visualizations and the raw data used in this article are available on my project GitHub.

Stay safe out there everyone and keep building cool stuff.


A software engineer with a passion for data, personal finance and health.