
Stock Market Outcomes Are Currently Bernoulli Distributed

What That Means For Equity Risk

Photo by Simon on Unsplash

Note from Towards Data Science‘s editors: While we allow independent authors to publish articles in accordance with our rules and guidelines, we do not endorse each author’s contribution. You should not rely on an author’s works without seeking professional advice. See our Reader Terms for details.

We currently stand at a crossroads. Stimulus or no stimulus? Vaccine or no vaccine? A growing wave of COVID and hospitalizations or a quick plateau?

These questions are currently unanswerable – everyone has an opinion but nobody has the answer.

This vast uncertainty about the future is why the VIX (which measures the expected forward volatility, a.k.a. annualized standard deviation, of the S&P 500’s returns) has remained stubbornly elevated despite stock markets hovering near all-time highs. I wrote about this phenomenon previously:

What’s Up With The VIX?


What’s The VIX And Why Should We Care?

Finance and data science have a ton of overlap. I go into much more detail in the previously linked post but let’s quickly go over what the VIX is.

The VIX is the implied volatility of the S&P 500. In other words, it’s the annualized standard deviation of the S&P 500’s return (over the next 30 days) expected by the market. It’s calculated via the prices of a basket of options on the S&P 500 (implied volatility is a key driver of option prices, so given the price, we can back out the implied volatility).

Even though it claims to be forward looking, historically the VIX has looked a lot like the S&P 500’s realized volatility over the past 30 days – so it hasn’t been very forward looking:

VIX vs. past 30 days realized volatility (Source: Sharadar, Graphic created by author)
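For readers who want to reproduce the realized volatility series, here is a minimal sketch of how a trailing 30-day realized volatility could be computed. The function name `realized_volatility`, the 30-trading-day window, the 252-trading-day annualization factor, and the synthetic price series are all assumptions for illustration; the chart above may have been built differently.

```python
import numpy as np

def realized_volatility(prices, window=30, trading_days=252):
    """Annualized trailing standard deviation of daily returns, in percent."""
    prices = np.asarray(prices, dtype=float)
    daily_returns = prices[1:] / prices[:-1] - 1.0
    vols = []
    # Slide a fixed-size window over the daily returns
    for end in range(window, len(daily_returns) + 1):
        window_returns = daily_returns[end - window:end]
        # Scale the daily standard deviation up to an annual figure
        vols.append(window_returns.std(ddof=1) * np.sqrt(trading_days) * 100)
    return np.array(vols)

# Synthetic prices for illustration only (NOT real S&P 500 data)
rng = np.random.default_rng(0)
fake_prices = 100 * np.cumprod(1 + rng.normal(0, 0.01, size=250))
vol_series = realized_volatility(fake_prices)
print(vol_series[-1])
```

Because the output is annualized and in percent, it is on the same scale as the VIX and the two can be plotted on one axis.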

But that doesn’t mean it’s completely without signal. Many investors, myself included, have complained about the recent complacency of markets against a potentially catastrophic economic backdrop.

While raw equity prices don’t reflect much fear, implied volatility does. Look at the following scatter plot. It plots where the S&P 500 is trading relative to its previous all-time-high (x-axis) against the level of the VIX (y-axis). There is a clear negative relationship between them – meaning that when the S&P 500 is trading close to its previous all-time-high or making new highs, the VIX tends to be low (and when there are large drawdowns, the VIX tends to be high).

Note: a VIX level of say 50 can be interpreted as an expected annualized volatility over the next 30 days of 50%. That translates to an expected 30 day volatility of 50%/sqrt(12) = 14.4%.
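The conversion in the note can be written as a small helper. This is just the square-root-of-time rule, which assumes returns are independent from one period to the next; the function name `annualized_to_period_vol` is my own.

```python
import math

def annualized_to_period_vol(annual_vol_pct, periods_per_year=12):
    """Scale an annualized volatility down to one period (default: one month)."""
    return annual_vol_pct / math.sqrt(periods_per_year)

print(round(annualized_to_period_vol(50), 1))  # 14.4
```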

S&P 500 drawdown from prior peak vs. VIX (Source: Sharadar, Graphic created by author)

The blue dots are all the trading days before 8/1/2020 (from 1/1/2010 to 7/31/2020) and the orange dots are all trading days from 8/1/2020 and onwards. Notice how for a given drawdown level (on the x-axis), the orange dots are either at or beyond the top range of the blue dots.

This means that markets are behaving somewhat anomalously: the VIX is higher than it has historically been after controlling for the stock market’s level.
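The x-axis of the scatter plot is the drawdown from the prior peak. A minimal sketch of that calculation follows, assuming the peak is the running maximum of the price series up to each day; the function name `drawdown_from_peak` and the example prices are made up for illustration.

```python
import numpy as np

def drawdown_from_peak(prices):
    """Fractional distance of each price below its running all-time high."""
    prices = np.asarray(prices, dtype=float)
    running_peak = np.maximum.accumulate(prices)  # highest price seen so far
    return prices / running_peak - 1.0            # 0 at a new high, negative otherwise

example_prices = [100, 110, 99, 105, 121]         # hypothetical prices
drawdowns = drawdown_from_peak(example_prices)
print(drawdowns)
```

A value of 0 means the index is at a new high, and a value of -0.10 means it is trading 10% below its previous peak.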

For reference, in February of this year, prior to all the madness, the SPY (a popular S&P 500 ETF) was trading around 335. At the same time, VIX was at 13.7 (13.7% annualized implied volatility). As of this writing, the SPY is higher than it was back in February with a value of 343, yet the VIX is at 29 – more than double what it was! In other words, the volatility of the S&P 500 over the next month is expected to be twice as high despite stock prices having fully recovered.

The major practical implication is that portfolio insurance (VIX call options and S&P 500 put options) is much more expensive today than it was at the start of this year. And that’s made it much harder to hedge equity risk.


Why Is This Occurring?

The news will say the usual – election uncertainty, economic anxiety, SoftBank buying up massive amounts of tech stock call options, etc.

True, those all impact things on the margin. But I think the answer is simpler. Usually when we look forward, market returns are expected to be roughly normally distributed (see linked blog post below if you would like the details).

Are Stock Returns Normally Distributed?

If you plotted all the daily returns out on a histogram, it should look more or less like a bell curve and it does:

Actual return distribution vs. theoretical (Source: Sharadar, Graphic created by author)

Currently, market returns are not normally distributed. Below, I’ve plotted the S&P 500’s daily return distribution of the past two months (orange) against the return distribution of the two months before the COVID market crash (blue). Notice how the orange distribution looks bimodal.

Actual return distribution vs. theoretical (Source: Sharadar, Graphic created by author)

Recent market action can be summarized simply: on days when market participants lower their estimated probability of more government stimulus, the market falls; on days when they raise it, the market rises.

But ultimately, it will either be yes or no. Either the economy gets enough stimulus to jumpstart it in conjunction with COVID receding or it does not. In other words, either we get a happy outcome or a sad outcome, no in between.

Bimodal outcomes like that can be modeled with a Bernoulli distribution. An example of a Bernoulli distributed random variable is a coin flip – either you get heads with probability p or tails with probability 1-p.

That’s like the stock market right now – either things turn out OK (stimulus, vaccine, economic recovery) or they don’t (not enough stimulus, vaccine doesn’t work, economic disaster).

A Bernoulli distributed random variable has a standard deviation of:

Standard deviation of Bernoulli random variable
  = sqrt(p*(1 - p))
Where p is the probability of success. (Without the square root, p*(1 - p) is the variance.)

Let’s come up with some numbers and calculate a standard deviation. Assuming the following:

  • 75% chance of things turning out well.
  • If things turn out well, the market will rise 20%.
  • If things turn out badly, the market will crash -50%.

We can calculate the expected standard deviation of the stock market as follows:

stock market standard deviation
= (0.2-(-0.5)) * (0.75*(1-0.75))**0.5 = 30.3%
which implies a VIX of 30.3
** denotes exponent
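The same arithmetic can be wrapped in a small function so you can plug in your own scenario. For a variable that takes one value with probability p and another with probability 1-p, the standard deviation is the gap between the two values times sqrt(p*(1-p)). The function name `two_outcome_std` is my own, and reading the result as an annualized volatility comparable to the VIX is the simplifying assumption made in this post.

```python
import math

def two_outcome_std(p_good, good_return, bad_return):
    """Standard deviation of a return that is good_return with probability
    p_good and bad_return otherwise."""
    spread = good_return - bad_return
    return abs(spread) * math.sqrt(p_good * (1 - p_good))

implied = two_outcome_std(0.75, 0.20, -0.50)
print(f"{implied:.1%}")  # 30.3%
```

Changing the inputs shows how sensitive the result is: a larger gap between the two outcomes, or a probability closer to 50%, pushes the standard deviation up.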

This is pretty close to the current observed VIX level of 29.3. If you want to double check the above calculation you can run the following Python code, which simulates the two outcomes. It produces approximately the same answer.

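import numpy as np

# Simulate 100,000 draws of the two-outcome market return
result = []
for i in range(100000):
    if np.random.random() < 0.75:   # 75% chance things turn out well
        result.append(0.2)
    else:                           # 25% chance things turn out badly
        result.append(-0.5)

print(np.std(result))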

Bernoulli Distributed Returns Are Riskier

29.3 is pretty high. For reference, the mean realized 30 day volatility of the S&P 500 over the past ten years has been 16.5%.

I had previously thought that implied volatility was too high and thus option premiums were rich. I no longer think that. A coin flip outcome is scary because it’s very risky. You may quibble with the inputs I used above in my calculation, but it’s clear that if the health and economic situation don’t get better soon, then it will get a lot worse. There’s only so much the Federal Reserve and QE can do.

Why do I say Bernoulli distributed returns are riskier? Imagine two games:

  1. Game 1: you flip 100 coins. For each coin, if it’s heads you win $1,500, if it’s tails you lose $1,000. The outcome of game 1 will be approximately normally distributed.
  2. Game 2: you only flip one coin. If it’s heads you win $150,000, if it’s tails you lose $100,000. The outcome of game 2 will be Bernoulli distributed.

The expected value of both games is $25,000. But they are obviously not equal. Game 2 is much riskier: either you’re rich or you are out a ton of money. We can simulate both games with the following code:

import numpy as np

# 100K simulations of Game 1 - Flip 100 coins
all_results = []
for j in range(100000):
    result = 0
    for i in range(100):
        if np.random.random() > 0.5:   # heads: win $1,500
            result += 1500
        else:                          # tails: lose $1,000
            result += -1000
    all_results.append(result)

print('Game 1:', np.mean(all_results), np.std(all_results))

# 100K simulations of Game 2 - Flip 1 coin
result = []
for i in range(100000):
    if np.random.random() > 0.5:       # heads: win $150,000
        result.append(150000)
    else:                              # tails: lose $100,000
        result.append(-100000)

print('Game 2:', np.mean(result), np.std(result))

From my code’s outputs, I can see that game 1 (where we flip 100 coins) has a standard deviation in outcomes of $12,500. That’s pretty good compared to its expected value of $25,000.

Game 2, on the other hand, has a standard deviation of $125,000! That’s huge compared to its expected value.
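You can also check both standard deviations without a simulation. For independent flips, the mean grows with the number of flips while the standard deviation grows only with its square root, which is why spreading the same stakes over 100 small bets is so much safer. The helper `coin_game_stats` below is a name I made up for this sketch, and it assumes a fair coin by default.

```python
import math

def coin_game_stats(n_flips, win, loss, p_heads=0.5):
    """Exact mean and standard deviation of the total payoff from
    n_flips independent coin flips."""
    mean_per_flip = p_heads * win + (1 - p_heads) * loss
    std_per_flip = abs(win - loss) * math.sqrt(p_heads * (1 - p_heads))
    # Means add up linearly; standard deviations add up with sqrt(n)
    return n_flips * mean_per_flip, math.sqrt(n_flips) * std_per_flip

print(coin_game_stats(100, 1500, -1000))    # (25000.0, 12500.0)
print(coin_game_stats(1, 150000, -100000))  # (25000.0, 125000.0)
```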

We can visualize this with the histogram of the two games’ simulated outcomes. Despite sharing the same expected return, game 2 clearly has much more risk of downside. There’s very little chance of losing money with game 1 (and even then you only lose a little bit). With game 2, you have a 50% probability of losing a huge chunk of change.

Histogram of outcomes for game 1 and game 2 (Graphic created by author)
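To put a number on "very little chance of losing money" in game 1, here is a sketch that adds up the exact binomial probabilities of every losing outcome. It assumes a fair coin, and the function name `prob_losing_game1` is my own.

```python
from math import comb

def prob_losing_game1(n_flips=100, win=1500, loss=-1000):
    """Exact probability that the total payoff is negative with a fair coin."""
    losing_ways = 0
    for heads in range(n_flips + 1):
        payoff = heads * win + (n_flips - heads) * loss
        if payoff < 0:
            # Number of flip sequences with exactly this many heads
            losing_ways += comb(n_flips, heads)
    return losing_ways / 2 ** n_flips

p_loss = prob_losing_game1()
print(p_loss)
```

You need 39 or fewer heads out of 100 to lose money in game 1, which happens a little under 2% of the time. In game 2 the probability of a loss is 50%.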

While this example is stylized, it’s similar in spirit to the current market situation. In facing a coin flip, stock market participants currently face as much uncertainty as they ever have. And while it might not be priced into stock market valuations, it is priced into the VIX, which corresponds to the price of investment insurance. Be careful everyone!


More Stories By Yours Truly

iBuyers Bring Convenience And Liquidity To The Real Estate Market

Understanding Linear Regression

