
The Devil’s Roulette: a Failure of Expected Value

Notes on Risk, Reward and Ruin in Stochastic Games

With the propagation of data, technocrats and pop-books about statistics, an increasingly large portion of the population is familiar with Expected Value. For the uninitiated, Expected Value is the ‘average’ result from some random process; statistics texts denote this as E[X]. Sometimes this means the most common result. If you randomly selected a computer science PhD candidate you would expect them to be a man, because the overwhelming proportion of comp sci PhD candidates are men. Sometimes this means the value that minimizes your error. If you sample a person at random in the US you would expect them to be about 38, not because most people are 38 but because it minimizes the amount you miss by on average.

Expected Value is a very useful concept in many instances. It will tell you what to, well, expect. And if we know what to expect it is not unreasonable to think we should know what to do. If you are going to Iceland in winter you should expect it to be cold and pack a coat. If you go to med school you should expect to accrue $200k in debt and maybe just be a software engineer instead. If you are offered a bet that has a positive expected return you should take it.

The focus of this article will be on that last statement. A great interview question guaranteed to mess with new grads from quanty programs: can you devise a betting game with negative expected value that you are guaranteed to win? Somewhat surprisingly the answer is yes, though it requires some supernatural oddities. Suffice to say, you should never play roulette with the Devil.

The Stooge

Let’s introduce the first character. The Stooge is a recently minted PhD from some notable institution in a field like Political Science. The Stooge is familiar with statistics & hypothesis testing and has a firm belief in their power. They live their life by a few simple maxims:

  1. If the expected value of an event is positive do it
  2. Take the smallest risk possible so you can play the game again
  3. If something seems too good to be true it probably is, so be careful

These seem like good rules. The first makes sure you won’t play a losing game like the suckers buying lotto tickets. The second makes sure that you tend to capture the expected value of an event and reduce your exposure to variance. The third is due diligence.

After a long day of polling work near the Mississippi Delta in late August, our protagonist begins to head to his hotel. If you have never had the pleasure of living in Mississippi, it’s hot there in August. The pollster wanted a drink. At an intersection near his hotel he sees a bar he hadn’t noticed earlier, and it just happens to have his favorite beer, so in he goes.

The bar is a little divey with an oddly well-dressed bartender, a few patrons and a video roulette machine. The Stooge sits down next to it, curious how badly players would be fleeced. The game is mostly what you would expect: 38 numbers on a dial colored black & red (curiously no green), and odds that are only slightly awful. A single number bet (1/38 chance to hit) pays out 37x, so a $1.00 bet tends to return $.97…not good. The first 12 (12/38 chance to hit) pays out 3x, so your dollar gets you about $.95 back…still not good. Anyone familiar with a casino knows that they make their money by having each bet favor the house, but only slightly.

Curiously, though, a color bet (1/2 chance to hit) also pays out 3x. The Stooge’s interest is piqued. The expected value of a dollar bet is $1.50. It’s a winner. The machine is certified by Mississippi’s Society for Actuarial Testing and Accurate Numbering, a reasonably well regarded organization in the gambling world. The bet size for color bets is $10, and a few cents for all others.
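The arithmetic behind these three bets is easy to check. The sketch below assumes that a payout of "Nx" means N dollars are returned per dollar staked, stake included, which is the reading consistent with the $.97 and $1.50 figures above. The function name `expected_return` is made up for illustration.

```python
from fractions import Fraction


def expected_return(p_win, payout_multiple, stake=1):
    """Expected amount returned for a stake.

    Assumption: a win returns payout_multiple * stake (stake included)
    and a loss returns nothing.
    """
    return p_win * payout_multiple * stake


# (probability of hitting, payout multiple) for each bet described in the text
bets = {
    "single number": (Fraction(1, 38), 37),
    "first 12": (Fraction(12, 38), 3),
    "color": (Fraction(1, 2), 3),
}

for name, (p_win, multiple) in bets.items():
    ev = expected_return(p_win, multiple)
    print(f"{name}: ${float(ev):.3f} returned per $1 staked")
```

Anything below $1.00 favors the house; the color bet is the only one that returns more than it costs on average.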

A Strange Game

Our protagonist consults his maxims. Is the payout positive? Absolutely, if the game is fair he’s drinking for free tonight. Is the risk small enough he can play again? With $200 in petty cash for the trip he can play 20 times. Does it seem too good to be true? Yes! The game could be rigged even in spite of the certification.

The Stooge has a solution: he takes out a dollar and plays 50 rounds betting on individual numbers, recording the red and black results as well as wins. The tally: 27 red, 23 black, 2 wins. Applying a simple T-Test (a statistical test for the veracity of ideas), the results imply a fair game, or at least one that’s not sufficiently rigged against the player to make the color bet unattractive. The Stooge loads $200 into the machine.
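For readers who want to reproduce the fairness check, here is a minimal sketch. It swaps the T-Test named above for an exact two-sided binomial test, which is a natural fit for a red/black count; the function name `binomial_two_sided_p` is hypothetical and only the standard library is used.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than
    the observed count under the null hypothesis (here, a fair wheel).
    """
    pmf = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small tolerance so ties are not dropped by floating-point noise
    return sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))


p_value = binomial_two_sided_p(27, 50)
print(f"p-value for 27 reds in 50 spins: {p_value:.2f}")
```

The p-value comes out well above the usual 0.05 threshold, so 27 reds in 50 spins gives no reason to reject a fair wheel. It also says nothing about how the bet sizes will behave later, which is where the trouble lies.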

He wins the first roll and is up $30. The machine changes the bet size for a color bet to $40. Spin again, winner, up $120. The bet size changes again to $130. As The Stooge considers walking away the payout on the machine improves to 4x. Another spin, a loss. He’s down $10 now, the game resets the bet size to $10 and the payout to 3x. The bet is still small, the expected value still positive, and now the Doc is stuck. He spins again.

An example of 5 rounds of the game

The Devil Wins Every Time

This game continues until the bettor is out of cash, and eventually he will always be out of cash. Note the challenge here is to devise a game that doesn’t simply tend to make The Devil win, but guarantees it. Guarantee means you will eventually get a result with absolute certainty, while tends implies only that a result is more likely than not. What is guaranteed in this game is that the bettor will eventually lose a spin. The Devil exploits that by increasing the bet size in proportion to the payout, trying to coax the opponent into ruin.

Ruin here can mean a few things. First, it can describe a situation where the Stooge gets phenomenally unlucky and loses 20 spins in a row. The odds of this are astronomically small, and the results uninteresting, so we’ll note it can happen and move on. Second, it can describe what happens when you bet too much and lose, even if the odds look good. In either case your wallet gets much thinner.

Example of a game and the slow march to 0

If we analyze this game in a little more detail we see that the roulette spin itself is a recurrent subgame of the much larger actual game. The subgame has positive expected value: you will be right half the time and receive more than a 2x payout for it. The recurrent game has negative expected value: you will eventually lose and when you do you will lose everything.
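The gap between the subgame and the full game can be seen in a small simulation. This is a sketch under stated assumptions, not the machine's actual rules: the forced bet is taken to be everything won in the current streak plus the $10 base bet, the payout stays at 3x (stake included) rather than rising, and the player never walks away. Dollar amounts after a win therefore differ slightly from the narrative above. The name `play_until_ruin` is made up.

```python
import random


def play_until_ruin(wallet=200, base_bet=10, payout=3, p_win=0.5,
                    seed=0, max_spins=100_000):
    """Play the color bet without ever walking away.

    Assumed rule (hypothetical): the forced bet equals the winnings of
    the current streak plus base_bet, so a single loss ends the streak
    exactly base_bet below where the streak started.
    Returns (number of spins played, final wallet).
    """
    rng = random.Random(seed)
    streak_gain = 0
    spins = 0
    while wallet >= base_bet and spins < max_spins:
        bet = streak_gain + base_bet
        spins += 1
        if rng.random() < p_win:
            profit = bet * (payout - 1)  # payout multiple includes the stake
            wallet += profit
            streak_gain += profit
        else:
            wallet -= bet
            streak_gain = 0  # the machine resets the bet size
    return spins, wallet


for seed in range(3):
    spins, final_wallet = play_until_ruin(seed=seed)
    print(f"seed {seed}: broke after {spins} spins, wallet ${final_wallet}")
```

Every individual spin is favorable, yet each streak can only end one way: with a loss that costs $10 more than the streak earned. Under these assumptions a $200 wallet survives exactly 20 streaks, however long each one runs.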

Another way to interpret this: while our protagonist has good rules of thumb for navigating an uncertain world, they have no grounding concept of risk management. A decent start here might be to add a 4th maxim that increased risk should be matched by increased reward. In investment speak we’d like to generate alpha, in business speak we’d like to get positive risk adjusted NPV. Most people would take the second bet in the game’s series: $40 for a 3x payout. Let’s try a strategy where you walk away when the ratio of payout multiplier to bet size is less than .1:

A few examples of Risk aware stopping, you still lose everything

Not so great. Your opponent can increase the payout to keep you playing, and while your returns are spikier – in one of the sample games above your wallet climbs above $4000 – you still eventually lose everything.

Know When to Walk Away

Yet another interpretation of this game is as follows. The Devil is paying The Stooge a premium early to make suicidal bets later. Suicidal bets are bets so large that they result in ruin. In my experience it’s a good idea to avoid suicide. One way to do this is to not play. Not playing has an expected value of 0, which is better than losing everything but worse than a policy of walking away after the first spin (expected return $15).

If you do end up playing Roulette with the Devil you should have 2 goals. The first is to not go bust; many strategies that let you walk away once the bet is above a certain low threshold will accomplish this. It’s not every day that it costs a dollar to buy $1.50 though, leading to the second goal: get what you can while you can. Threshold approaches don’t do this. Instead we need a strategy that weighs reward against risk of ruin rather than just risk of loss.

A few games with Kelly Style Stopping

The above runs were generated using a specific stopping strategy known as the Kelly Criterion. All of them end with positive returns. Kelly stopping does not guarantee a win, but it does tend to produce them. The derivation and history of the formula are beyond the scope of this article, though they are well documented. The intuition, however, is well within scope. If we’re thinking about risk, reward and ruin we know:

  1. We are only ruined when we lose
  2. We only collect payouts when we win
  3. We want to make bigger bets as payout improves
  4. We become ruined when our bets become too large relative to our wallet

If we turn these statements into a formula we would want to limit our bets based on the likelihood of losing, the payout, and the size of our wallet. As our wallet grows, the payout increases or the probability of winning increases, we increase the max bet we are willing to make. As we lose money, the payout decreases or our chance of losing grows, we bet less or walk away. Finally, we want a limiting factor that, even if payouts become arbitrarily large, allows us to walk away. Kelly stopping does this in an optimal way.
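A minimal sketch of that intuition uses the standard Kelly fraction f* = p - q / b, where p is the chance of winning, q = 1 - p, and b is the net odds. How the runs above were actually generated is not shown, so the stopping rule here is an assumption: walk away whenever the forced bet exceeds the Kelly share of the current wallet. Both function names are hypothetical, and the payout multiple is read as including the stake.

```python
def kelly_fraction(p_win, payout):
    """Kelly fraction f* = p - q / b of the wallet to risk on one bet.

    payout is the total multiple returned on a win (stake included),
    so the net odds are b = payout - 1.
    """
    b = payout - 1
    if b <= 0:
        return 0.0  # a bet that cannot profit deserves no stake
    return max(0.0, p_win - (1 - p_win) / b)


def should_walk_away(wallet, required_bet, p_win=0.5, payout=3):
    """Assumed stopping rule: leave when the forced bet exceeds the Kelly stake."""
    return required_bet > kelly_fraction(p_win, payout) * wallet


# Wallet and bet sizes follow the story's first three spins
print(should_walk_away(wallet=200, required_bet=10))             # opening bet
print(should_walk_away(wallet=230, required_bet=40))             # second bet
print(should_walk_away(wallet=320, required_bet=130, payout=4))  # third bet, 4x
```

At even odds of winning, the fraction is 25% of the wallet for a 3x payout and can never exceed 50% no matter how large the payout grows. That ceiling is the limiting factor: once the forced bet passes it, no sweetened payout justifies another spin.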

In the real world, of course, we often don’t know the payout, nor do we know the odds. We hire quants, data scientists, experts and late night voodoo priests from MBBs to try to work them out; though, of course, there is uncertainty in their forecasts. One thing we do know, though, is what we can reasonably lose and keep moving forward. While probabilities and payouts are important, for any sufficiently long game they pale in comparison to managing ruin.


Disclaimer: The views expressed herein are those of the author and not necessarily those of any employer, organization or affiliate thereof.

© 2021 Douglas Hamilton

