Statistical Thinking

How to set up hypotheses

Cassie Kozyrkov
Towards Data Science
5 min read · Jun 18, 2019

Some readers asked for an easy fact-based (no uncertainty or probability) example to accompany my article “Never start with a hypothesis” to show how setting up the decision context works. Your wish is my command!

Let’s play through two scenarios with two different default actions to see how it works. Imagine I’ve just gotten a call from my friend: “Shall we go out tonight?”

Step 0: Get in touch with your feelings

Scenario 1 - I live in a state of permanent FOMO. The streets are full of adventure!

Scenario 2 - It has been a long week. How comfy is this couch? Soooooo comfy.

The scenarios in this example have nothing to do with one another. They represent two parallel universes. I’m only showing you both for pedagogical purposes; in practice you only live in one world and so you only see one scenario.

Step 1: Write down the default action

Scenario 1 — By default, let’s go out! Convince me not to.

Scenario 2 — By default, I’m staying in. Convince me not to.

It’s up to the decision-maker to pick their default action however they like. This action is simply whatever you commit to doing if you get no more information. These default actions are plausible for different versions of parallel-universe-me. Neither one is “correct” in any way except insofar as I’m honest about doing it.

Step 2: Write down the alternative action

Scenario 1 — The alternative action is staying in.

Scenario 2 — The alternative action is going out.

This is simply the opposite of the default action.

Step 3: Describe the null hypotheses (H0)

Scenario 1 — I’m happy to do my default (go out) if the weather’s good. The null hypothesis is that there’s no rain.

Scenario 2 — I’m happy to do my default (not go out) if there’s no fun live music. The null hypothesis is that my music options are certifiably lame tonight.

Step 4: Describe the alternative hypotheses (H1)

Scenario 1 — What’s the opposite of my null? Rain. Ugh. Worse than missing out. That would convince me to change my mind.

Scenario 2 — What’s the opposite of my null? A band I like has a live show near me. That would convince me to change my mind.
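For readers who think in code, the four steps above can be written down as a plain record before any data shows up. This is only a sketch: the `DecisionContext` class and its field names are made up for illustration and do not come from the original walkthrough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionContext:
    """Hypothetical container for Steps 1 to 4; all names are illustrative."""
    default_action: str      # Step 1: what I do with no further information
    alternative_action: str  # Step 2: the opposite of the default
    null_hypothesis: str     # Step 3: the world in which the default is fine
    alt_hypothesis: str      # Step 4: the world that would change my mind


scenario_1 = DecisionContext(
    default_action="go out",
    alternative_action="stay in",
    null_hypothesis="no rain",
    alt_hypothesis="rain",
)

scenario_2 = DecisionContext(
    default_action="stay in",
    alternative_action="go out",
    null_hypothesis="no fun live music tonight",
    alt_hypothesis="a band I like has a live show near me",
)

for name, ctx in [("Scenario 1", scenario_1), ("Scenario 2", scenario_2)]:
    print(f"{name}: do '{ctx.default_action}' unless I learn: {ctx.alt_hypothesis}")
```

Everything in the record is fixed before the evidence arrives, which is the point of doing the steps in this order.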

Now let’s look at three different ways some data might arrive…

Analyze the evidence and decide! (Boring data)

Scenario 1 — You show me data: “Scientists just discovered a new kind of sea slug.” What should I do? I learned nothing that changes my mind, let’s go out!

Scenario 2 — You show me data: “Scientists just discovered a new kind of sea slug.” What should I do? I learned nothing that changes my mind, I’m staying in.

Notice that the sea slug factoid leads to a failure to reject the null, not an acceptance of the null. It might still be raining, I just don’t know that from the data, so I keep doing what I was going to do anyway.
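The asymmetry between failing to reject and accepting can be sketched in a few lines. The `decide` function below is an illustrative assumption, not a standard API: it treats evidence as a set of plain facts and leaves the default action only when one of those facts contradicts the null.

```python
def decide(default_action, alternative_action, contradicts_null, evidence):
    """Hypothetical fact-based decision rule (no probabilities involved).

    contradicts_null: the fact that would make the null hypothesis untenable.
    evidence: a set of facts that have actually been observed.
    """
    if contradicts_null in evidence:
        # The null is rejected, so switch to the alternative action.
        return alternative_action
    # Nothing contradicts the null. That is a failure to reject, not proof
    # that the null is true, so the default action stands.
    return default_action


evidence = {"new kind of sea slug discovered"}

# Scenario 1: default is going out, and rain would change my mind.
print(decide("go out", "stay in", "it's raining", evidence))  # go out

# Scenario 2: default is staying in, and a good show would change my mind.
print(decide("stay in", "go out", "a band I like is playing", evidence))  # stay in
```

An empty evidence set gives the same answers as the sea slug. The code never concludes that it is not raining; it simply has no reason to leave the default.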

Analyze the evidence and decide! (Surprising data)

Scenario 1 — You show me data: “It’s raining.” What should I do? Not my default action. I’ll stay in.

Scenario 2 — You show me data: “Spencer Krug is on stage in an hour.” What should I do? Not my default action. I’ll get off the couch for Spencer any day.

These are easy facts to interpret — they make my null hypothesis look ridiculous (in fact, the p-value is 0) so they force me to take the alternative action in each case. What if we’d gotten the same data but we flipped which scenario (world) it shows up in?

Analyze the evidence and decide! (Boring data)

Scenario 1 — You show me data: “Spencer Krug is on stage in an hour.” What should I do? I learned nothing that changes my mind, let’s go out!

Scenario 2 — You show me data: “It’s raining.” What should I do? I learned nothing that changes my mind, I’m staying in.

If you’re not careful, you might make the mistake of thinking that this flipped evidence has something to do with the decision. Actually, it doesn’t change my mind any more than the sea slug factoid would… even though I’d rather go to a good show. I would have gone out in Scenario 1 regardless of the music. Sure, knowing there’s an awesome show makes me feel better about going out, but I would have done that anyway.
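To see all of the combinations at once, here is a small sketch that feeds each fact to each scenario. The dictionary layout and the fact strings are illustrative assumptions; the only rule encoded is the one from the walkthrough, that a scenario switches actions only when the data contradicts its own null hypothesis.

```python
# Each scenario: (default action, alternative action, fact that contradicts its null).
# The structure and strings are hypothetical, chosen to mirror the walkthrough.
scenarios = {
    "Scenario 1": ("go out", "stay in", "it's raining"),
    "Scenario 2": ("stay in", "go out", "Spencer Krug is on stage"),
}

facts = ["new sea slug discovered", "it's raining", "Spencer Krug is on stage"]


def decide(scenario, fact):
    """Switch to the alternative only if the fact contradicts this scenario's null."""
    default_action, alternative_action, contradicts_null = scenarios[scenario]
    return alternative_action if fact == contradicts_null else default_action


# Evaluate every (scenario, fact) pair.
results = {(s, f): decide(s, f) for s in scenarios for f in facts}

for (scenario, fact), action in results.items():
    print(f"{scenario} | {fact:<26} -> {action}")
```

Each fact flips at most one scenario, and the sea slug flips neither. Whether a fact counts as surprising depends on the null hypothesis it meets.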

If you’ve absorbed this, you’re ready to add some nuance — dive into “Statistical inference in one sentence” for a deeper example.

Chief Decision Scientist, Google. ❤️ Stats, ML/AI, data, puns, art, theatre, decision science. All views are my own. twitter.com/quaesita