Statistical Thinking

What is p-value short for?

Renaming that pesky little number and relearning how to use it

Cassie Kozyrkov
Towards Data Science
5 min read · Aug 21, 2020

Is p for probability?

Technically, p-value stands for probability value, but since all of statistics is about probabilistic decision-making, that’s probably the least useful name we could give it.

Instead, here are some more colorful candidate names for your amusement.

Painful value: They make you calculate it in class without explaining it to you properly; no wonder your brain is hurting. Honorable submissions in this category also include puzzling value, perplexing value, and punishing value.

Pesky value / problematic value: Statisticians are so tired of seeing ignoramuses abuse the p-value that some of them want to see it abolished. They wish they could shake people, yelling, “It’s a tool for personal decision-making, not that other thing you think it is!”

Persuasive value: As I’ll explain in a moment, trying to use a p-value to persuade someone is a dangerous bet that your victim is more ignorant than you are. If you’re going to appeal to p-values to spice up your message, may I recommend rewriting all your arguments in Latin while you’re at it?

Publishable value: Speaking of ways to abuse the p-value, if you’re one of those “scientists” who feel no remorse about torturing (“p-hacking”) your data until it confesses the kind of p-value you think will impress reviewers of an academic journal, you’re part of the problem and not the solution.

Pay value: If you think academia is the only place where your salary depends on your ability to cook up good-lookin’ p-values, think again!

Punchline value: Classical statistical inference boils down to asking “Does the evidence we collected make the null hypothesis look ridiculous?” The p-value is the punchline, summarizing the answer to this big testing question in one little number.

Plausibility value: The higher the p-value, the more plausible your evidence looks in a universe where we’re not totally nuts to stick to our default action. Notice that this is about the plausibility of your evidence in a particular kind of world… NOT the plausibility of that world itself!

Passivity value: The higher your p-value, the less reason you have to change your mind. Keep doing whatever you passively planned to do. To understand why, read on. (But bear in mind that a lack of evidence is not the same thing as evidence of a lack. A silent smoke alarm doesn’t always mean there’s no fire.)

If you prefer videos, here’s part 1: What is a p-value? It might make you think p might be short for “puppy”…

P is for Punchline!

Remember how we boiled statistical inference down to one sentence? It was:

Does the evidence we collected make our null hypothesis look ridiculous?

The p-value is the punchline to that question. It summarizes the answer in one little number. The lower the p-value, the more ridiculous the null hypothesis looks!

So, how do we turn the answer into a yes or a no? We simply set a threshold in advance to indicate what’s ridiculous enough to change our minds. The fancy name for that threshold is the significance level. If the p-value is below it, change your mind. If not, keep doing what you were happy to do by default.

What it *is* versus what it *does*

A wonderful thing about p-values is that they’re easy and relatively safe to use… if you picked the right test for your null hypothesis and assumptions. (That’s a big if!) But don’t forget that what you’ve just learned is what they do, not what they are.

Don’t make the mistake of trying to understand what they are in a pithy one-liner.

What they are is something weird: probability statements about samples in a specific imaginary universe. They’re most definitely not that straightforward thing you want them to be; they weren’t designed to be intuitive to interpret or pithy to describe. They’re made for reading off the output of a hypothesis test.

So, what are they? To see that, you’ll need to understand how we calculate them. I’ve written about that in my other articles, so I’ll stick to a summary here.

Summary: How do you *get* a p-value?

Calculating a p-value is a five-step process.

  1. Choose the default action.
  2. State the null hypothesis.
  3. State the assumptions about how the world described by that null hypothesis works.
  4. Make a model of that world (using equations or simulation) — this is the bulk of the work for statisticians.
  5. Find the probability that this world coughs up evidence at least as bad for the null hypothesis as what we’re seeing in our real-life data.
Part 2: How do you get a p-value?
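
To make the five steps concrete, here is a minimal sketch in Python. Everything specific in it is a made-up illustration rather than something from the article: the default action is to keep treating a coin as fair, the null hypothesis is that it lands heads no more than half the time, the assumption is that flips are independent, and the evidence is 15 heads in 20 flips. The function names `exact_p_value` and `simulated_p_value` are hypothetical. Step 4 is done twice, once with an equation and once by simulation, so you can see that both routes lead to roughly the same number.

```python
import random
from math import comb


def exact_p_value(n_flips: int, n_heads: int, p_heads_null: float = 0.5) -> float:
    """Equation route: probability that the null world produces at least
    n_heads heads in n_flips independent flips (a one-sided binomial tail)."""
    return sum(
        comb(n_flips, k) * p_heads_null**k * (1 - p_heads_null) ** (n_flips - k)
        for k in range(n_heads, n_flips + 1)
    )


def simulated_p_value(
    n_flips: int,
    n_heads: int,
    p_heads_null: float = 0.5,
    n_worlds: int = 20_000,
    seed: int = 0,
) -> float:
    """Simulation route: replay the null world many times and count how often
    it coughs up evidence at least as bad for the null as what we observed."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    at_least_as_bad = 0
    for _ in range(n_worlds):
        heads = sum(rng.random() < p_heads_null for _ in range(n_flips))
        if heads >= n_heads:
            at_least_as_bad += 1
    return at_least_as_bad / n_worlds


# Hypothetical evidence: 15 heads out of 20 flips.
print(f"exact:     {exact_p_value(20, 15):.4f}")
print(f"simulated: {simulated_p_value(20, 15):.4f}")
```

The exact route gives about 0.021 for this made-up evidence. The simulated value will wobble around that depending on the seed and the number of simulated worlds. Notice that both functions only ever look at the null world; neither says anything about how likely that world is to be the real one.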

Summary: How do you *use* a p-value?

  1. Compare it against the significance level.
  2. Change your mind if the p-value is below the significance level. Otherwise, just keep doing what you were going to do if you had never analyzed any data.
How to use p-values to get the outcome of your hypothesis test. (No one will suspect that my xkcd is a knockoff. The lack of humor won’t tip anyone off.)
Part 3: How do you use a p-value?
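
The two-step recipe is small enough to sketch as a single function. This is an illustration only: the name `decide`, the returned strings, and the example numbers are hypothetical. It reads "below" as strictly less than, as the wording above suggests, whereas some textbooks use "less than or equal to". The significance level is passed in as an argument to stress that you pick it before you look at the p-value.

```python
def decide(p_value: float, significance_level: float) -> str:
    """Turn a p-value into a yes/no decision using a threshold set in advance."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < significance_level < 1.0:
        raise ValueError("significance_level must be strictly between 0 and 1")
    if p_value < significance_level:
        # The evidence makes the null hypothesis look ridiculous enough.
        return "change your mind"
    # Not enough reason to abandon the default action.
    return "keep doing the default action"


# Hypothetical numbers: a significance level of 0.05 chosen up front.
print(decide(p_value=0.02, significance_level=0.05))
print(decide(p_value=0.30, significance_level=0.05))
```

The first call returns "change your mind" and the second returns "keep doing the default action". The second result does not say the default action is right, only that the evidence gave you no strong reason to drop it.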

Summary: Short explanation

A p-value asks, “If I’m living in a world where I should be taking my default action, how unsurprising is my evidence?” The higher the p-value, the less ridiculous I’ll feel about persisting with my planned action. If the p-value is low enough, I’ll change my mind and do something else.

Polemical value / polarizing value: If you want to learn about the p-value controversy and read my take on all the emotions the p-value causes, check out the next article in this series: Why are p-values like needles?

Part 4: Check your understanding with this summary!

The safest way to use a p-value

In order to interpret a p-value, you must know every detail about the assumptions and null hypothesis. If that info’s not available to you, the only valid interpretation of a low p-value is: “Someone was surprised by something.” Let’s all meditate on how little that tells you if you don’t know much about the someone or the something in question.

Interpret a low p-value as: “Someone was surprised by something.”

Trying to use a p-value to persuade someone is a dangerous bet that your victim is more ignorant than you are. Those who understand what it is might not appreciate your attempt at insulting their intelligence.

Thanks for reading! How about an AI course?

If you had fun here and you’re looking for an applied AI course designed to be fun for beginners and experts alike, here’s one I made for your amusement:

Enjoy the entire course playlist here: bit.ly/machinefriend

Connect with Cassie Kozyrkov

Let’s be friends! You can find me on Twitter, YouTube, Substack, and LinkedIn. Interested in having me speak at your event? Use this form to get in touch.

Chief Decision Scientist, Google. ❤️ Stats, ML/AI, data, puns, art, theatre, decision science. All views are my own. twitter.com/quaesita