
What is the Negative Binomial Distribution?

A dive into one of the lesser-known probability distributions

Photo by Alperen Yazgı on Unsplash

Background

Perhaps you have heard of the binomial distribution, but have you heard of its cousin the negative binomial distribution? This discrete probability distribution is applied in numerous industries such as insurance and manufacturing (mainly count-based data), hence is a useful concept for Data Scientists to understand. In this article, we will dive into this distribution and what problems it can solve.


To understand the negative binomial distribution, it's important to first gain intuition about the binomial distribution.

The binomial distribution gives the probability of observing a certain number of successes, x, in a given number of trials, n. The trials in this case are Bernoulli trials, where every outcome is binary (success or failure). If you are unfamiliar with the binomial distribution, check out my previous post on it here:

Decoding the Binomial Distribution: A Fundamental Concept for Data Scientists

The negative binomial distribution flips this and models the number of trials, x, needed to reach a certain number of successes, r. It is known as 'negative' because it is implicitly modelling the number of failures that occur before the required number of successes.

A better way of thinking about the negative binomial distribution is:

Probability of the r-th success happening on the x-th trial
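To make this framing concrete, here is a minimal simulation sketch (the values p = 0.5 and r = 3 are my own illustrative choices, not from the article) that counts how many Bernoulli trials it takes to reach the r-th success and estimates the probability of that happening on a particular trial:

```python
import numpy as np

rng = np.random.default_rng(42)

def trials_until_r_successes(p, r):
    """Run Bernoulli(p) trials until the r-th success and return the trial count."""
    trials, successes = 0, 0
    while successes < r:
        trials += 1
        successes += rng.random() < p
    return trials

p, r = 0.5, 3  # illustrative values only
samples = np.array([trials_until_r_successes(p, r) for _ in range(100_000)])

# Empirical estimate of P(the 3rd success lands exactly on the 5th trial)
print((samples == 5).mean())  # ≈ 0.1875 for p = 0.5, r = 3
```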

A special case of the negative binomial distribution is the geometric distribution. This models the number of trials needed before we get our first success. You can read more about the geometric distribution here:

Geometric Distribution Simply Explained
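A quick numerical check of this relationship (note that scipy's nbinom is parameterised by the number of failures before the r-th success rather than the total number of trials, so we shift by one):

```python
from scipy.stats import geom, nbinom

p = 1 / 6  # e.g. the probability of rolling a 4 on a fair die

# P(first success on trial x): the geometric PMF matches the negative binomial with r = 1
for x in range(1, 6):
    print(x, geom.pmf(x, p), nbinom.pmf(x - 1, 1, p))  # the last two columns agree
```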

Key Assumptions

The following are the main assumptions of the data for the negative binomial distribution:

  • Two outcomes per trial (Bernoulli)
  • Each trial is independent
  • The probability of success is constant

Formula & Derivation

Let’s say we have:

  • p: the probability of success
  • 1-p: the probability of failure
  • x: the number of trials needed to obtain r successes
  • r: the required number of successes

Consequently, if the r-th success occurs on the x-th trial, we must have had r-1 successes in the first x-1 trials, and the probability of this is given by the binomial distribution's probability mass function (PMF):

$$P(r-1 \text{ successes in } x-1 \text{ trials}) = \binom{x-1}{r-1}\, p^{\,r-1}\, (1-p)^{\,x-r}$$

The next piece of information we have is that the r-th success must occur on the x-th trial, and this has probability p. Therefore, we simply multiply the above formula by p:

$$P(X = x) = \binom{x-1}{r-1}\, p^{\,r}\, (1-p)^{\,x-r}$$

That’s the negative binomial distribution’s PMF!
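As a sanity check on the derivation, the PMF above can be compared with scipy.stats.nbinom. Keep in mind that scipy counts the number of failures before the r-th success rather than the total number of trials, so we evaluate it at x − r (the values r = 2, p = 1/6 simply mirror the dice example used later):

```python
from math import comb
from scipy.stats import nbinom

def neg_binomial_pmf(x, r, p):
    """P(the r-th success occurs on trial x), as derived above."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

r, p = 2, 1 / 6
for x in range(2, 10):
    # scipy's nbinom is indexed by the number of failures, x - r
    assert abs(neg_binomial_pmf(x, r, p) - nbinom.pmf(x - r, r, p)) < 1e-12
```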

The mean of the distribution can be shown to be:

$$E[X] = \frac{r}{p}$$

Derivation of the mean and standard deviation can be found here.
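The mean is also easy to verify empirically. Here is a minimal sketch with numpy (the values r = 3, p = 0.2 are my own illustrative choices); note that numpy's sampler returns the number of failures before the r-th success, so we add r to convert to the number of trials:

```python
import numpy as np

rng = np.random.default_rng(0)
r, p = 3, 0.2  # illustrative values only

# numpy samples the number of failures before the r-th success; add r for total trials
trials = rng.negative_binomial(r, p, size=200_000) + r

print(trials.mean())  # empirical mean, close to...
print(r / p)          # ...the theoretical mean of 15
```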

Example Problem

Suppose we roll a fair six-sided die: what is the probability of rolling our second 4 on the 6th roll?

  • p = 1/6
  • r = 2
  • x = 6

Inputting these into the above PMF leads to:

$$P(X = 6) = \binom{5}{1} \left(\frac{1}{6}\right)^{2} \left(\frac{5}{6}\right)^{4} \approx 0.067$$

So, at roughly 6.7%, it is quite unlikely that we will get our second 4 on the 6th roll. You can also try out some of your own calculations with this negative binomial calculator.
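For reference, the same figure can be reproduced directly from the PMF with a couple of lines of Python:

```python
from math import comb

p, r, x = 1 / 6, 2, 6

prob = comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)
print(round(prob, 4))  # 0.067, i.e. roughly a 6.7% chance
```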

What if we want to know the probability of rolling our second 4 on other rolls? To do this, we plot the PMF of obtaining the second 4 as a function of the number of rolls, x:

import plotly.graph_objects as go
from math import comb

# Parameters
r = 2
p = 1 / 6

# PMF
def neg_binomial_pmf(x, r, p):
    if x < r:
        return 0
    q = 1 - p
    return comb(x - 1, r - 1) * (p ** r) * (q ** (x - r))

# Values
x = list(range(1, 30))
probs = [neg_binomial_pmf(k, r, p) for k in x]

# Plot
fig = go.Figure(data=[go.Bar(x=x, y=probs, marker_color='rgb(176, 224, 230)')])
fig.update_layout(title="Negative Binomial Distribution",
                  xaxis_title="x (number of trials to get second 4)",
                  yaxis_title="Probability",
                  template="simple_white",
                  font=dict(size=16),
                  title_x=0.5,
                  width=700,
                  height=500)
fig.show()
Plot generated by author in Python.

We see that the most likely rolls on which to obtain our second 4 are rolls 6 and 7 (both have the same probability). However, the expected value is 12 (2/(1/6)), which can be derived from the mean formula we showed earlier.
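Both of these claims are easy to check numerically. Here is a small self-contained sketch using the dice example's values:

```python
from math import comb

r, p = 2, 1 / 6

def pmf(x):
    """P(the second 4 occurs on roll x)."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

print(pmf(6), pmf(7))                           # equal, ≈ 0.067 each
print(sum(x * pmf(x) for x in range(2, 1000)))  # ≈ 12, matching r / p
```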

Applications in Data Science

Below is a list of areas where the negative binomial distribution is used:

  • Time until an event: This is useful for churn models, where we want to predict when a customer may cancel their subscription. If we know who is likely to churn and when, we can apply specialised retention strategies to try and keep the customer.
  • Defect prediction: Predicting the number of defects found in a manufactured product before it becomes fully functional. You can think of this as how many versions of the product we need to make before we reach the final design.
  • Sports Analytics: There are several examples, such as predicting how many missed shots a footballer will take before scoring a goal. This is useful for betting companies when producing their odds.
  • Marketing: Determining how many advertisements to show a customer before they convert to a subscription or click through on the website. This is essentially modelling conversion rates (see the sketch after this list).
  • Ecology: Estimating the abundance of endangered species and how the environment is affecting their numbers.
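To illustrate the marketing use case, here is a hedged sketch (the 5% per-ad conversion probability is an invented figure for illustration) using the r = 1 special case, i.e. the first conversion:

```python
from scipy.stats import nbinom

p_convert = 0.05  # hypothetical per-ad conversion probability (invented for illustration)
r = 1             # we only care about the first conversion

# P(the first conversion happens within the first 20 ads); scipy counts failures, so use 20 - r
print(nbinom.cdf(20 - r, r, p_convert))  # ≈ 0.64

# Expected number of ads needed for a conversion: r / p
print(r / p_convert)  # 20 ads on average
```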

Summary & Further Thoughts

The negative binomial distribution models the number of trials (equivalently, the number of failures) it takes to reach a certain number of successes. This has applications in many areas of Data Science, the most notable being churn prediction. Hence, it is a useful topic for Data Scientists to understand.

The full code is available on my GitHub here:

Medium-Articles/Statistics/Distributions/negative_binomial.py at main · egorhowell/Medium-Articles

References & Further Reading

Another Thing!

I have a free newsletter, Dishing the Data, where I share weekly tips for becoming a better Data Scientist. There is no "fluff" or "clickbait," just pure actionable insights from a practicing Data Scientist.

Dishing The Data | Egor Howell | Substack

Connect With Me!

