Hands-on Tutorials

Poisson Process and Poisson Distribution in Real-Life: Modeling Peak Times at an Ice Cream Shop

Carolina Bento
Towards Data Science
12 min read · Apr 27, 2021

Image by author.

Several phenomena in the real world can be represented as counts of things. For example, the number of flights departing from an airport, the number of customers lining up at the store register, or the number of earthquakes occurring in a year in a specific region.

Counting events is a relatively simple task, but if you want to go from just counting the occurrence of events to asking how likely these events are to happen in a specific unit of time, you need more powerful tools like the Poisson distribution.

The Poisson distribution models the probability that a given number of events, from a discrete random variable, occur in a specific time interval.

Named after the prolific mathematician Siméon Denis Poisson, the Poisson distribution is a discrete probability distribution. It models the probability that a number of events, from a discrete random variable, occur in a specific time interval.

Siméon Denis Poisson (Image Credit)

Probability Distribution of a Discrete Random Variable

A discrete random variable describes an event that has a specific set of values[1].

For instance, the discrete random variable that represents tossing a fair coin can only have the values heads or tails. The discrete random variable that represents picking a card from a deck of cards can only have 52 possible values, 2 of Hearts, 9 of Clubs, Queen of Diamonds, Ace of Spades, and so on.

The probability distribution of a discrete random variable is called the Probability Mass Function (PMF). It’s a function that maps each value the random variable can take to its corresponding probability.

For example, the probability mass function of a random variable that follows a Poisson distribution looks something like this.

Example of the Probability Mass Function of a random variable that follows a Poisson Distribution.

The probability mass function has three fundamental conditions:

  • All probabilities are non-negative.
  • The probability of each event must be between 0 and 1. The probability of tossing a coin and getting heads must have a value between 0 and 1.
  • The sum of probabilities of all possible values must be equal to 1. The sum of the probability of tossing a coin and getting heads and the probability of tossing a coin and getting tails must be equal to 1.
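
The conditions above can be checked mechanically. The sketch below is a minimal illustration, assuming a PMF stored as a plain dictionary that maps each value to its probability. The helper name is_valid_pmf and the broken_coin example are hypothetical, made up for this illustration.

```python
import math


def is_valid_pmf(pmf):
    """Check the PMF conditions for a dict of {value: probability}."""
    probabilities = list(pmf.values())
    # every probability must lie between 0 and 1 (so it is also non-negative)
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    # the probabilities of all possible values must add up to 1
    sums_to_one = math.isclose(sum(probabilities), 1.0)
    return each_in_range and sums_to_one


fair_coin = {"heads": 0.5, "tails": 0.5}
broken_coin = {"heads": 0.7, "tails": 0.5}  # hypothetical invalid example

print(is_valid_pmf(fair_coin))
print(is_valid_pmf(broken_coin))
```

The tolerance-based comparison with math.isclose is a design choice: sums of floating-point probabilities, such as six times 1/6, are not always exactly 1.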

Poisson distribution and Machine Learning

In Machine Learning, the Poisson distribution is used in probabilistic models. For example, in a Generalized Linear Model you can use the Poisson distribution to model the distribution of the target variable.

In Machine Learning, if the response variable represents a count, you can use the Poisson distribution to model it.
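
To make this concrete, here is a rough sketch of a Poisson regression, the Generalized Linear Model with a Poisson target and a log link. Everything in it is an assumption for illustration: the data is simulated, the coefficients in true_beta are made up, and the model is fitted with a few hand-written Newton steps rather than with a dedicated library.

```python
import numpy as np

# Hypothetical data: counts whose mean depends on a single feature x
rng = np.random.default_rng(seed=0)
n_samples = 2000
x = rng.uniform(0, 1, size=n_samples)
true_beta = np.array([0.5, 1.2])            # assumed intercept and slope
X = np.column_stack([np.ones(n_samples), x])
y = rng.poisson(np.exp(X @ true_beta))      # Poisson-distributed target

# Fit log(E[y]) = X @ beta by maximizing the Poisson log-likelihood
beta = np.array([np.log(y.mean()), 0.0])    # start from an intercept-only guess
for _ in range(25):
    mu = np.exp(X @ beta)                   # predicted mean count per sample
    gradient = X.T @ (y - mu)               # gradient of the log-likelihood
    fisher_info = X.T @ (X * mu[:, None])   # negative Hessian of the log-likelihood
    beta = beta + np.linalg.solve(fisher_info, gradient)  # Newton step

print("estimated coefficients:", np.round(beta, 2))
```

With enough samples the estimated coefficients should land close to the assumed ones, which is what makes the Poisson distribution a natural choice when the response variable is a count.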

In real-world applications, these models are used to predict or simulate complex systems, like extreme weather events[2] or the cascades of Twitter messages and Wikipedia revision history[3].

Modeling peak times at your friend’s ice cream shop

Your long-time friend Jenny has an ice cream shop downtown in her city.

In multiple situations she has told you that one thing she’s always paying attention to is how to staff the shop. Jenny wants to make sure every customer has a minimal wait time and there’s always someone to help them, so the customer experience is the best they can provide.

But, at times, that hasn’t been the case. Jenny has learned the hard way that when there are more than 10 customers at the store, there’s not enough staff to help them, and some customers end up leaving, frustrated with the long wait and lack of assistance.

You’re a Data Scientist, and very good friends with Jenny, so you’re the first person she has turned to for help. Ultimately, Jenny wants you to help her figure out how many customers she should expect at her shop in any given hour.

After thinking about it for a while, you decide to reframe her question so it’s more in line with what Jenny really wants to know: how likely is it that 10 customers will be at the shop at the same time, in any given hour?

Reframing this as a probability problem, you define a random variable that is Customer arriving at Jenny’s ice cream shop. This immediately makes you think about modeling the problem with the Binomial Distribution.

Binomial distribution

The Binomial Distribution describes the number of successes in a sequence of Bernoulli trials. So if you think about a customer entering the shop as a success, this distribution sounds like a viable option.

But before you can model the random variable Customer arriving at Jenny’s ice cream shop you need to know the parameters of the distribution.

The Binomial distribution has two parameters:

  • n: the total number of Bernoulli trials.
  • p: the probability of success in a Bernoulli trial.

To answer the question how likely is it that 10 customers will be at the shop at the same time, in any given hour, you need to use the Binomial distribution’s probability mass function. It looks something like this:

Probability Mass Function of the Binomial Distribution.
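
As a minimal sketch, the standard Binomial probability mass function can be written directly with Python’s math module. The function name binomial_pmf and the coin-toss numbers are assumptions chosen for illustration; k is the number of successes, n the number of trials and p the probability of success.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical small example: exactly 3 heads in 10 tosses of a fair coin
print(binomial_pmf(k=3, n=10, p=0.5))  # prints 0.1171875
```

The term C(n, k) counts the ways to place k successes among n trials, and the two powers give the probability of any one such arrangement.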

So far you only have the parameter k, the total number of customers.

But you remember Jenny told you about the series of studies the business district ran last year. In one of these studies they found that, on a regular business day, about 7,500 people walk by downtown and there’s a 10% chance a passerby enters one of the 15 downtown shops.

This is exactly the information you needed!

In this case, each downtown passerby represents a Bernoulli trial where success means entering a shop. The total number of people that walk by downtown corresponds to n, and each passerby has the same probability p of entering a shop, 10% according to the study.

Going back to the question how likely is it that 10 customers will be at Jenny’s shop at the same time, you just need to plug the parameters into the Binomial probability mass function.

Doing these calculations by hand is challenging. You can use Python’s SciPy module to do all the heavy lifting.

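from scipy.stats import binom

k = 10    # number of successes: customers entering
n = 7500  # number of trials: people walking by downtown
p = 0.10  # probability of success in each trial
print("Probability of having 10 customers at the shop")
print(binom.pmf(k, n, p))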

This is a very small probability and, in fact, it’s not exactly what Jenny is looking for.

The important detail is that Jenny wants to know the probability of having 10 customers at the store at the same time.

The Binomial distribution doesn’t model events that occur at the same time. Instead, the successes occur in a sequence of n trials. So, in the end, the Binomial distribution is not the best to model this problem.

Thinking through this limitation of the Binomial distribution and what tools you can use to answer Jenny’s question, you remember the Poisson Paradigm, also called the Poisson Approximation.

Poisson Paradigm

The Poisson paradigm states something like this:

When you have a large number of events with a small probability of occurrence, then the distribution of the number of events that occur in a fixed time interval approximately follows a Poisson distribution.

Mathematically speaking, when n tends to infinity (n → ∞) and the probability p tends to zero (p → 0), while the product np stays fixed, the Binomial distribution can be approximated by a Poisson distribution with rate lambda = np.

This approximation assumes that events are independent or weakly dependent. For instance, if events are independent, knowing that Adam entered the shop doesn’t give you any information about Andrea entering the shop as well.

But, in the real world, some events are most likely not completely independent. If Adam and Andrea enter the store, that can give you some information about Bianca entering the store as well. These events are not independent; they are weakly dependent.

As long as events are independent or weakly dependent, this assumption holds and you can approximate the Binomial with a Poisson distribution.
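
A quick numerical check makes the approximation tangible. The sketch below assumes an expected count of 5 that is held fixed while n grows and p shrinks; the specific values of n and the choice of k = 10 are illustrative, not taken from the study.

```python
from scipy.stats import binom, poisson

rate = 5   # n * p is kept fixed at this value (an illustrative choice)
k = 10     # number of events whose probability is compared

gaps = []
for n in (10, 100, 1000, 100000):
    p = rate / n  # p shrinks as n grows, so n * p stays equal to rate
    gap = abs(binom.pmf(k, n, p) - poisson.pmf(k, rate))
    gaps.append(gap)
    print(n, p, gap)
```

The gap between the two probabilities shrinks as n increases, which is the behavior the Poisson paradigm describes.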

Poisson distribution

Knowing about the Poisson Paradigm makes you more confident about using the Poisson distribution to model the number of customers entering Jenny’s shop.

The Poisson distribution describes the probability of a number of independent events that occur at a specific rate and within a fixed time interval. Unlike the Binomial, it only has one parameter, lambda, the rate at which the event occurs.

Since it’s all about events that occur at a specific rate, the probability mass function looks something like this:

Probability Mass Function of the Poisson Distribution.
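
In code, the standard Poisson probability mass function is short enough to write by hand. This is a minimal sketch: the function name poisson_pmf_formula and the rate of 2 are assumptions for illustration, with lam standing for lambda and k for the number of events.

```python
import math


def poisson_pmf_formula(k, lam):
    """P(X = k) = lam^k * exp(-lam) / k! for a rate lam and a count k."""
    return lam**k * math.exp(-lam) / math.factorial(k)


# Hypothetical example: a rate of 2 events per time interval
for k in range(5):
    print(k, round(poisson_pmf_formula(k, lam=2), 4))
```

Only the rate appears as a parameter, in contrast with the two parameters n and p of the Binomial.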

So, to answer the question What is the likelihood that 10 customers will be at Jenny’s shop at the same time? the last piece you need is the rate at which customers enter the store.

You don’t have that specific data point for Jenny’s store but, from the study the business association did, 10% of the 7,500 people passing by downtown in a given day entered a store.

Using all the data you have, you can say that 10% of those 7,500 passersby enter the 15 downtown shops during the 10 hours they are open. Assuming those customers are spread evenly across shops and hours, you can calculate lambda and determine that approximately 5 customers per hour enter Jenny’s shop, i.e., on average one customer entering every 12 minutes.

Calculating lambda, the rate at which customers enter a downtown shop.
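
The arithmetic behind lambda can be spelled out step by step. The sketch below uses the figures from the study and assumes, as the calculation implicitly does, that customers are split evenly across the 15 shops and across the 10 opening hours. The variable names are made up for this illustration.

```python
passersby_per_day = 7500   # people walking by downtown on a business day
p_enter_shop = 0.10        # chance that a passerby enters one of the shops
n_shops = 15               # downtown shops
hours_open = 10            # opening hours per day

customers_per_day = passersby_per_day * p_enter_shop       # all shops combined
customers_per_shop_per_day = customers_per_day / n_shops   # assumes an even split
lambda_per_hour = customers_per_shop_per_day / hours_open  # assumes a steady flow

minutes_between_customers = 60 / lambda_per_hour

print(lambda_per_hour)
print(minutes_between_customers)
```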

To answer Jenny’s question, you can plug the parameter lambda into the Poisson probability mass function.

There’s a 1.8% chance that 10 customers will be at Jenny’s store at the same time in any given hour. That’s a relatively low value, compared to what Jenny was thinking!
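
Since Jenny’s staffing trouble starts when there are more than 10 customers, it can also help to look at the tail of the distribution and not only at the probability of exactly 10. The sketch below is one way to do that with SciPy, using the rate of 5 customers per hour; treating "more than 10" as the quantity of interest is an assumption about what Jenny cares about.

```python
from scipy.stats import poisson

rate = 5        # customers per hour
threshold = 10  # Jenny struggles with more than 10 customers

exactly_10 = poisson.pmf(threshold, mu=rate)      # P(X = 10)
more_than_10 = poisson.sf(threshold, mu=rate)     # P(X > 10) = 1 - P(X <= 10)
at_least_10 = poisson.sf(threshold - 1, mu=rate)  # P(X >= 10)

print(round(exactly_10, 4), round(more_than_10, 4), round(at_least_10, 4))
```

The survival function sf returns the probability of strictly exceeding its first argument, which is why the threshold is shifted by one to get "10 or more".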

Plotting the probability mass function you also see the probability of having up to 10 customers at the same time at the shop.

Probability mass function of a Poisson distribution with a rate of 5, observing up to 10 events.

With the current rate of downtown customers entering a shop, Jenny can be prepared to have 4 or 5 customers at the shop, most of the time.
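
The claim about 4 or 5 customers can be checked numerically. This is a small sketch, assuming the same rate of 5 customers per hour, that compares the two most likely counts and adds the probability of seeing at most 5 customers.

```python
from scipy.stats import poisson

rate = 5  # customers per hour

p_four = poisson.pmf(4, mu=rate)          # P(X = 4)
p_five = poisson.pmf(5, mu=rate)          # P(X = 5)
p_at_most_five = poisson.cdf(5, mu=rate)  # P(X <= 5)

print(round(p_four, 4), round(p_five, 4), round(p_at_most_five, 4))
```

When the rate is a whole number, the counts lambda - 1 and lambda are equally likely, which is why 4 and 5 share the top spot.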

Here’s how you calculate and plot the Poisson probability mass function with Python’s SciPy module.

from scipy.stats import poisson
import numpy as np
import matplotlib.pyplot as plt


def poisson_pmf(k, lambda_val):
    '''
    Calculates and plots the Poisson probability mass function (PMF)
    :param k: Number of events to occur at the same time
    :param lambda_val: Lambda value, the rate at which the events occur
    :return:
    - Print out the probability that k events occur at the same time with a rate lambda value
    - Plot the PMF from 0 to k occurrences
    '''
    # the PMF is only defined for whole-number counts, so x must be integers
    x = np.arange(0, k + 1)
    pmf = poisson.pmf(k=x, mu=lambda_val)
    print("Poisson:: Probability of having " + str(k) + " customers at the shop")
    print(np.round(poisson.pmf(k=k, mu=lambda_val), 3))

    # plotting the PMF
    fig, ax = plt.subplots(1, 1, figsize=(12, 9))

    # removing the top and right borders
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)

    # probabilities are plotted as percentages
    plt.bar(x, pmf * 100, color='#0F72AC')
    plt.xlabel('Number of customers at the shop', fontsize=12, labelpad=20)
    plt.ylabel('P(X=k) | Probability of k occurrences (%)', fontsize=12, labelpad=20)
    plt.show()


poisson_pmf(k=10, lambda_val=5)

Jenny was really excited to know there was only about a 2% chance the store gets so crowded they have a hard time providing quality service.

But she was also a bit worried.

There’s a 3-day weekend coming up, and Jenny knows that, on days like these, she’s of better help in the kitchen. That way she can guarantee there’s not a shortage of ice cream, while the rest of the staff takes care of the storefront.

When the kitchen is really busy, Jenny only gets to check the storefront every hour. So she asks for your help again, to figure out what is the probability of having 10 customers at the store on that day.

Poisson process

Wait, what? Didn’t I answer this question already? you might think. Not quite.

Thinking about it a bit more, you realize there’s a new component here: Jenny will be checking the storefront at specific time intervals throughout the day.

While the probability mass function of the Poisson distribution provided you with the probability of having 10 customers at the shop at the same time, the time interval was fixed. You were looking at one given hour of the day, because that’s what the rate lambda gave you.

Now Jenny is going to check on the storefront multiple times throughout the day. So you need a tool that still counts events, i.e., customers entering the store, but in a continuous time frame.

You need to redefine Jenny’s question as a Poisson process.

The Poisson process is a stochastic process with independent time increments, where the number of events occurring in a time interval is modeled by a Poisson distribution, and the time between the occurrence of each event follows an exponential distribution[2].

In practice the Poisson process describes the occurrence of an infinite number of independently and identically distributed events, each described by a random variable that follows a Poisson distribution[4].

These events, typically referred to as arrivals, can occur at arbitrary times so the probability that the event occurs in a specific point in time is zero. That’s why the Poisson process focuses on the time between events or arrivals, the interarrival time.

Arrivals of four different random variables during five units of time.

To recap, the Poisson process is a counting process with:

  • A non-negative count of arrivals, or events, occurring in the time interval t.
  • Non-negative rate lambda, like the Poisson distribution.
  • Independent time increments.
  • Interarrival times that follow an exponential distribution.
  • Memorylessness, so the time to the next arrival is independent of previous arrivals. This means the next arrival doesn’t know anything about what happened in the past.
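
These properties suggest a simple way to simulate a Poisson process: keep drawing exponential interarrival times and count how many arrivals fit in the time interval. The sketch below is a hedged illustration using Jenny’s rate of 5 customers per hour over a 10-hour day; the helper count_arrivals and the number of simulated days are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
rate = 5       # lambda: arrivals per hour
t = 10         # hours observed
n_days = 2000  # assumed number of simulated days


def count_arrivals(rng, rate, t):
    """Count arrivals in [0, t] by adding up exponential interarrival times."""
    elapsed, arrivals = 0.0, 0
    while True:
        # time until the next arrival, with mean 1 / rate
        elapsed += rng.exponential(scale=1 / rate)
        if elapsed > t:
            return arrivals
        arrivals += 1


counts = np.array([count_arrivals(rng, rate, t) for _ in range(n_days)])
print("mean arrivals:", counts.mean())
print("variance of arrivals:", counts.var())
```

Both the mean and the variance of the simulated counts should land near lambda times t, here 50, as expected for counts that follow a Poisson distribution.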

Back to Jenny’s question!

You’re going to use a probability mass function that is very similar to the PMF of the Poisson distribution.

In a Poisson process, the expected number of arrivals combines both the rate lambda and the time interval you are interested in, because you are interested in the events that occur in continuous time.

You also have to take into account the interarrival times. Even though two events can’t occur simultaneously, they can occur at arbitrary times, within the same time interval.

Probability mass function of a Poisson process.
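
Written out, the standard form of this probability mass function is as follows, where N(t) is notation assumed here for the number of arrivals in a time interval of length t, lambda is the rate, and k is the number of arrivals:

```latex
P\big(N(t) = k\big) = \frac{(\lambda t)^{k} \, e^{-\lambda t}}{k!}, \qquad k = 0, 1, 2, \dots
```

The only change from the Poisson distribution is that the rate lambda is replaced by the expected number of arrivals, lambda times t.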

When you are looking at just any given hour, the smallest unit of time in this case, the probability mass function of the Poisson process is equivalent to the probability mass function of the Poisson distribution, given that the total time interval t is equal to 1.

But you want to calculate the probability for the entire day. Knowing that Jenny is going to check in on the storefront every hour, the total time interval t is equal to 10.
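
The effect of the time interval can be seen by reusing SciPy’s Poisson PMF with the expected number of arrivals, lambda times t, as its mean. This is a sketch under the assumption that the rate of 5 customers per hour holds throughout the day; the dictionary name probabilities is made up for this example.

```python
from scipy.stats import poisson

rate = 5  # customers per hour
k = 10    # number of arrivals of interest

probabilities = {}
for t in (1, 10):                 # one hour, then the full 10-hour day
    expected_arrivals = rate * t  # lambda * t
    probabilities[t] = poisson.pmf(k, mu=expected_arrivals)
    print(t, expected_arrivals, probabilities[t])
```

With t equal to 1 the result matches the Poisson distribution for a single hour, while with t equal to 10 the expected number of arrivals grows to 50 and seeing exactly 10 arrivals over the whole day becomes far less likely.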

These calculations are too error prone to do by hand. So you can turn to Python again and code the probability mass function for the Poisson process.

import math


def poisson_process(lambda_value, k, t):
    '''
    Calculates the probability mass function of a Poisson process
    :param lambda_value: Lambda value, the rate at which the events occur
    :param k: Number of arrivals to occur at the same time
    :param t: time interval to observe arrivals
    :return:
    - Print out the probability that k arrivals occur at the same time with a rate lambda value during time t
    '''
    # expected number of arrivals during the whole time interval
    expected_arrivals = lambda_value * t
    numerator = math.pow(expected_arrivals, k) * math.exp(-expected_arrivals)
    denominator = math.factorial(k)

    print("Poisson process with\n\tlambda=" + str(lambda_value) + ", " + str(k) + " arrivals, during time interval of " + str(t) + " hours")
    print(numerator / denominator)


poisson_process(lambda_value=5, k=10, t=10)

Thanks to your help, Jenny is much more confident that customers visiting her shop during that 3-day weekend will get the best experience Jenny and team can offer!

The probability of having 10 customers entering the shop at the same time during the 10-hour period they are open is very small!

Now you know how to model real world systems and phenomena that are based on event counts!

With the Poisson distribution you calculated the probability of events occurring in a discrete, as in fixed, time interval. Then you expanded that to a continuous time frame, like the duration of a day, with the Poisson process.

Hope you enjoyed learning how the Poisson distribution and the Poisson process are applied in real life scenarios.

Thanks for reading!

References

[1] Probability Distributions for Discrete Random Variables (Shafer and Zhang) 2021. January 10, 2021

[2] Clementine Dalelane, Thomas Deutschländer, A robust estimator for the intensity of the Poisson point process of extreme weather events, Weather and Climate Extremes, Volume 1, 2013, Pages 69–76

[3] Simma, Aleksandr & Jordan, Michael. (2010). Modeling Events with Cascades of Poisson Processes. Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010. 546–555.

[4] Bertsekas, Dimitri, and John Tsitsiklis. Introduction to Probability. 2nd ed. Athena Scientific, 2008

Images by author except where stated otherwise.
