Servitization and Queueing Theory: Deriving M/M/1 Model

Edward Elson Kosasih
Towards Data Science
7 min readFeb 1, 2020

--

Notice what’s common among the two? Both are queues! Image sources: https://www.flickr.com/photos/psit/5605605412 and https://www.geograph.org.uk/photo/5831116

Servitization is a phenomena where manufacturing firms shift from selling pure products to offering solutions (services) instead. Neely (2013) provides a brief introduction on how companies across industries are adopting this business model.

What’s interesting is that servitization also results in a less clear boundary between manufacturing and service firms (read Vandermerwe and Rada, 1988 for a good review). In fact, we can see that some aspects of manufacturing and services can be abstracted into similar mathematical concepts. For example, compare an assembly line with a group of people lining up to buy food. Notice that both are basically queues and are typically characterised by similar measures (throughput or how fast does the seller serve customers, queue length or how many people are waiting, etc…).

I recently learned that there’s a branch in mathematics that studies this exact problem, called the queueing theory. According to the theory, any queue can be modeled by these 6 parameters: A | B | c | D | E | F. A is arrival process, B is server process , c is number of servers, D is queue capacity, E is population size and F is queueing discipline. In this blog post, we’ll focus on the simplest queue of type M|M|1|∞|∞|FIFO or usually abbreviated as M|M|1.

M|M|1 Model

Imagine a queue with infinite capacity () i.e. there’s no limit on how long can the waiting line be. Assume that the population size is also infinite (∞) i.e. the number of potential customers is limitless. Clients are served on a First-In-First-Out (FIFO) basis. Let‘s assume that we’re operating a line with only one server (1) e.g. think of a basic food truck with 1 queue line. We can then model the customer arrival rate with a Poisson process of rate λ (M). We also assume that our service time is exponentially distributed, hence can be modelled with a Poisson process of rate μ (M) as well.

We represent this process as a Markov Chain. The state (node) represents how many items are at the queue, while transition rate shows the probability of receiving/serving a customer. For example S2 means that there are 2 items in the queue on that state. Since there’s only 1 server, items can only be processed sequentially, 1-by-1. Arrival rate of λ means that on average, in 1 time unit, λ items will arrive to the queue. Service rate of μ means that on average, in 1 time unit, μ items will be served and hence removed from the queue.

M/M/1 Queue Markov Chain

Suppose that arrival and service happens synchronously, that is we move from one state to another in the same time window of h, the Markov transition probability can be written this way.

Our goal is to calculate the probability of being in any state Sn in the long run based on just λ and μ (doesn’t matter what the initial state is).

To do that, we have to first marginalise P(Sn) for n = 0. We can marginalise P(Sn) in 2 ways, either with respect to previous states or subsequent states. Notice that S0 has 1 previous state (S1) and 1 next state (S1) as well. This is true only in the long run.

After substituting and rearranging terms, we obtain a relationship between P(S1) and P(S0). We now perform the same marginalisation on P(Sn) for other n > 0. In this case, Sn has 2 previous states (Sn-1, Sn+1) and 2 next states (Sn-1, Sn+1) as well.

We now have an equation relating P(Sn+1), P(Sn) and P(Sn-1). We can calculate this value for n=2. Substituting P(S1) with the result that we get previously results in the following equation.

Now we can do the same substitution and rearrangement for higher value of n, resulting in the following iterative equation (more rigorous proof is needed; we’ll skip that in this post).

We’re almost there! The only problem is we still have an unknown term of P(S0). The good news is we can characterise P(S0) based on λ and μ as well. To do that, we’ll make use of the sum property of probability i.e. the sum of all state’s probabilities is equal to 1. Since we have infinitely many possible states, the summation term goes to infinity.

The equation shows that this convergence of sum is only true if arrival rate is smaller than service rate. Intuitively this means that we’ll have a functioning queue only if we can service the customers faster than their arrival rate. Otherwise, the queue will grow indefinitely.

We can now characterise P(S0) based on λ and μ. Substituting this value to the iterative equation of P(Sn) yields the following result.

We have achieved our original goal of measuring the probability of being in any state based on arrival and service rate alone.

So what?

With this model, we can return to our original questions and attempt to answer them. For example,

Suppose we open a food truck every wednesday. Having done this for the past few months, we have a rough idea of how often do the customers come and how fast can we serve them. Using our new knowledge of queueing theory, we can now estimate how long will our queue be on average. We can then compare this number with the field size on which we’re operating and see if there’s enough space for people to queue. We can also decide how much investment should we make in terms of speeding up service time to cut down the queue size.

This use case is mathematically equivalent to calculating the expected number of items in the queue N.

Unfortunately we have an infinite sum term in the equation. To find out what this is equivalent to (again assuming that λ < μ), we’ll make use of the derivation of a geometric series sum.

We can now substitute this to the original equation, resulting in an expected value that is characterised based on λ and μ alone.

Let’s return to our food truck example. Suppose 4 customers arrive per hour to our food truck and we have enough capacity to serve 6 of them per hour. Using the above equation, we can calculate the expected number of people in our queue across time.

On average, we expect 2 people to queue in our waiting line. We can calculate also calculate more KPIs like server utilisation, average waiting time, average time spent in the queue all based on λ and μ alone.

Conclusion

We have seen how queueing theory can help us characterise a waiting line. This mathematical abstraction can be used by both manufacturing and service firms. In practice, we might want to model our queue with more realistic parameters e.g. instead of 1 server, we could have multiple servers; or we can assume limited capacity instead of infinite. The model could then provide managerial insights on how to improve the queueing system.

The main limitation would be that there’s a strong assumption on how the arrival and service process are like. We assumed Poisson process in this case, but a better approach is to be more data-driven and see which distribution fits better. Bear in mind that using parametric distributions (either Poisson or anything else) requires one to fully understand the underlying presuppositions that each model presumes.

--

--