
Probabilistic Recreations of the First Digit Law

Nature creates patterns in numbers and dictates that some are more privileged in terms of appearance than others.

Photo by Possessed Photography from Unsplash

The first digit law, also called the first digit phenomenon or leading digit phenomenon, is a phenomenological law better known as Benford’s law. According to it, the leading digit 1 occurs more often than any other leading digit, with a frequency of about 30%, in most worldwide censuses. It is almost as if we had a meticulously rigged die that always favors 1 over 2, 2 over 3, and so on. This is clearly different from the 1-in-9 probability we would intuitively assign to each digit from 1 to 9.

Image By Author ( Benford's law )
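For reference, the values in the figure above correspond to the usual statement of Benford’s law, P(d) = log10(1 + 1/d) for a leading digit d from 1 to 9. The formula is not spelled out in the article; the short sketch below simply evaluates it (the name benford is our own).

```python
import math

# Benford's law: probability that the leading digit equals d, for d = 1..9.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for digit, prob in benford.items():
    print(f"{digit}: {prob:.3f}")
# 1: 0.301
# 2: 0.176
# 3: 0.125
# 4: 0.097
# 5: 0.079
# 6: 0.067
# 7: 0.058
# 8: 0.051
# 9: 0.046
```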

Benford’s law has long been regarded as a fascinating and enigmatic natural law.

Its explanations range from supernatural to measure-theoretical. Use cases relating to it span from fraud detection to computer disk space allocation.

Publications on the subject have multiplied in recent years, most of them examining the law across various data sources, with applications in fraud detection and computer science. However, the basic cause of Benford’s law still remains something of a mystery.

This article does not aim to provide theoretical evidence for the origin of the law. Instead, it looks at how to reproduce it, which is a bit tricky from a probabilistic point of view.

If you simply pick a common base distribution and take the first digits of its samples, the resulting distribution is usually far from Benford’s.

First, we put forward three probabilistic approaches that yield a Benford-like distribution. In the second part, we deal with two more realistic cases in which Benford’s distribution is likely to appear in real life.

So let us get into practice.

Three ways to simulate Benford’s Law:

1. Raising a uniform sample to the power of N

The first way to do it is to sample values from a uniform distribution between 0 and 1 and raise them to the power of n, n being a relatively large integer:

Image by Author

Here’s an example of how we would do this for a single sampled value:

Image by Author

We write a sampling function that takes the bounds of the distribution, the size of the sample, and the power to which each element of the sample is raised, all as parameters.
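The original code for this step is not reproduced here, so the following is only a minimal sketch of such a sampling function, assuming NumPy. The name sample_uniform_power and the seed parameter are our own.

```python
import numpy as np

def sample_uniform_power(low, high, size, power, seed=None):
    """Draw `size` values from U(low, high) and raise each one to `power`."""
    rng = np.random.default_rng(seed)
    return rng.uniform(low, high, size) ** power

# Example: 10,000 values from U(0, 1), each raised to the power of 50.
sample = sample_uniform_power(0.0, 1.0, 10_000, 50, seed=0)
```

With very large powers, the smallest draws can underflow to 0.0 in floating point, so it may be worth filtering out zeros before extracting digits.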

We also write an extract_first_digit function that returns the leftmost non-zero digit of each number in the sample:
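One possible way to write extract_first_digit is to format the number in scientific notation, where the leading significant digit always comes first. This is an assumption about the implementation, not the author’s code, and it expects finite inputs.

```python
def extract_first_digit(x: float) -> int:
    """Return the leading significant digit of x (0 only if x is exactly 0)."""
    # In scientific notation the first character is the leading digit,
    # e.g. 0.00032 -> "3.2000...e-04". 17 decimals avoids rounding up to the next digit.
    return int(f"{abs(x):.17e}"[0])

sample = [0.00032, 45.7, 9.99, 120000.0]
first_digits = [extract_first_digit(value) for value in sample]
print(first_digits)  # [3, 4, 9, 1]
```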

Let us run our first simulation:
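The simulation can be sketched as follows. This is a vectorised variant of our own that works on log10 of the values, which avoids underflow when the exponent is large; the function name, seed and sample size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
benford = np.log10(1 + 1 / np.arange(1, 10))  # reference probabilities for digits 1..9

def first_digit_freqs(power, size=100_000):
    """Leading-digit frequencies (digits 1..9) of u**power for u ~ U(0, 1)."""
    u = 1.0 - rng.random(size)        # values in (0, 1], so log10 is always finite
    log_x = power * np.log10(u)       # log10(u**power), computed without underflow
    frac = log_x - np.floor(log_x)    # fractional part; 10**frac is the significand in [1, 10)
    # clip guards against floating-point rounding at the interval edges
    digits = np.clip(np.floor(10.0 ** frac).astype(int), 1, 9)
    return np.bincount(digits, minlength=10)[1:] / size

for power in (1, 2, 50, 100):
    freqs = first_digit_freqs(power)
    gap = np.abs(freqs - benford).max()
    print(f"power={power:>3}  P(1)={freqs[0]:.3f}  max gap to Benford={gap:.3f}")
```

The printed gap gives a rough numerical counterpart to the histograms: it should shrink as the exponent grows.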

Sample raised to the power of 1 ( Image By Author )
Sample raised to the power of 2 ( Image By Author )
Sample raised to the power of 50 ( Image By Author )
Sample raised to the power of 100 ( Image By Author )

The higher the exponent, the more the distribution looks like Benford’s law. You may try replacing the uniform distribution with another one yourself, though I am not sure you would obtain the same results in the end.

2. Dividing a uniform distribution by another one:

Image by Author

The second way to do it is to divide a sample from a uniform distribution between 0 and 1 by another one with the same parameters:

Image by Author

Let us recreate that:
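A minimal sketch of the division experiment, assuming NumPy. The list is named history to match the plotting call used in the article; the seed, the sample size, and the use of 1 - rng.random() (to draw from (0, 1] and avoid dividing by zero) are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
size = 100_000

numerator = 1.0 - rng.random(size)     # U(0, 1], never exactly zero
denominator = 1.0 - rng.random(size)   # U(0, 1], so the division is always defined
ratio = numerator / denominator

# Leading digit read from scientific notation, e.g. "3.14...e-02" -> 3.
history = [int(f"{value:.17e}"[0]) for value in ratio]

freqs = np.bincount(history, minlength=10)[1:] / size
print(np.round(freqs, 3))
```

For two independent U(0, 1) draws, the probability of a leading 1 works out to exactly 1/3, close to but not equal to Benford’s 0.301, so the result is Benford-like rather than exact.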

A quick plot:

import seaborn as sns
sns.histplot(history, stat="probability", discrete=True, color="b")
Image by Author

Unfortunately, as with the first simulation trick, you may not obtain the same Benford-like distribution if you start from something other than uniform distributions.

3. Construction of a Markov Chain:

The third way consists of constructing a Markov chain that is initialized with a random value sampled from a uniform distribution between 0 and 10 (preferably with 10 excluded).

Then, each new state is drawn from a new uniform distribution bounded by 0 and the mantissa of the value sampled in the previous state.

Image by Author

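In this article, the mantissa of a number loosely means its first digit (strictly speaking, the mantissa, or significand, is the number rescaled to lie between 1 and 10), and it is written between curly brackets. Here is a more concrete example: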

Image By Author

Let us get into action and write some code:
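The original code is not reproduced here, so this is only a sketch of the chain as described above. It assumes the mantissa is the first digit of the previous state, as the article defines it; the function names, the seed and the number of steps are our own.

```python
import numpy as np

def first_digit(x):
    """Leading significant digit of a positive finite float."""
    return int(f"{x:.17e}"[0])

def simulate_chain(n_steps, seed=None):
    """Run the chain and return the first digit of every visited state."""
    rng = np.random.default_rng(seed)
    # 1 - rng.random() lies in (0, 1], so a state is never exactly 0.
    # (This includes the endpoint 10 with negligible probability.)
    state = 10.0 * (1.0 - rng.random())
    digits = []
    for _ in range(n_steps):
        upper = first_digit(state)              # "mantissa" in the article's loose sense
        state = upper * (1.0 - rng.random())    # next state drawn from U(0, upper]
        digits.append(first_digit(state))
    return digits

digits = simulate_chain(100_000, seed=0)
freqs = np.bincount(digits, minlength=10)[1:] / len(digits)
print(np.round(freqs, 3))
```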

Image by Author

The digit probabilities visibly differ somewhat from the exact Benford values. The overall shape remains the same, however: 1 tends to be more frequent than the other digits.


Two real cases where Benford’s law might appear:

Now that we have finished warming up, we will tackle two concrete use cases where Benford’s Law works its magic.

1. Price distributions in supermarkets

The first case involves prices in supermarkets, where it was found that, on the whole, the digit 1 appears as the leftmost digit of prices more frequently than the digit 2, which itself appears more frequently than the digit 3, and so on.

Our friend Alice comes in again to take us on a tour to a supermarket next to where she lives. She is used to running some errands there as she finds whatever she needs. Once arrived, she browses through the aisles and walks next to a buffet where some articles are offered with discounts.

Image by Author

Little does she know that all the prices she encounters come from different distributions that together give rise to a very special probabilistic law. Each distribution represents the range of prices for variants of a single brand.

Image By Author

Let us help Alice trace it back.

First, let us define custom functions that build the distributions from which values are sampled.

For instance, we define a family of normal distributions through the ranges of means and variances we pass as input parameters. In a later step, we sample a mean and a variance, create a Gaussian instance, and sample a value from it.
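As an illustration of the Gaussian case, here is a minimal sketch assuming NumPy. The function name and the numeric ranges are hypothetical placeholders, not the author’s values.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_from_random_gaussian(mean_range, var_range):
    """Pick a mean and a variance uniformly from the given ranges,
    then draw a single value from the corresponding normal distribution."""
    mean = rng.uniform(*mean_range)
    variance = rng.uniform(*var_range)
    return rng.normal(loc=mean, scale=np.sqrt(variance))

# Hypothetical ranges for one "brand" of products.
price = sample_from_random_gaussian(mean_range=(1.0, 200.0), var_range=(1.0, 100.0))
print(price)
```

A normal distribution can occasionally return a negative value, so a price simulation would typically take the absolute value or redraw in that case.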

We use the same mechanism to create gamma and uniform distributions.

In the next step, we run a series of sampling operations. Each one randomly picks a uniform, Gaussian, or gamma distribution, draws a sample from it, extracts the first digit, and stores it in a list.
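The whole procedure could look like the sketch below. All parameter ranges, function names and the number of draws are hypothetical, and how closely the result follows Benford’s law depends on those ranges.

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_gaussian():
    mean, variance = rng.uniform(1.0, 200.0), rng.uniform(1.0, 100.0)
    return rng.normal(mean, np.sqrt(variance))

def draw_gamma():
    shape, scale = rng.uniform(1.0, 5.0), rng.uniform(1.0, 50.0)
    return rng.gamma(shape, scale)

def draw_uniform():
    low = rng.uniform(1.0, 100.0)
    return rng.uniform(low, low + rng.uniform(1.0, 100.0))

def first_digit(x):
    # abs() handles the rare negative Gaussian draw
    return int(f"{abs(x):.17e}"[0])

samplers = [draw_gaussian, draw_gamma, draw_uniform]
digits = []
for _ in range(50_000):
    sampler = samplers[rng.integers(len(samplers))]  # pick a family at random
    digit = first_digit(sampler())
    if digit != 0:                                   # skip an exact zero, if any
        digits.append(digit)

freqs = np.bincount(digits, minlength=10)[1:] / len(digits)
print(np.round(freqs, 3))
```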

Once again, a Benford distribution emerges.

Image by Author

In 1998, Theodore P. Hill gave a rigorous demonstration that a sample taken from a mix of distributions follows Benford’s law. So this comes as no surprise.

2. Multiplicative fluctuation of a stock price

Let us look closely into the evolution of a stock price.

Considering it is multiplied each time by a random sample drawn from a different normal distribution,

Image by Author

we will track the first digit of every new price and see whether a pattern emerges.

Image by Author

Here’s a quick demo:
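A minimal sketch of such a multiplicative process, assuming NumPy. The starting price, the number of steps and the ranges for the mean and standard deviation of each factor are hypothetical choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

price = 100.0        # hypothetical starting price
n_steps = 10_000
digits = []

for _ in range(n_steps):
    # A different normal distribution at every step (hypothetical ranges).
    mean = rng.uniform(0.9, 1.1)
    std = rng.uniform(0.05, 0.3)
    factor = abs(rng.normal(mean, std))   # abs() keeps the price positive
    price *= factor
    digits.append(int(f"{price:.17e}"[0]))  # leading digit of the new price

freqs = np.bincount(digits, minlength=10)[1:] / n_steps
print(np.round(freqs, 3))
```

Over much longer runs the price can drift far enough to underflow or overflow a float, in which case tracking the logarithm of the price instead is a safer design.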

Once we have read through all the price records and stored all the digits, we plot the distribution (keep your fingers crossed...):

Image by Author

Benford’s law appears in front of us once again.

In 2001, L. Pietronero, E. Tosatti, V. Tosatti, and A. Vespignani tackled this problem in their paper Explaining the uneven distribution of numbers in nature. The authors start from the study of multiplicative processes and draw an analogy with the central limit theorem, which deals with sums of random variables rather than products. They claim:

This exercise shows that the numbers N characterizing some physical quantities or objects naturally will follow Benford’s law if their time evolution is ruled by multiplicative fluctuations.


Closing thoughts:

Many mathematicians have succeeded in explaining the natural appearance of Benford’s law in common numbers.

To this day, the subject still arouses curiosity, however thorough the explanations may be. From a personal point of view, it is always nice to check the specialists’ findings with small and fun simulations that make such concepts easier to absorb.

References:

  1. A Simple Explanation of Benford’s Law
  2. From Uniform Distributions to Benford’s Law
  3. Benford’s Law
  4. Explaining the Uneven Distribution of Numbers in Nature
