
Monte Carlo Method Explained


Understand the Monte Carlo method and how to implement it in Python

https://unsplash.com/photos/tV3Hh38eoSg

In this post, I will introduce, explain and implement the Monte Carlo method. It is one of my favourite simulation methods because it is simple, yet refined enough to tackle complex problems. It was invented in the 1940s by Stanislaw Ulam, a Polish mathematician. It is named after Monte Carlo, the district of Monaco famous for its casino, because the randomness at the heart of the method resembles a game of roulette. Monte Carlo simulations are widely used to quantify risk in areas such as stock prices, sales forecasting and predictive modelling.

How does the Monte Carlo Method Work?

Monte Carlo simulations are a method of simulating statistical systems. The method uses randomness within a defined system to approximate quantities without the need to solve the system analytically. The main idea behind it is that a point in a moving system will eventually visit all parts of the space that the system moves in, in a uniform and random sense. This property is known as ergodicity.

Image provided by author

The model makes predictions from a range of values in the domain of the problem rather than from a single input. It assigns a probability distribution (normal, uniform, etc.) to every variable that is uncertain. Random values are drawn from these distributions and the calculation is repeated for the specified number of trials. Generally, the larger the number of trials, the more closely the aggregated result converges to a stable value. The method is commonly used in time series analysis for long-term predictive modelling. Once all the simulations are complete, you have a range of possible outcomes together with the probability of each one occurring.
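To illustrate the effect of the number of trials, here is a minimal sketch. It assumes a single uncertain variable that is uniformly distributed between 0 and 1, so the true mean is 0.5. The function name `estimate_mean` and the fixed seed are choices made for this illustration only.

```python
import random
import statistics


def estimate_mean(n_trials, seed=0):
    """Average n_trials random draws from a uniform(0, 1) distribution."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return statistics.fmean(rng.random() for _ in range(n_trials))


# The estimate should drift towards the true mean (0.5) as the trials increase.
for n in (10, 1_000, 100_000):
    estimate = estimate_mean(n)
    print(f"{n:>7} trials: estimate={estimate:.4f}, error={abs(estimate - 0.5):.4f}")
```

The error does not shrink on every single step, since each run is random, but it tends to get smaller as the number of trials grows.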

Example

Of the many ways to explain this method, the most common example is the Buffon Needle Experiment, which approximates the value of π. The experiment is as follows: we randomly drop N needles of length L onto a piece of paper ruled with parallel lines that are spaced 2L apart.

Buffon's Needle Experiment - provided by author

After randomly dropping the needles, count how many of them cross one of the lines dividing the paper, and note the total number of needles dropped (N).

π ≈ N / (number of needles that cross a line)
Note:
1) A large number of needles must be dropped to get a close approximation of π.
2) This formula only holds because we initially stated that the distance between the lines is 2 * L (where L is the length of the needle).
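The reason the line spacing matters can be sketched with the standard result for Buffon's needle. The symbols d (distance between the lines) and θ (acute angle between the needle and the lines) are introduced here for illustration, and the result assumes L ≤ d:

```latex
P(\text{cross})
  = \frac{2}{\pi} \int_{0}^{\pi/2} \frac{(L/2)\sin\theta}{d/2} \, d\theta
  = \frac{2L}{\pi d}
\qquad\Longrightarrow\qquad
d = 2L:\quad P(\text{cross}) = \frac{1}{\pi},
\quad
\pi \approx \frac{N}{\text{number of needles that cross a line}}
```

With any other spacing, the ratio N / crossings estimates πd / (2L) rather than π itself.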

An excellent in-depth explanation of the mathematics behind this can be found [here](https://ogden.eu/pi/) and you can run the experiment for free here.
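The experiment can also be simulated in a few lines of Python. This is a minimal sketch rather than a definitive implementation: the function name `buffon_pi` and its parameters are made up for this example, and each needle is represented only by the distance from its centre to the nearest line and its angle to the lines. Note that sampling the angle uses `math.pi`, which is circular for a real estimate of π but keeps the illustration short.

```python
import math
import random


def buffon_pi(n_needles, needle_len=1.0, seed=0):
    """Estimate pi by dropping n_needles onto lines spaced 2 * needle_len apart."""
    rng = random.Random(seed)
    spacing = 2 * needle_len  # distance between the parallel lines
    crossings = 0
    for _ in range(n_needles):
        # Distance from the needle's centre to the nearest line.
        centre = rng.uniform(0, spacing / 2)
        # Acute angle between the needle and the lines.
        angle = rng.uniform(0, math.pi / 2)
        # The needle crosses a line if its half-length, projected
        # perpendicular to the lines, reaches the nearest line.
        if centre <= (needle_len / 2) * math.sin(angle):
            crossings += 1
    if crossings == 0:
        raise ValueError("no needle crossed a line; drop more needles")
    return n_needles / crossings


for n in (100, 10_000, 200_000):
    print(f"{n:>7} needles: pi is approximately {buffon_pi(n):.4f}")
```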

Be cautious: this example is only meant to explain the Monte Carlo method. The method can be used in many different situations, but it is not always advisable. Although the approach works here, approximating π is in practice a poor use case for it. There are many other ways to approximate the value of π, and most of them are far more computationally efficient.

You should use this method when you need to estimate an outcome that carries a high level of uncertainty. It is commonly used in the finance industry for stock forecasting because of the randomness and uncertainty in the stock market. Under those conditions, a model like this is favoured and often outperforms a common regression-based approach.

Algorithm

  1. Identify the independent and dependent variables and define their domain of possible inputs.
  2. Determine a probability distribution to randomly generate inputs over the domain.
  3. Compute the output for the problem based on the randomly generated inputs.
  4. Repeat the experiment N times and aggregate the results.

It’s common practice to calculate the variance and standard deviation of the results when conducting this experiment. Generally, the smaller the variance, the more precise the estimate.
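The four steps above, together with the variance and standard deviation, can be sketched as follows. The problem is hypothetical: the total duration of a project made of two tasks, where task A is assumed to take between 2 and 4 days (uniform) and task B is assumed to take 5 days on average with a standard deviation of 1 day (normal). All names and numbers are assumptions for illustration.

```python
import math
import random
import statistics


def sample_inputs(rng):
    """Steps 1-2: draw one random value per uncertain input (durations in days)."""
    task_a = rng.uniform(2, 4)  # uniformly distributed between 2 and 4 days
    task_b = rng.gauss(5, 1)    # normally distributed, mean 5, std 1
    return task_a, task_b


def model(task_a, task_b):
    """Step 3: compute the output for one set of randomly generated inputs."""
    return task_a + task_b


def run_monte_carlo(n_trials, seed=0):
    """Step 4: repeat the experiment n_trials times and aggregate the results."""
    rng = random.Random(seed)
    outputs = [model(*sample_inputs(rng)) for _ in range(n_trials)]
    mean = statistics.fmean(outputs)
    variance = statistics.variance(outputs)  # sample variance of the outputs
    return mean, variance, math.sqrt(variance)


mean, variance, std = run_monte_carlo(20_000)
print(f"mean={mean:.2f} days, variance={variance:.2f}, std={std:.2f}")
```

For this toy problem the exact answer is known (mean 8, variance 1/3 + 1 ≈ 1.33), which makes it a convenient check that the simulation behaves as expected.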

Advantages & Disadvantages

I will outline a few of the most notable advantages and disadvantages of using this method.

Advantages

  • Strong way of estimating uncertainty
  • Given the correct boundaries, this model can survey the parameter space of the problem
  • Simple & intuitive, this approach is quite easy to understand

Disadvantages

  • Computationally inefficient – when you have a large number of variables bound by different constraints, it takes a lot of time and a lot of computation to approximate a solution using this method
  • If poor parameters and constraints are fed into the model, it will produce poor results

Python Implementation
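As a minimal sketch of what an implementation might look like, the example below simulates a stock price, the finance use case mentioned earlier. It is an illustration under stated assumptions, not a forecasting model: daily returns are assumed to be independent and normally distributed, and the starting price, mean daily return, volatility, number of days and number of trials are all hypothetical values. The function name `simulate_final_prices` is also made up for this example.

```python
import random
import statistics


def simulate_final_prices(start_price, mean_daily_return, daily_volatility,
                          n_days, n_trials, seed=0):
    """Simulate n_trials price paths and return the final price of each path."""
    rng = random.Random(seed)
    final_prices = []
    for _ in range(n_trials):
        price = start_price
        for _ in range(n_days):
            # Assumption: each daily return is an independent normal draw.
            price *= 1 + rng.gauss(mean_daily_return, daily_volatility)
        final_prices.append(price)
    return final_prices


# Hypothetical inputs: one trading year (252 days), 2,000 simulated paths.
start = 100.0
prices = sorted(simulate_final_prices(start, 0.0005, 0.02, n_days=252, n_trials=2_000))

mean_price = statistics.fmean(prices)
median_price = statistics.median(prices)
fifth_percentile = prices[int(0.05 * len(prices))]
prob_loss = sum(p < start for p in prices) / len(prices)

print(f"mean final price:   {mean_price:.2f}")
print(f"median final price: {median_price:.2f}")
print(f"5th percentile:     {fifth_percentile:.2f}")
print(f"P(final < start):   {prob_loss:.2%}")
```

The output is a range of possible final prices rather than a single prediction, so questions such as "how likely is a loss after one year?" can be answered by counting the simulated paths that end below the starting price.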

Summary

In summary, Monte Carlo simulations are a method of simulating statistical systems. They use randomness within a defined system to approximate quantities without the need to solve the system analytically. The method is best used when there are high levels of uncertainty. Although it is computationally inefficient, it is intuitive to understand, can survey a large part of the parameter space of the problem and can effectively estimate uncertainty. For these reasons, it is commonly used in the finance industry.


