By empowering engineers to reproduce complex natural processes in detail, computer simulation is transforming how industrial products are designed, analyzed, and manufactured. Despite this significant success, one persistent question bothers both analysts and decision-makers:
How good are these simulations, exactly?
Uncertainty quantification, which stands at the confluence of probability, statistics, computational mathematics, and disciplinary sciences, provides a promising framework to answer that question and has gathered tremendous momentum in recent years. In this article, we will discuss the following aspects of uncertainty quantification:
- the motivation: where do the uncertainties come from, and why do they matter?
- the solution: the V&V and IPAC management frameworks
- the challenge: how hard are they to implement in practice, and how can machine learning help?
So let’s get started!
Table of Contents
- 1. Uncertainties In Computer Simulations
  - Input data
  - Model form
  - Numerics
- 2. Uncertainty Quantification Framework
  - 2.1 Verification & Validation framework
  - 2.2 IPAC framework
  - 2.3 IPAC in practice: COVID-19 pandemic modeling
  - 2.4 Uncertainty quantification is an iterative process
- 3. Challenges in Applying Uncertainty Quantification
- 4. The Future of Uncertainty Quantification
- 5. Key Take-aways
- Further reading
- About the Author
1. Uncertainties In Computer Simulations
From predicting rising sea levels to designing the next-generation rocket engine, computational models play a crucial role in describing the behavior of complex natural processes and yielding valuable insights to guide decision-making. In many cases, a computational model appears as a set of mathematical equations (e.g., partial differential equations). For a long time, however, those equations could only be solved for a handful of simple academic problems, offering limited value for quantitatively understanding reality. Since the mid-20th century, the situation has improved dramatically, thanks to the rapid development of numerical algorithms and powerful computers.
Take the aviation industry as an example: it is now very common for engineers to employ computational fluid dynamics (CFD) simulators to optimize the aircraft shape, computational structural dynamics (CSD) simulators to determine the strength of the aircraft’s wings in turbulence, and computational aeroacoustics (CAA) simulators to predict the jet noise level. The end result of this simulation-driven product development? Faster prototyping, fewer costly physical experiments, and a shorter time to market.
Even though the fancy post-processed simulation results are visually appealing, they are not enough to convince a skeptical decision-maker:
- If a structural simulation tells me that the bridge can withstand the wind, should I approve the building plan?
- If an aerodynamic simulation tells me that the engine would still generate enough thrust in rainy conditions, should I approve installing this engine on a plane for intercontinental flights?
- If a weather simulation tells me that the hurricane will only impact region A, should I also evacuate its neighbors in region B?
Clearly, when lives are at stake, decision-makers expect computer simulations to be accurate and reliable, not just a bunch of colorful plots and animations.
As a matter of fact, decision-makers have every right to doubt the simulations, simply because there are many sources of uncertainty in computer simulations. The three major ones are as follows:
Input data
- Geometry uncertainty: induced by manufacturing tolerances, geometry simplifications made for the simulation, etc.;
- Model parameter uncertainty: e.g., when simulating the cardiovascular system, we need elasticity parameters of the blood vessels. Clearly, those parameters vary between different patients;
- Boundary condition uncertainty: e.g., due to the random nature of the wind and ocean waves, the external forcing exerted on an offshore oil platform is highly uncertain, thus complicating the prediction of its structural stability.
- Initial condition uncertainty: e.g., initial weather states are required to predict the trajectory of the hurricane. However, due to the limited observations (satellites, weather balloons, etc) and the limited measuring accuracy, it is only possible to partially infer the initial state of a forming hurricane.
Model form
Also known as model bias or discrepancy: the mathematical models that the computer simulates are, at the end of the day, only approximations of the true underlying physics. Sometimes the approximation is due to our lack of knowledge; other times it is due to a limited computational budget (e.g., we can only afford a 2D simulation instead of a full 3D one).
Numerics
This source of uncertainty is directly associated with the accuracy of the code used to simulate the mathematical models. Discretization error, iterative error, round-off error, and coding bugs all induce numerical uncertainty.
Some of these uncertainties are aleatory, arising from inherent variation or randomness. Others are epistemic, arising from a lack of knowledge; in other words, if sufficient knowledge is gained (via experiments, a higher-fidelity solver, etc.), the uncertainty can be reduced. Regardless of type, they share a common feature: they can all cause the simulation results to deviate from the true underlying physical process. It is no wonder, then, that
"Computational results are believed by no one, except the person who wrote the code."
2. Uncertainty Quantification Framework
Since we cannot avoid uncertainties, we need to quantify them when we are
- calibrating our computational models;
- making predictions using the calibrated models.
The Verification & Validation (V&V) framework applies to the first scenario, while the Identify-Propagate-Analysis-Control (IPAC) four-step management procedure applies to the second.
2.1 Verification & Validation framework
The V&V framework aims to assess the credibility of computational models and establish their quality. It consists of two parts:
Verification: Are we solving the equation correctly?
Verification quantifies the uncertainties associated with the numerical errors generated when the code solves the mathematical models. It is largely an exercise in computer science and mathematics, and it usually involves comparing the code's output against exact analytical solutions.
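To make this concrete, here is a minimal verification sketch in Python: it solves a toy ODE whose exact solution is known, refines the time step, and checks that the error shrinks at the expected rate. The equation and the forward Euler solver are, of course, stand-ins for a real simulation code.

```python
import numpy as np

# A minimal verification exercise: solve u'(t) = -u, u(0) = 1 with forward
# Euler (a stand-in for a real simulation code) and compare against the
# exact solution exp(-t) on successively finer time steps.
def forward_euler(dt, t_end=1.0):
    u = 1.0
    for _ in range(int(round(t_end / dt))):
        u += dt * (-u)
    return u

exact = np.exp(-1.0)
for dt in [0.1, 0.05, 0.025, 0.0125]:
    error = abs(forward_euler(dt) - exact)
    print(f"dt = {dt:<7} error = {error:.3e}")

# Halving dt should roughly halve the error, confirming the expected
# first-order accuracy and quantifying the discretization uncertainty.
```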
Validation: Are we solving the correct equation?
Validation tackles model form uncertainty. It is largely an exercise in physics and engineering, and it usually involves comparing simulation results with experimental measurements.
2.2 IPAC framework
After understanding the credibility of our computational model, the next step is to use the model to make predictions. To effectively manage the uncertainties involved in this process, the Identify-Propagate-Analysis-Control framework has proven productive.
Identify
This step aims to characterize various sources of input data uncertainty:
- For aleatory uncertainties, joint probability density functions (PDFs) are employed to describe the distribution of the uncertain parameters;
- For epistemic uncertainties, probability boxes, Dempster-Shafer evidence theory, and fuzzy theory are all available methods to reflect the lack of knowledge.
Bayesian statistics plays a big role here. Since probability represents a degree of belief in the Bayesian paradigm, Bayesian analysis can naturally handle both aleatory and epistemic uncertainties.
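As a minimal illustration of the "Identify" step, the sketch below fits a parametric PDF to a handful of hypothetical measurements of an uncertain parameter, and uses a simple bootstrap (a non-Bayesian stand-in) to show how the fitted distribution is itself uncertain when data are scarce. The numbers and the choice of a log-normal are purely illustrative.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements of an uncertain model parameter
# (e.g., the elasticity of a blood vessel measured on a few patients).
measurements = np.array([1.62, 1.48, 1.71, 1.55, 1.80, 1.66, 1.59, 1.74])

# Aleatory part: describe the observed scatter with a fitted log-normal PDF.
shape, loc, scale = stats.lognorm.fit(measurements, floc=0.0)
print(f"fitted log-normal: median = {scale:.2f}, sigma = {shape:.2f}")

# Epistemic part: with so few samples, the fitted parameters are themselves
# uncertain; a bootstrap gives a crude interval on the fitted median.
rng = np.random.default_rng(0)
medians = [
    stats.lognorm.fit(rng.choice(measurements, size=len(measurements)), floc=0.0)[2]
    for _ in range(1_000)
]
print("95% interval on the median:", np.percentile(medians, [2.5, 97.5]).round(2))
```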
Propagate
This step is also known as forward uncertainty propagation. It propagates all the input data uncertainties, as well as the uncertainties quantified in the V&V procedure, to the outputs of interest by running simulations. We may be interested either in the variation of the output or in the probability that the output exceeds certain thresholds. The latter is widely pursued in risk management, where determining the risk of undesired system behavior (e.g., structural failure or instability) is the primary goal.
The Monte Carlo method provides a straightforward way to propagate uncertainties. By simply feeding the simulation code with different realizations of uncertain parameters, we can obtain an ensemble of simulation results, from which we can construct the histogram and extract relevant statistical indices.
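Here is what that looks like in code. The sketch below uses a toy analytical "simulator" (a cantilever-beam deflection formula) as a stand-in for an expensive simulation code; the input distributions and the deflection threshold are made-up numbers for illustration.

```python
import numpy as np

# Stand-in for an expensive simulator: tip deflection of a cantilever beam
# of length L under a point load P, with bending stiffness EI.
def simulator(P, EI, length=2.0):
    return P * length**3 / (3.0 * EI)

rng = np.random.default_rng(0)
n = 10_000
P = rng.normal(5_000.0, 500.0, n)              # load [N], aleatory scatter
EI = rng.lognormal(np.log(2.0e5), 0.1, n)      # stiffness [N*m^2], uncertain

deflection = simulator(P, EI)                  # one "simulation" per sample

threshold = 0.08                               # allowable deflection [m]
print(f"mean = {deflection.mean():.4f} m, std = {deflection.std():.4f} m")
print(f"P(deflection > {threshold} m) = {(deflection > threshold).mean():.2%}")
```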
Variance-reduction variants of the vanilla Monte Carlo method are usually employed, including Latin Hypercube sampling, quasi-Monte Carlo sampling, importance sampling, subset sampling, etc. In addition, spectral methods like polynomial chaos expansion provide an elegant formulation to obtain output statistical indices and are much faster than the traditional Monte Carlo method.
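Continuing the sketch above (it reuses `numpy` and the hypothetical `simulator` function), here is how a Latin Hypercube design could replace the plain random samples, assuming SciPy ≥ 1.7 for the `scipy.stats.qmc` module:

```python
from scipy.stats import norm, qmc

# Stratified samples in the unit hypercube, then mapped to the physical
# input distributions via the inverse-CDF (probability integral) transform.
sampler = qmc.LatinHypercube(d=2, seed=0)
u = sampler.random(n=1_000)

P = norm(5_000.0, 500.0).ppf(u[:, 0])
EI = np.exp(norm(np.log(2.0e5), 0.1).ppf(u[:, 1]))

deflection = simulator(P, EI)
print(f"LHS estimate of the mean deflection: {deflection.mean():.4f} m")
```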
Analysis
After estimating the total "damage" caused by all the uncertainty sources, a follow-up task is to determine the most "guilty" ones, i.e., to understand which uncertainty sources contribute the most to the variance of the simulation results, or which ones drive the results beyond the thresholds. This importance ranking of the uncertainty sources is also known as global sensitivity analysis.
Global sensitivity analysis apportions the total variance of the simulation output to different uncertainty sources and their interactions. The outcome of this sensitivity analysis is usually summarized in the form of the so-called Sobol indices. Parameters with larger Sobol index values contribute more to the variation of the simulation output, while parameters with smaller Sobol index values basically play no role.
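The sketch below shows one way to estimate first-order and total-effect Sobol indices with the pick-and-freeze (Saltelli/Jansen) estimators, using the classic Ishigami test function as a stand-in for a real simulator; dedicated libraries such as SALib package the same machinery.

```python
import numpy as np

def sobol_indices(model, sample_inputs, n=8_192, seed=0):
    """Estimate first-order (S1) and total-effect (ST) Sobol indices
    with the Saltelli/Jansen pick-and-freeze estimators."""
    rng = np.random.default_rng(seed)
    A, B = sample_inputs(n, rng), sample_inputs(n, rng)   # two independent (n, d) designs
    fA, fB = model(A), model(B)
    var = np.var(np.concatenate([fA, fB]))
    d = A.shape[1]
    S1, ST = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                # vary only the i-th input, freeze the rest
        fABi = model(ABi)
        S1[i] = np.mean(fB * (fABi - fA)) / var        # first-order effect
        ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var  # total effect
    return S1, ST

# Ishigami function: a standard sensitivity-analysis benchmark with
# three uncertain inputs, each uniform on [-pi, pi].
def ishigami(X):
    return np.sin(X[:, 0]) + 7 * np.sin(X[:, 1]) ** 2 + 0.1 * X[:, 2] ** 4 * np.sin(X[:, 0])

def sample_uniform(n, rng):
    return rng.uniform(-np.pi, np.pi, size=(n, 3))

S1, ST = sobol_indices(ishigami, sample_uniform)
print("first-order indices:", S1.round(2))
print("total-effect indices:", ST.round(2))
```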
Control
Finally, we take action to mitigate the adverse effects of various uncertain input sources and aim to obtain a reliable prediction of the output. Following the previous step, we can focus our energy on reducing the uncertainty level of the most "guilty" uncertainty source:
- If it is of epistemic type, we can perform more experiments or observe more samples in the first step to refine our knowledge;
- If it is of aleatory type, we can optimize the adjustable properties of the system being simulated so that its output becomes less responsive (i.e., more robust) to the noisy aleatory inputs. This practice is also known as optimization under uncertainty (see the sketch below).
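As an illustration of the second (aleatory) case, here is a minimal sketch of optimization under uncertainty: a hypothetical design variable `d` is tuned so that a noisy system response performs well on average while staying insensitive to the noise. The response function and the mean-plus-three-sigma objective are arbitrary illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)
noise = rng.normal(0.0, 0.3, size=2_000)     # fixed aleatory samples (common random numbers)

# Hypothetical system response: depends on a design variable d and a noisy input eps.
def response(d, eps):
    return (d - 2.0) ** 2 + d * eps

# Robust objective: penalize both poor mean performance and high variability.
def robust_objective(d):
    y = response(d, noise)
    return y.mean() + 3.0 * y.std()

result = minimize_scalar(robust_objective, bounds=(0.0, 4.0), method="bounded")
print(f"robust design choice: d = {result.x:.3f}")
```

Note that the robust optimum lands away from the nominal optimum (d = 2), trading a little mean performance for much lower sensitivity to the noise.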
2.3 IPAC in practice: COVID-19 pandemic modeling
To make the above-mentioned concepts more concrete, let’s consider how IPAC could help quantify the uncertainties associated with modeling the COVID-19 pandemic.
Epidemic modeling uses mathematical models to simulate the spread of an infectious disease. There are many models out there, ranging from the fundamental Susceptible-Infected-Recovered (SIR) model to more sophisticated stochastic simulation models, like the ones used by researchers from Imperial College in their seminal work that changed the UK’s coronavirus strategy.
Accurately simulating the course of an epidemic is not an easy task, and various sources of modeling uncertainty are to blame. To begin with, model parameters like the transmission rate are not fully known: SARS-CoV-2 is a relatively new virus, and much more research is needed to understand its transmission pattern. In addition, initial conditions, such as the initial number of infectives, are required to kick-start the simulation. Unfortunately, those values are highly uncertain due to the poor documentation at the beginning of the outbreak.
To ensure the reliability of the simulation, we can follow the IPAC framework outlined above. First, we estimate those model parameters and initial conditions from official data on the number of infected individuals, deaths, etc. For that purpose, Bayesian statistics (specifically, Markov chain Monte Carlo) can be employed to derive their joint probability distribution. This is the "Identify" step, and its outcome is that we can now characterize the uncertainty in those model parameters and initial conditions.
Second, we perform forward uncertainty propagation. We run our epidemic model multiple times, each time using a different sample of model parameters and initial conditions drawn from the previously derived probability distribution. Based on the ensemble of simulation results, we can quantify the uncertainty associated with the outputs of interest, such as the reproduction number R (the number of people infected by a single individual), the duration of the epidemic, the total number of infected people and fatalities, etc.
Next, we can perform a global sensitivity analysis to understand which uncertain inputs are the most "guilty" ones, i.e., drive most of the output variation, and which uncertain inputs have so little influence that we can treat them as constants.
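Here is a minimal sketch of what the "Identify" step could look like in code: a toy deterministic SIR model, a Poisson likelihood over daily case counts, and a hand-rolled random-walk Metropolis sampler. The model, the parameter values, and the synthetic "observed" data are all illustrative assumptions; real studies use far richer models and dedicated samplers (e.g., via PyMC or Stan).

```python
import numpy as np

# Toy deterministic SIR model returning the expected number of new cases per day.
def sir_daily_cases(beta, gamma, i0, days=60, population=1e6):
    S, I = population - i0, i0
    cases = np.empty(days)
    for t in range(days):
        new_inf = beta * S * I / population      # new infections on day t
        new_rec = gamma * I                      # recoveries on day t
        S, I = S - new_inf, I + new_inf - new_rec
        cases[t] = new_inf
    return cases

# Poisson log-likelihood of the observed daily cases, with flat priors on a
# physically plausible region (a deliberately minimal choice).
def log_posterior(theta, observed):
    beta, gamma, i0 = theta
    if not (0 < beta < 2 and 0 < gamma < 1 and 0 < i0 < 1e4):
        return -np.inf
    expected = np.maximum(sir_daily_cases(beta, gamma, i0), 1e-9)
    return np.sum(observed * np.log(expected) - expected)

# Random-walk Metropolis sampler with symmetric Gaussian proposals.
def metropolis(observed, n_steps=20_000, seed=1):
    rng = np.random.default_rng(seed)
    theta = np.array([0.3, 0.1, 20.0])           # initial guess for (beta, gamma, i0)
    logp = log_posterior(theta, observed)
    step = np.array([0.01, 0.005, 2.0])
    chain = []
    for _ in range(n_steps):
        proposal = theta + step * rng.normal(size=3)
        logp_prop = log_posterior(proposal, observed)
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = proposal, logp_prop
        chain.append(theta)
    return np.array(chain[n_steps // 2:])        # discard the first half as burn-in

# Synthetic "observed" data generated from known parameters, for illustration only.
rng = np.random.default_rng(0)
observed = rng.poisson(sir_daily_cases(beta=0.35, gamma=0.12, i0=30))
posterior = metropolis(observed)
print("posterior mean of (beta, gamma, i0):", posterior.mean(axis=0).round(3))
```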
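Continuing the sketch from the previous step (it reuses `sir_daily_cases` and the `posterior` samples), forward propagation then amounts to re-running the model on draws from the posterior and summarizing the outputs we care about:

```python
# Draw parameter sets from the posterior and push each one through the model.
rng = np.random.default_rng(2)
draws = posterior[rng.choice(len(posterior), size=500)]

r0, peak = [], []
for beta, gamma, i0 in draws:
    cases = sir_daily_cases(beta, gamma, i0, days=180)
    r0.append(beta / gamma)        # basic reproduction number R0 for the SIR model
    peak.append(cases.max())       # peak number of new daily cases

r0, peak = np.array(r0), np.array(peak)
print(f"R0: mean {r0.mean():.2f}, "
      f"95% interval [{np.percentile(r0, 2.5):.2f}, {np.percentile(r0, 97.5):.2f}]")
print(f"peak daily cases: median {np.median(peak):,.0f}")
```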
Finally, it’s time to take action. Based on the knowledge gained in the previous step, we know which field data should be collected further, and how often, in order to better estimate the most "guilty" parameters. This allows us to obtain more accurate and reliable epidemiological forecasts.
Of course, we don’t just stop here. The uncertainty-quantified epidemic model could serve as a valuable tool to assess the effectiveness of various non-pharmaceutical interventions (e.g., social distancing, closing schools, etc.), thus informing reliable policies that lower the reproduction number R and eventually help us win this battle.
2.4 Uncertainty quantification is an iterative process
Before we head to the next section, it is worth mentioning that in practice, uncertainty quantification is often implemented in an iterative manner. Here are some examples:
- a quick screening analysis (e.g., Morris screening) performed in the "Analysis" step can tell practitioners which sources are non-influential and therefore need not be carried through the "Propagate" step (dimensionality reduction!);
- in the "Control" step, we have already calculated how much the uncertainties in the input sources must be reduced to lower the output variation to an acceptable level. This information is highly valuable in the "Identify" step, as it tells us how many resources to allocate to better characterize the input uncertainty sources;
- finally, a whole new field called data assimilation is built by iteratively performing forward uncertainty propagation (the "Propagate" step) and Bayesian updating (the "Identify" step) with real-world observations. Data assimilation is heavily employed in weather forecasting and is the hidden hero that makes statements like "the probability that it will rain tomorrow is 30%" possible.
3. Challenges in Applying Uncertainty Quantification
The biggest challenge is the high computational cost, which is largely attributed to the need to repeatedly run the simulation with different realizations of the uncertain inputs (a legacy of the Monte Carlo family of methods).
In industrial practice, a single simulation run is usually already quite expensive, let alone running it multiple times. To give some numbers: for today’s large-scale, high-resolution simulations running on supercomputers, the computational time typically ranges from days to weeks, sometimes even months. Meanwhile, a proper Monte Carlo analysis requires at least several thousand simulation runs to achieve statistical convergence. The conclusion is obvious: brute force is not the way out.
To address this issue, a lot of effort has been devoted to the so-called surrogate modeling methodology. This is where machine learning finds its way into the domain: Gaussian processes, support vector regression, and neural networks are three popular examples. The goal is to learn the relationship between the inputs and outputs of the computer simulation and embed this relationship into a surrogate model that is cheap to evaluate.
Afterward, Monte Carlo sampling can be applied directly to this cheap-to-run surrogate model, potentially saving a considerable amount of the computational budget.
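Below is a minimal sketch of that workflow using scikit-learn’s Gaussian process regressor: a small design of experiments on a stand-in "expensive" simulator (deliberately cheap here), a GP fit, and then Monte Carlo on the surrogate. The function names and sample sizes are illustrative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Stand-in for an expensive simulator (cheap here, purely for illustration).
def expensive_simulator(x):
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 1] ** 2

rng = np.random.default_rng(0)

# Step 1: a small design of experiments -- the only "expensive" runs we pay for.
X_train = rng.uniform(-1, 1, size=(40, 2))
y_train = expensive_simulator(X_train)

# Step 2: train a Gaussian process surrogate on those runs.
kernel = ConstantKernel() * RBF(length_scale=[0.5, 0.5])
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)

# Step 3: run Monte Carlo on the cheap surrogate instead of the simulator.
X_mc = rng.uniform(-1, 1, size=(100_000, 2))
y_mc = gp.predict(X_mc)
print(f"surrogate-based estimate: mean = {y_mc.mean():.3f}, std = {y_mc.std():.3f}")
```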

Here is an example. Gaussian processes have already been applied to predict the instability phenomena that occur in modern aero-engine combustion systems. Various sources of uncertainty (e.g., manufacturing errors, stochastic operating conditions) exist there, and uncertainty quantification is much needed to deliver a reliable instability prediction. By training a Gaussian process model to approximate the expensive aero-engine combustor simulator, Monte Carlo simulations could be performed with a more than 100-fold speed-up, significantly improving the efficiency of the uncertainty management process.
Though promising, surrogate modeling cannot escape the curse of dimensionality: the computational cost of fitting an accurate surrogate model grows exponentially as the number of uncertain parameters increases. In this context, performing model order reduction before training a surrogate model becomes imperative.
As a result, feature selection and model order reduction techniques, e.g., principal component analysis, various regularized regression techniques, etc., have gained increased popularity in recent years.
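A minimal sketch of that idea, again with scikit-learn: a 100-dimensional uncertain input whose variability actually lives on a few dominant modes is compressed with PCA before the surrogate is fitted. The synthetic data and the choice of three components are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic high-dimensional input (e.g., a discretized random field) whose
# variability is driven by only three latent modes.
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 100))     # shape (200, 100)
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)              # simulator output

# Reduce the input dimension first, then fit the surrogate on the reduced space.
surrogate = make_pipeline(PCA(n_components=3), GaussianProcessRegressor(normalize_y=True))
surrogate.fit(X, y)
print("variance retained by PCA:", surrogate[0].explained_variance_ratio_.sum().round(3))
```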
4. The Future of Uncertainty Quantification
Several trends have become clear in recent years:
First of all, emerging concepts such as 3D printing and the internet of things are transforming the manufacturing industry, and computer simulation with quantified uncertainty helps ensure the reliability of their practical deployment. Consequently, a growing effort is being spent on both the fundamental algorithms and their realistic implementation.
Secondly, UQ experts and domain experts are working together more and more closely. Domain knowledge provides valuable insights into the input-output relation, compensating for the lack of training data when building surrogate models for UQ analysis. In addition, domain knowledge informs the sensitivity of the inputs, enabling a better allocation of computational resources.
Finally, the connections between machine learning and uncertainty quantification will be further strengthened. As evidence, more and more uncertainty quantification conferences are setting up dedicated minisymposia and workshops inviting contributions that demonstrate the potential of machine learning tools. Augmented by machine learning techniques, we can expect more affordable, accurate, and robust quantification of uncertainties in computer simulations.
5. Key Take-aways
Uncertainty quantification is essential for delivering reliable simulation-based predictions in a wide range of engineering domains. In this article, we have talked about:
- the sources of simulation uncertainty (input data, model form, numerics) and their types (aleatory and epistemic);
- the V&V framework and the "Identify-Propagate-Analysis-Control" (IPAC) framework for managing simulation uncertainties;
- machine-learning techniques that offer promising ways to achieve more affordable, accurate, and robust uncertainty quantification.
In this article, we only discussed uncertainty management from a bird’s-eye view. There are many technical details underneath that make managing uncertainties actually work in practice. We will discuss them in the following articles.
Further reading:
[1] Ralph C. Smith, Uncertainty Quantification: Theory, Implementation, and Applications, SIAM Computational Science & Engineering, 2014.
[2] Ryan G. McClarren, Uncertainty Quantification and Predictive Computational Science, Springer, 2018.
About the Author
I’m a Ph.D. researcher working on uncertainty quantification and reliability analysis for aerospace applications. Statistics and data science form the core of my daily work. I love sharing what I’ve learned in the fascinating world of statistics. Check out my previous posts to find out more, and connect with me on Medium and LinkedIn.