
Understanding climate variability…

…using a simple climate model, statistics and time series analysis

To predict and mitigate the effects of the climate crisis, we need to understand both the average global warming over time and the deviations from this mean. Tools from physics and mathematics can help us analyse fluctuations of the climate, also called climate variability.

Data are available from climate observations and climate model simulations. Unfortunately, instrumental observations only date back to around the industrial revolution. Most climate model simulations are very complex and require supercomputing capacities. So, we wondered: How well can we reproduce the variability of global temperature with a simple mathematical model? In our recent publication, we approach this question using methods from physics, statistics and time series analysis.

Types of climate variability

Climate variability is either externally driven (so-called "forced") or internal. External drivers ("forcings") refer to any processes that affect the Earth’s energy balance. They can originate outside the Earth (e.g. changes in the sun’s intensity) or outside the current climate (e.g. volcanic eruptions or human greenhouse gas emissions).

In contrast, internal variability is mainly caused by chaotic atmospheric or oceanic processes. This includes our daily weather, but also more slowly varying fluctuations such as the El Niño–Southern Oscillation.

In our paper, we present a method to separate the effects of externally driven ("forced") and internal variability for global mean annual temperatures. To this end, we developed a statistical toolbox that is also freely available on GitHub (the R package "ClimBayes" and the code for the paper). Additionally, we discuss variability on different timescales and distinguish which processes are most relevant for short-term and which for long-term projections.


Our Toolbox

Our analysis combines three methods: (1) a simple climate model, (2) a statistical fit and (3) fluctuation analysis.

1) The Simple Climate Model

The first tool is a very simple climate model, a so-called energy balance model. It computes the global annual mean temperature based on the balance of incoming and outgoing radiation. In the model, the Earth consists of an ocean with two layers: one upper layer that stores and releases heat rather quickly, plus a deep layer where heat exchange happens only slowly.

Mathematically, the model can be written as a differential equation of just two lines. The equation has a deterministic and a stochastic part: the deterministic part generates the forced variability, while the stochastic part produces the random internal variability.
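To make this concrete, here is a minimal numerical sketch of such a stochastic two-box energy balance model, integrated with the Euler–Maruyama scheme. This is an illustration of the model class, not the authors' ClimBayes implementation; all parameter values below (heat capacities, feedback and exchange coefficients, noise amplitude) are assumed for demonstration.

```python
import numpy as np

def two_box_ebm(forcing, dt=1.0, c1=7.3, c2=91.0, lam=1.1, kappa=0.7,
                sigma=0.6, seed=0):
    """Integrate a stochastic two-box energy balance model (Euler-Maruyama).

    forcing : radiative forcing time series [W m^-2]
    c1, c2  : heat capacities of the upper / deep ocean layer
    lam     : climate feedback parameter (restores T towards equilibrium)
    kappa   : heat exchange coefficient between the two layers
    sigma   : amplitude of the stochastic forcing (internal variability)
    """
    rng = np.random.default_rng(seed)
    n = len(forcing)
    t1 = np.zeros(n)  # upper-layer temperature anomaly [K]
    t2 = np.zeros(n)  # deep-layer temperature anomaly [K]
    for i in range(1, n):
        # stochastic term: white noise, scaled for the Euler-Maruyama step
        noise = sigma * rng.standard_normal() / np.sqrt(dt)
        dT1 = (forcing[i - 1] - lam * t1[i - 1]
               - kappa * (t1[i - 1] - t2[i - 1]) + noise) / c1
        dT2 = kappa * (t1[i - 1] - t2[i - 1]) / c2
        t1[i] = t1[i - 1] + dt * dT1
        t2[i] = t2[i - 1] + dt * dT2
    return t1, t2

# Example: a linear, CO2-like forcing ramp over 150 years
forcing = np.linspace(0.0, 2.5, 150)
t_surface, t_deep = two_box_ebm(forcing)
```

The upper layer responds within years, while the large heat capacity of the deep layer makes it lag behind by decades; running the model without the noise term yields the forced variability alone.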

The figure below visualises an example of the model’s variability for the period of 1850–2000. The time evolutions of the different forcings serve as inputs to the model (panels a and b). As an output, the model gives separate estimates of the forced and internal variability. The forced variability (panel c) shows a clear upward trend over the past 100 years that is due to anthropogenic emissions. The smaller dips are caused by volcanic eruptions, as more light-reflecting aerosols in the atmosphere lead to short cooling periods. The best guess of the internal variability is not a single time series, but a set of random samples, i.e. possible realisations of how the variability might have evolved (panel d). Adding both estimates results in the total variability simulated by the model (panel e). Each of the three red/orange lines is an equally likely option of how the temperature might have evolved according to our simple model.

2) Fitting the simple model to data

We compare the simple model to simulations from more complex climate models. Our goal is to understand how well our simple approach can approximate the more complex ones. For the analysis, we need a statistical method to fit the simple model to other temperature data. That is our second tool: our "fitting method" optimises the parameters of the simple model such that it matches the data best.

For experts: We do this with a Markov Chain Monte Carlo algorithm, a tool from Bayesian statistics. It calculates the so-called posterior distribution of the model’s parameters conditioned on the temperature data.
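The idea behind such a fit can be sketched in a few lines. The following toy example is not the ClimBayes implementation: it fits only a single parameter (the feedback parameter `lam`) of a deterministic one-box model to synthetic "observations" using a random-walk Metropolis sampler, with a flat prior and a Gaussian likelihood; all names and values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def ebm_response(lam, forcing, c=8.0, dt=1.0):
    """Deterministic one-box energy balance model response."""
    t = np.zeros(len(forcing))
    for i in range(1, len(forcing)):
        t[i] = t[i - 1] + dt * (forcing[i - 1] - lam * t[i - 1]) / c
    return t

# Synthetic "observations": a known feedback parameter plus measurement noise
forcing = np.linspace(0.0, 3.0, 120)
true_lam, noise_sd = 1.2, 0.1
obs = ebm_response(true_lam, forcing) + noise_sd * rng.standard_normal(120)

def log_posterior(lam):
    if lam <= 0:                       # flat prior restricted to lam > 0
        return -np.inf
    resid = obs - ebm_response(lam, forcing)
    return -0.5 * np.sum(resid ** 2) / noise_sd ** 2   # Gaussian likelihood

# Random-walk Metropolis: propose a nearby lam, accept with MH probability
samples, lam = [], 0.5
lp = log_posterior(lam)
for _ in range(5000):
    prop = lam + 0.05 * rng.standard_normal()
    lp_prop = log_posterior(prop)
    if np.log(rng.random()) < lp_prop - lp:
        lam, lp = prop, lp_prop
    samples.append(lam)

posterior_mean = np.mean(samples[1000:])   # discard burn-in
```

The retained samples approximate the posterior distribution of the parameter conditioned on the data, so besides a best estimate we also get its uncertainty for free.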

To exemplify the fitting, we consider again the past 150 years (see figure below). The temperature data in this case are observations (in grey), i.e. measurements from weather stations. In the left panel, the simple model is not yet fitted to the data and it largely underestimates the warming trend. In the right panel, the model (blue, only forced variability shown for better visibility) was fitted to the observation data, and we see that it can better reproduce the rise in temperature.

3) Investigating fluctuations on timescales

Our final step is to compare the temporal fluctuations in the simple model to those in the complex models. For this, we use a method to investigate variability on different timescales. Here enters our third tool: spectral analysis. It takes a time series as an input and quantifies the fluctuations as a function of the timescale.

Spectral analysis works like a prism that decomposes white light into its many coloured components, each associated with a distinct wavelength. Similarly, spectra of temperature can reveal fluctuations over different "temporal lengths" such as days, months, years or decades.

By aggregating the fluctuations within certain intervals of timescales, we obtain estimates of the fluctuations on timescales of years, decades, several decades and centuries.

To introduce the concept, we show an example in the next plot: from two artificial time series (left panels) to the fluctuations (middle panels) and their ratio (right panel).

The time series in the top example has a higher fluctuation on the timescale of around one decade compared to its variability on several decades. The lower time series was constructed from the upper one, but has an additional slowly varying sinusoidal mode. As a result, it has a higher fluctuation on the timescale of several decades.

To compare the fluctuation values of the two time series, we can compute their ratio, shown in the plot on the right: the fluctuations of the top series divided by those of the bottom series. In this case, the ratio is above one for the shorter and below one for the longer timescale. That means that our spectral analysis successfully detected the slowly varying modulation in the bottom signal.
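A simplified variant of this experiment can be sketched with a plain periodogram (the paper uses more refined spectral estimators; series, bands and the slow mode's period below are illustrative assumptions). Because the two synthetic series here share identical noise, the decadal ratio sits at one rather than above it, but the suppression of the ratio on the multidecadal band still flags the added slow mode.

```python
import numpy as np

rng = np.random.default_rng(2)
n, dt = 1000, 1.0                        # 1000 "years" at annual resolution
top = rng.standard_normal(n)             # white-noise series
# bottom = top plus a slowly varying sinusoidal mode (period 50 years)
bottom = top + 0.8 * np.sin(2 * np.pi * np.arange(n) / 50.0)

def band_fluctuation(x, dt, tau_min, tau_max):
    """Mean periodogram power over timescales between tau_min and tau_max."""
    freqs = np.fft.rfftfreq(len(x), dt)[1:]            # drop zero frequency
    power = (np.abs(np.fft.rfft(x)) ** 2 / len(x))[1:]
    taus = 1.0 / freqs                                 # periods in years
    band = (taus >= tau_min) & (taus <= tau_max)
    return power[band].mean()

# Ratio of fluctuations (top / bottom) on a decadal and a multidecadal band
ratio_decadal = (band_fluctuation(top, dt, 5, 15)
                 / band_fluctuation(bottom, dt, 5, 15))
ratio_multidec = (band_fluctuation(top, dt, 40, 60)
                  / band_fluctuation(bottom, dt, 40, 60))
```

A ratio well below one on the 40–60-year band indicates that the bottom series fluctuates much more strongly there, exactly as the added 50-year sinusoid would suggest.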


How our simple model describes climate variability

We now have a toolbox ready that consists of three tools. First, a simple climate model that computes estimates of the forced and internal climate variability. Second, a method to fit the simple climate model to data from more complex climate models. Third, spectral analysis, which helps us to decompose fluctuations in our time series on different timescales. To our knowledge, we are the first to combine these tools in this way, so setting up this toolbox is already exciting. But what can we use these tools for? Let’s finally get to the main results of the paper:

We took this toolbox and applied it to simulated temperature data from the last millennium (850–1850). This is the longest period for which we have a large set of climate model simulations and reconstructions of the forcings, which are required as model input (excluding 1850–2020 was done for merely technical reasons).

During this period, climate variability was dominated by short cooling periods due to frequent heavy volcanic eruptions, as seen in the many spikes in the following example. We show a simulation of a complex climate model (grey) and our simple climate model (blue for the forced variability and red/orange for samples of the total variability). In the paper, we investigate several complex climate models and have one such plot for each of them.

To analyse the fluctuations, we apply the spectral analysis as explained above: We calculate the values for the different timescales for the simple model’s forced variability (blue curves above), the simple model’s forced + internal variability (red/orange curves above), as well as the complex model. Next, we divide the fluctuations of the simple model by those of the complex model and obtain the following ratios:

Each dot represents one experiment with one of the complex climate models. The dots correspond either to the ratio of the simple model’s forced variability and the complex climate model (blue dots) or the ratio of the simple model’s forced+internal variability and the complex climate model (orange dots). The bar shows the mean over the various dots.

We find that the difference between the forced (blue dots/bars) and the total variability (orange dots/bars) is largest on the shortest timescales. This implies that internal variability matters most on the shortest timescales, which has also been found in other studies.

Most importantly, we observe that the mean ratio of the forced+internal variability (orange bars) is close to one on all timescales. That implies that our simple model is doing well! The largest differences (biggest spread of points in the plot) occur at shorter timescales, where internal variability dominates over influences from forcings. This is not surprising, as our simple model has a very simplistic representation of internal variability – too simplistic to correctly approximate atmospheric processes. More surprisingly, for long-term fluctuations, our simple model provides a decent approximation. This is not to say we should stop running more complex climate models. Our simple model is restricted to global annual mean temperatures. For local effects and other climate parameters such as rainfall and wind, comprehensive models are irreplaceable.

To summarise, our paper presents a successful combination of tools from physics, statistics and time series analysis to investigate climate variability. Using these, we saw that, for global annual mean temperature during the last millennium, a simple climate model can describe the variability from more complex climate models with reasonable accuracy. The manuscript includes more aspects, such as a comparison of climate models with different levels of complexity. You can find it published in Chaos, and the code on GitHub (the R package "ClimBayes" and the code for the paper). Go check it out and thank you for reading!

Finally, gratitude goes to my coauthors Beatrice Ellerhoff, Robert Scheichl and Kira Rehfeld, the entire SPACY team as well as Beatrice, Jonathan and Jeff for very helpful comments on my first drafts.


Sources:

  • All images unless otherwise stated are by the author.
  • Temperature data for 1850–2000: C. P. Morice, J. J. Kennedy, N. A. Rayner, J. P. Winn, E. Hogan, R. E. Killick, R. J. H. Dunn, T. J. Osborn, P. D. Jones, and I. R. Simpson, "An updated assessment of near-surface temperature change from 1850: The HadCRUT5 data set," J. Geophys. Res.: Atmos. 126, e2019JD032361, https://doi.org/10.1029/2019JD032361 (2021).

  • Forcing data: G. A. Schmidt, J. H. Jungclaus, C. M. Ammann, E. Bard, P. Braconnot, T. J. Crowley, G. Delaygue, F. Joos, N. A. Krivova, R. Muscheler, B. L. Otto-Bliesner, J. Pongratz, D. T. Shindell, S. K. Solanki, F. Steinhilber, and L. E. A. Vieira, "Climate forcing reconstructions for use in PMIP simulations of the last millennium (v1.1)," Geosci. Model Dev. 5, 185–191 (2012).
  • Complex climate model data used here (HadCM3 model): A. P. Schurer, S. F. B. Tett, and G. C. Hegerl, "Small influence of solar variability on climate over the past millennium," Nat. Geosci. 7, 104–108 (2014).
