Journey to Tempered Stable Distribution #1

Understanding Fat-tailed Distribution

How fat is fat?

HW Roh
Towards Data Science
10 min read · Jun 17, 2020


Hi, this is Roh (ρ)

The series “Journey to Tempered Stable Distribution” is designed to help people understand one of the fat-tailed distributions: the tempered stable distribution. The purpose of this document is, therefore, to introduce and explain the concepts and tools required to understand the tempered stable distribution well enough to exploit it for one’s own purposes. I will not get into the nitty-gritty of each type of fat-tailed distribution; rather, I try to explain the related statistical and mathematical concepts and issues in an intuitive way, with some applications in finance. I hope there is a useful takeaway for readers from all backgrounds. Feel free to ask any question through the email at the very end of this document.

  • Part 0 : Why Tempered Stable (TS) Distribution?
  • Part 1 : What is a fat-tailed distribution?
  • Part 2 : Infinitely Divisible Distribution?

In part 1, we discuss what it means for a random variable to have a “fat-tail” distribution.

Far? Fat?

To understand the fat-tail, we need to answer the following two questions.

1. How far is far?
2. How fat is fat?

To talk about the tail, we first need to decide how far from the middle is far enough to call it a ‘tail’. In other words, where does the tail start? It depends! Unfortunately, there is no single answer.

Consider the normal distribution. Note that there are two tails: right and left. If, for example, we describe the ‘right’ tail as everything beyond one standard deviation from the mean, then the shaded part refers to the right tail of the normal distribution.

Figure. 1

Formally, we can describe the tail as follows:

  • right tail : P(X>x)
  • left tail : P(X≤-x)

for a large value of ‘x’. Now, we know the concept of the ‘tail’.

[R codes for Tail]
#For normal distribution with value 'x=a'
a=1
1-pnorm(a) #right tail
pnorm(-a) #left tail

Does every distribution have a tail?

Think about the uniform distribution over [0,1]. Does it have a tail? According to this SAS blog, not every distribution has a tail.

If you want “ the behavior of the tail” to describe the characteristics of the pdf when ‘x’ gets large, then bounded distributions do not have tails. Nevertheless, some features of tails can be quantified. In particular, by using limits and asymptotic behavior you can define the notion of heavy tails. SAS blog

I will explain exponentially bounded and unbounded distributions below. Please keep the uniform distribution in mind when you get there!

Why should we care about the ‘tail’ part of distribution?

The tail of the distribution has been the main concern of risk management. For example, the two most heavily used risk measures for the distribution of returns or losses are Value at Risk (VaR) and Expected Shortfall (ES).

Why loss not return?

  • Loss is literally minus (−) return.
  • Taking a limit to negative infinity is unintuitive, so we take the negative of the return values, i.e., flip the distribution about the y-axis.
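In R, the flip is literal negation. A quick simulated sketch (the mean and volatility below are arbitrary made-up numbers) shows that the upper tail of the loss distribution is exactly the mirrored lower tail of the return distribution:

```r
set.seed(1)
returns <- rnorm(1e5, mean = 0.001, sd = 0.02)  # hypothetical daily returns
losses <- -returns                              # loss is just minus return
# the upper 95% quantile of losses equals minus the lower 5% quantile of returns
as.numeric(quantile(losses, 0.95))
as.numeric(-quantile(returns, 0.05))
```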

Just notice how the quantities VaR and ES relate to the ‘tail’. You do not need to understand the math or the meaning behind them.

“Be aware that the graph below is a distribution of Loss, not Return!”

Figure. 2 // Source: Ch2, Quantitative Risk Management (hereafter, QRM) by McNeil et al

Think about the distribution of the loss, L (equivalently, the negative return), on some asset over a given holding period. For the sake of understanding, we assume that tomorrow’s loss, as a random variable, follows the normal distribution:

Then, we can calculate the VaR in the following way:

Source: Eric Zivot’s Lecture Notes

From the second line, we can easily check that VaR is just a quantity related to the tail. For more details about VaR, check chapter two of the book “Quantitative Risk Management: Concepts, Techniques and Tools” and Eric Zivot’s lecture notes on his website.

[R codes for VaR]
# mu and sigma are the mean and standard deviation of the loss distribution
alpha = 0.95 # significance level
VaR.alpha = qnorm(alpha, mu, sigma)
VaR.alpha = mu + sigma*qnorm(alpha, 0, 1) # equivalent form

Similarly, we can see that expected shortfall is a quantity related to the tail part of the distribution:

Source: Eric Zivot’s Lecture Notes

The fourth line says that “ES is the expected loss in the upper tail of the loss distribution.” As with VaR, in the case of the normal distribution the ES is convenient to calculate, since it is just the mean of a truncated normal distribution.

Source: Eric Zivot’s Lecture Notes
[R codes for ES]
alpha = 0.95
q.alpha.z = qnorm(alpha) # standard normal quantile
ES.alpha = mu + sigma*(dnorm(q.alpha.z)/(1-alpha))

If anyone is curious about why we divide by 1 − α: it is just a normalizing constant (or scaling factor) that makes the truncated loss distribution integrate to one, which is required for it to be a probability distribution.
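As a sanity check (a sketch with example values μ = 0, σ = 1), we can verify numerically that the closed-form ES equals the mean of the loss distribution truncated at the VaR and rescaled by 1/(1 − α):

```r
mu <- 0; sigma <- 1; alpha <- 0.95  # example parameters
VaR <- qnorm(alpha, mu, sigma)
# mean of the loss beyond VaR, rescaled so the truncated density integrates to one
es_numeric <- integrate(function(x) x * dnorm(x, mu, sigma),
                        lower = VaR, upper = Inf)$value / (1 - alpha)
es_formula <- mu + sigma * dnorm(qnorm(alpha)) / (1 - alpha)
c(es_numeric, es_formula)  # both are approximately 2.063
```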

Back to the story of the ‘tail’: I just wanted to emphasize that tail distributions are widely used as risk management tools.

How fat is fat? How heavy is heavy?

Since we have figured out what the ‘tail’ of a distribution is and where it is used, it is time to talk about the ‘fat’ part. We all know that the normal distribution does not have a fat tail. Instead, we were taught to use the Student-t distribution and the log-normal distribution when modelling financial return series, to take the ‘fat-tail’ property into account. But we need to know the definition of a fat tail. Unfortunately, there is no universal definition of the term ‘fat’.
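One quick way to feel the difference is to compare the sample excess kurtosis of normal and Student-t draws; heavier tails produce larger kurtosis. This is only a simulation sketch, and the sample size and degrees of freedom are arbitrary choices:

```r
set.seed(42)
x_norm <- rnorm(1e5)     # light-tailed
x_t <- rt(1e5, df = 5)   # fatter-tailed (theoretical excess kurtosis = 6)
excess_kurtosis <- function(x) mean((x - mean(x))^4) / sd(x)^4 - 3
excess_kurtosis(x_norm)  # close to 0
excess_kurtosis(x_t)     # clearly positive
```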

I will try to explain the fat tail in the languages of English, graphs, and math. I hope you enjoy at least one of the three.

In the language of English,

  • A heavy-tailed distribution has tails that are heavier than those of an exponential distribution (Bryson, 1974).
  • A distribution is said to have a heavy tail when the tail decays more slowly than that of the exponential distribution.

Why exponential?

It is convenient to use the exponential distribution as a reference. The pdf of the exponential distribution approaches zero ‘exponentially’ fast; the tails of many pdfs look like (but behave differently from) the exponential, which makes it a natural benchmark.
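To make this concrete, here is a small sketch comparing tail probabilities against the exponential’s: the normal’s ratio shrinks toward zero (lighter than exponential), while the Cauchy’s blows up (heavier):

```r
xs <- c(2, 4, 6, 8)
norm_tail   <- 1 - pnorm(xs)    # standard normal right tail P(X > x)
cauchy_tail <- 1 - pcauchy(xs)  # standard Cauchy right tail
exp_tail    <- exp(-xs)         # exponential(1) right tail
norm_tail / exp_tail    # decreasing toward 0: decays faster than exponential
cauchy_tail / exp_tail  # increasing without bound: decays more slowly
```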

In the language of graph,

I will show you four graphs of what happens in the far right tails of the following distributions:

  • Exponential distribution (exp)
  • Power-law distribution (PL)
  • Normal distribution (N)
  • Log-Normal distribution (LN)
  • Student-t distribution
  • Cauchy distribution
  • Levy distribution
  • Weibull distribution

I will not explain each of these distributions. Instead, let’s just enjoy the graphs to get a feel for what is going on in the tails. The first graph shows the part of the plot where ‘x’ lies in [0,5].

Figure. 5, R codes for this graph are provided at the end of the document

With figure 5 above, we cannot tell how the tails behave. But here are a few things worth mentioning:

  • The normal, Student-t, and Cauchy distributions are two-tailed; all the others are one-tailed.
  • PL(2.5) and PL(3.5) cross near x=1.7, which indicates that PL(2.5) has the thicker tail.

Let’s look at what happens when ‘x’ lies in [5,8]. Be aware that the values on the y-axis get much smaller.

Figure. 6

Q: What do you see in this graph?

A: The uppermost line would have the thickest tail! (But not quite!!!) You will see why.

But first, let’s examine the important features of figure 6 above.

  • The normal and exp(2) distributions are crawling near 0 when x=5. For the normal distribution in particular, the pdf value at 5 standard deviations is 0.00000149 (= dnorm(5)). This is around 8000 times smaller than that of the Cauchy distribution. In other words, 5-sigma events are 8000 times more likely under the Cauchy distribution than under the normal distribution.
  • In figure 6, note that the exp(0.2) distribution sits well above the log-normal and power-law distributions. Watch how this gets reversed in the following graphs as the range of ‘x’ is extended.
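The 8000× figure in the first bullet can be reproduced directly with base R’s density functions:

```r
dnorm(5)               # normal pdf at 5 sigma, about 1.49e-06
dcauchy(5)             # Cauchy pdf at 5, about 1.22e-02
dcauchy(5) / dnorm(5)  # roughly 8000 times larger
```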

Let’s see what happens when ‘x’ lies in [8,100]. Again, be aware that the values on the y-axis get much, much smaller.

Figure. 7
  • Note that the blue exp(0.2) line decays fast, crossing the other two lines, PL(2.5) and Cauchy. This is what is meant by “decays more slowly than the exponential distribution”.
  • It is surprising to see what happens as ‘x’ approaches 100: the pdf value of PL(1.5) is still 0.0005. No wonder the first and second moments (mean and variance) are infinite for PL(1.5). More details on this will be covered in the next document. Stay tuned!

Let’s zoom in on the y-axis to see the behavior in detail!

Figure. 8
  • Surprisingly, the blue exp(0.2) line crosses PL(3.5) and LN(0,1) on its way down. We can also see that LN(0,1) decays faster than PL(3.5), since it crosses PL(3.5) and goes under it.
  • PL(1.5), PL(2.5), and the Levy distribution are not even displayed in this graph.

In the language of Math,

‘Heavy’ vs ‘Fat’

The fat-tailed distributions are a subclass of the heavy-tailed distributions. That is, although every fat-tailed distribution is heavy-tailed, the reverse is not true (e.g., the Weibull). Jay Taylor’s lecture notes differentiate heavy and fat in the following way.

Definition of Heavy tail

  • A distribution is said to have a right heavy tail if its tail is “not” exponentially bounded, i.e.,

lim (x→∞) e^(λx) · P(X > x) = ∞ for all λ > 0 (Equation 1)

We can interpret this as follows: as ‘x’ gets large, the exponential factor grows faster than the probability in the heavy right tail decays, so the product blows up. Take time to think about it!

See how it connects to the English definition.

  • A probability distribution whose tail decays more slowly than an exponential is called right heavy-tailed.
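We can watch equation 1 diverge for a heavy-tailed example. This is a sketch using the standard Cauchy and an arbitrary λ = 0.1:

```r
lambda <- 0.1
xs <- c(10, 50, 100, 200)
# exp(lambda * x) * P(X > x): blows up for the Cauchy, so its tail is
# not exponentially bounded
exp(lambda * xs) * (1 - pcauchy(xs))
```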

When exponentially bounded?

If the right tail is not heavy, i.e., it decays very fast as ‘x’ goes to infinity, then the quantity in equation 1 converges to zero. The obvious example is the uniform distribution over [0,1], as we discussed above: once ‘x’ exceeds one, the probability of X being greater than ‘x’ becomes zero, so the tail is exponentially bounded. Another popular example is the normal distribution. Let X be a standard normal and draw a series of graphs for different lambda values to get

Figure. 3

We can see that it converges to zero, so the tails of the normal distribution are exponentially bounded.

[R codes for Figure. 3]
f_exp = function(x, lambda) {return(exp(lambda*x))}
cdf_normal = function(x) pnorm(x)
ccdf_normal = function(x) {1 - cdf_normal(x)} # right-tail probability P(X > x)
xs = seq(1, 10, length=10000)
plot(xs, f_exp(xs,0.1)*ccdf_normal(xs), type='l', xlab='', ylab='', col='blue', lwd=2)
abline(v=1, lty='dashed')
lines(xs, f_exp(xs,0.5)*ccdf_normal(xs), col='purple', lwd=2)
lines(xs, f_exp(xs,1)*ccdf_normal(xs), col='red', lwd=2)
lines(xs, f_exp(xs,1.5)*ccdf_normal(xs), col='orange', lwd=2)
lines(xs, f_exp(xs,2)*ccdf_normal(xs), col='darkred', lwd=2)
lines(xs, f_exp(xs,3)*ccdf_normal(xs), col='darkblue', lwd=2)
grid()
legend(8, 0.15,
       legend=c("0.1","0.5","1","1.5","2","3"), title="lambda",
       col=c("blue","purple","red","orange","darkred","darkblue"), lwd=2, cex=1)

Definition of fat tail

  • A distribution is said to have a right fat tail if there is a positive exponent alpha, called the tail index, such that

P(X > x) ~ x^(−α) as x → ∞

The ‘~’ means equal up to a constant; that is, the tail is proportional to a power law. Precisely, it means the following.


Feel free to skip if math is ‘heavy/fat’ for you.

Therefore, the tail of a fat-tailed distribution follows a power law (‘x’ to the power of minus alpha). For those not familiar with power laws, do not worry for now. Think of the graph when alpha equals two.
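To see what the tail index alpha does, here is a minimal sketch assuming a pure Pareto tail with x_min = 1: on a log-log scale the survival function P(X > x) = x^(−α) is a straight line with slope −α:

```r
alpha <- 2
surv <- function(x) x^(-alpha)  # Pareto survival function with x_min = 1
xs <- c(10, 100, 1000)
# slope of log-survival versus log-x: recovers -alpha exactly
diff(log(surv(xs))) / diff(log(xs))
```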

Figure. 4

Remind yourself that the tails look similar to a power law, as we have seen in figures 5–8 above. I will explain the power law in more detail in [Part 2] of this series.

Summary

We went over the concept of the ‘fat tail’ in this document intuitively, graphically, and mathematically. To understand the tempered stable distribution, it is necessary to have a fundamental understanding of the fat tail. I hope this document was helpful in improving your understanding. Please comment below if you have any questions. I hope you are curious about what is to come next. Next time, I will be back with “Journey to Tempered Stable Distribution [Part 2: Infinitely Divisible Distribution]”.

[R Codes of Figure. 5]
library(rmutil) # provides dlevy
f_exp = function(x, lambda, xmin) {lambda*exp(-lambda*(x-xmin))}
f_power = function(x, k, x_min) {
  C = (k-1)*x_min^(k-1) # normalizing constant of the power law
  return(C*x^(-k))
}
f_cauchy = function(x) dcauchy(x)
f_levy = function(x) dlevy(x)
f_weibull = function(x) dweibull(x, shape=1)
f_norm = function(x) dnorm(x)
f_lnorm = function(x) dlnorm(x)
f_t = function(x) dt(x, 5)
xs = seq(0.1, 100, length=1000)
plot(xs, f_exp(xs,0.2,0.1), type='l', xlab='', ylab='', col='blue', lwd=2,
     main='Distributions on [0,5]', cex.main=1,
     xlim=c(0,5),
     ylim=c(0,2.5))
lines(xs, f_exp(xs,1,0.1), col='purple', lwd=2)
lines(xs, f_exp(xs,2,0.1), col='bisque3', lwd=2)
lines(xs, f_power(xs,1.5,1), col='red', lwd=2)
lines(xs, f_power(xs,2.5,1), col='orange', lwd=2)
lines(xs, f_power(xs,3.5,1), col='darkred', lwd=2)
lines(xs, f_norm(xs), col='black', lwd=2)
lines(xs, f_lnorm(xs), col='darkgreen', lwd=2)
lines(xs, f_t(xs), col='deeppink', lwd=2)
lines(xs, f_cauchy(xs), col='darkblue', lwd=2)
lines(xs, f_levy(xs), col='azure4', lwd=2)
lines(xs, f_weibull(xs), col='springgreen', lwd=2)
abline(v=2, lty='dashed')
abline(v=3, lty='dashed')
grid()
legend(3.5, 2.5,
       legend=c("exp(0.2)","exp(1)","exp(2)","PL(1.5)","PL(2.5)","PL(3.5)","N(0,1)","LN(0,1)","student-t(5)","Cauchy","Levy","Weibull"),
       col=c("blue","purple","bisque3","red","orange","darkred","black","darkgreen","deeppink","darkblue","azure4","springgreen"), lwd=2, cex=0.8)

References:

[1] Jay Taylor, Heavy-tailed Distributions (2013), lecture notes

[2] Eric Zivot, Risk Measures (2013), lecture notes

[3] Aaron Clauset, Inference, Models and Simulation for Complex Systems (2011), lecture notes

[4] SAS blog: https://blogs.sas.com/content/iml/2014/10/13/fat-tailed-and-long-tailed-distributions.html

I have also added hyperlinks for all the references above. Please check the references for further details. I will update the references later if anything is missing.

Thank you for reading this document. Do not forget to share it with your friends if you find it useful.


Call me “Roh” like correlation coefficient ρ. DataScientist @Fintech company #CreditScoring #Peoplefund @corr_roh