The Significance and Applications of the Covariance Matrix

One connection between linear algebra and various applications

Yitong Ren
Towards Data Science


“The beauty of math is that simple models can do great things.” There is no shortage of fancy algorithms and techniques in modern data science. Tools are easy to learn, but also easy to fall behind on; the mathematical underpinnings, however, pay off in the long run. The covariance matrix is one simple and useful concept that is widely applied in financial engineering, econometrics, and machine learning. Given its practicality, I decided to summarize some key points and examples from my notepad and put them together into a cohesive story.


Covariance measures how much two random variables vary together. When the data involve more than two random variables, or dimensions, a matrix is used to describe the relationships among them. Put simply, the covariance matrix summarizes the relationships across all dimensions as the pairwise covariances between every two random variables.
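
As a minimal illustration (the variables here are made up for demonstration), np.cov estimates a covariance matrix from samples; the diagonal holds each variable's variance and the off-diagonal entries hold the pairwise covariances.

import numpy as np

# three made-up variables with 500 observations each (rows = variables)
rng = np.random.default_rng(0)
x = rng.normal(0, 1, 500)
y = 0.8 * x + rng.normal(0, 0.5, 500)   # positively related to x
z = rng.normal(0, 2, 500)               # unrelated noise

data = np.vstack([x, y, z])
cov = np.cov(data)        # 3 x 3 covariance matrix
print(np.round(cov, 2))   # diagonal: variances; off-diagonal: covariances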

Use Case 1: Stochastic Modeling

The most important property of the covariance matrix is that it is positive semi-definite; when it is strictly positive definite, which is the typical case in practice, it admits a Cholesky decomposition.

In a nutshell, the Cholesky decomposition factors a positive definite matrix into the product of a lower triangular matrix and its transpose, Σ = LLᵀ. In practice, it is used to generate correlated random variables: multiply the lower triangular factor L obtained from the covariance matrix by a vector of independent standard normals, and the result has exactly the desired covariance. More generally, matrix decompositions are helpful in many settings, because characterizing a matrix through hidden factors uncovers properties that hold universally, and explicit matrix computations are often impractical without them.
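
To make that concrete, here is a minimal sketch (the volatilities and correlation are my own choices, matching the stock example further below) showing that multiplying the Cholesky factor by independent standard normals reproduces the target covariance:

import numpy as np

# made-up volatilities and correlation
sig = np.diag([0.25, 0.35])
cor = np.array([[1.0, 0.2],
                [0.2, 1.0]])
cov = sig @ cor @ sig                      # covariance = D * R * D

L = np.linalg.cholesky(cov)                # cov = L @ L.T
z = np.random.normal(0, 1, (2, 100_000))   # independent standard normals
samples = L @ z                            # columns now have covariance cov

print(np.round(np.cov(samples), 4))        # sample covariance is close to cov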

In financial engineering, Monte Carlo simulation plays a big role in pricing options whose payoff depends on a basket of underlying assets. Under the standard assumption that each stock price follows geometric Brownian motion, the discretized price update is S(t+Δt) = S(t)·exp((μ - σ²/2)Δt + σ·sqrt(Δt)·Z), where Z is a standard normal draw. Correlated stock paths are obtained by applying the Cholesky decomposition to the covariance matrix and using the resulting correlated draws in place of independent ones.

Below is a simple example I put together in Python for simulating correlated stock price paths using this approach.

import numpy as np
import matplotlib.pyplot as plt

mu_a, mu_b = 0.2, 0.3      # annual expected return for stock A and stock B
sig_a, sig_b = 0.25, 0.35  # annual expected volatility
s0_a, s0_b = 60, 55        # stock prices at t0
T = 1                      # simulate price evolution for the next year
delta_t = 0.001            # time step in years
steps = int(T / delta_t)   # number of simulation steps
rho = 0.2                  # correlation between stock A and stock B

cor_matrix = np.array([[1.0, rho],
                       [rho, 1.0]])
sd = np.diag([sig_a, sig_b])
cov_matrix = sd @ cor_matrix @ sd
L = np.linalg.cholesky(cov_matrix)  # Cholesky decomposition: cov_matrix = L @ L.T

path_a = [s0_a]
path_b = [s0_b]
st_a, st_b = s0_a, s0_b
for i in range(steps):
    # correlated normal draws; V already carries the volatilities sig_a and sig_b,
    # so the diffusion term below only scales by sqrt(delta_t)
    V = L.dot(np.random.normal(0, 1, 2))
    st_a = st_a * np.exp((mu_a - 0.5 * sig_a ** 2) * delta_t + np.sqrt(delta_t) * V[0])
    st_b = st_b * np.exp((mu_b - 0.5 * sig_b ** 2) * delta_t + np.sqrt(delta_t) * V[1])
    path_a.append(st_a)
    path_b.append(st_b)

plt.figure(figsize=(12, 6))
plt.plot(path_a, label='stock A', linewidth=2)
plt.plot(path_b, label='stock B', linewidth=2)
plt.legend()
plt.title('Correlated Stock Movement Using Monte Carlo Simulation')
plt.ylabel('stock price')
plt.xlabel('steps')
plt.show()

Use Case 2: Principal Component Analysis

PCA is an unsupervised, linear dimensionality-reduction algorithm that transforms the original variables into linear combinations of those variables that are uncorrelated with one another. It projects the entire dataset onto a new feature space and prioritizes the directions that explain the most variance in the data. Machine learning practitioners use PCA to reduce computational complexity by dropping low-variance dimensions, as well as to create better visualizations.
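
As a quick usage sketch, assuming scikit-learn is available (the toy data here are made up), reducing a dataset to its top two principal components looks like this:

import numpy as np
from sklearn.decomposition import PCA

# made-up dataset: 200 samples, 5 correlated features
rng = np.random.default_rng(1)
base = rng.normal(size=(200, 2))
data = base @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))

pca = PCA(n_components=2)             # keep the two highest-variance directions
reduced = pca.fit_transform(data)     # shape (200, 2)
print(pca.explained_variance_ratio_)  # fraction of variance captured by each component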

How does PCA connect to the covariance matrix? Eigendecomposition

Like Cholesky decomposition, eigendecomposition is an intuitive form of matrix factorization: it represents a matrix in terms of its eigenvectors and eigenvalues. An eigenvector is a vector that only changes by a scalar factor when a linear transformation is applied to it. If A is the matrix representing the linear transformation, v is an eigenvector, and λ is the corresponding eigenvalue, the relationship can be expressed as Av = λv. An n × n matrix can have up to n linearly independent eigenvectors. If we collect the eigenvectors as columns of a matrix V and the corresponding eigenvalues as entries of a diagonal matrix L, the equation extends to AV = VL. Because the covariance matrix is symmetric, its eigenvectors are orthogonal to each other, and they serve as the principal components of the new feature space. “By ranking your eigenvectors in order of their eigenvalues, highest to lowest, you get the principal components in order of significance.”
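
To tie the pieces together, here is a minimal from-scratch sketch (the dataset and variable names are my own) that performs PCA by eigendecomposing the covariance matrix of centered data and projecting onto the top components:

import numpy as np

# made-up dataset: 200 samples, 4 features (rows = samples)
rng = np.random.default_rng(2)
data = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 4)) + 0.05 * rng.normal(size=(200, 4))

centered = data - data.mean(axis=0)       # center each feature
cov = np.cov(centered, rowvar=False)      # 4 x 4 covariance matrix

eig_vals, eig_vecs = np.linalg.eigh(cov)  # symmetric matrix: eigenvectors are orthogonal
order = np.argsort(eig_vals)[::-1]        # rank eigenvectors by eigenvalue, highest to lowest
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

k = 2
projected = centered @ eig_vecs[:, :k]    # project onto the top k principal components
print(eig_vals / eig_vals.sum())          # variance explained by each component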

References:

  1. https://skymind.ai/wiki/eigenvector#code
  2. https://blog.csdn.net/thesnowboy_2/article/details/69564226
  3. https://datascienceplus.com/understanding-the-covariance-matrix/
