Intuitive understanding of Eigenvectors: Key to PCA

Najeeb Qazi
Towards Data Science
4 min read · May 9, 2020


Truly understanding Principal Component Analysis (PCA) requires a clear understanding of the linear algebra behind it, especially Eigenvectors. There are many articles out there explaining PCA and its importance, but I found only a handful explaining the intuition behind Eigenvectors in the light of PCA.

This article aims to give a visual and intuitive understanding of Eigenvectors, such that it makes you better equipped to understand PCA. The steps of the PCA algorithm are not the main focus of this article.

First, I would like to start off by visualizing some basic concepts underlying linear algebra.

Linear Transformations

Matrices act as functions or ‘transformations’ on a linear space: applying a matrix to a vector moves that vector, while linearity is maintained (lines remain lines and the origin stays fixed). Refer to the example below.

In the figure below, we see that the vector a = [2 2] is transformed into the vector [4 2] after multiplying by a matrix B.

Fig 1 (left) shows vector ‘a’ before the transformation, and Fig 2 (right) shows vector ‘a’ after it is ‘transformed’ by matrix B
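As a concrete sketch in NumPy: the article never writes out B, so the shear matrix [[1, 1], [0, 1]] is assumed here, since it happens to map [2 2] to [4 2].

```python
import numpy as np

# Assumed transformation matrix; the article does not give B explicitly.
# The shear [[1, 1], [0, 1]] is one matrix that maps [2, 2] to [4, 2].
B = np.array([[1, 1],
              [0, 1]])
a = np.array([2, 2])

print(B @ a)  # [4 2]
```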

Change of Basis

In the standard coordinate system, unit vectors i and j are considered the ‘basis’ vectors. These standard basis vectors become a way of measuring our system.

But, what if we want to change the basis vectors since our data might look better in a different system?

For example, suppose we have some data points in a two-dimensional plane.

Scattered data in a two dimensional plane

This view tells us something about the data, but we can get a clearer view if we rotate the axes.

The new axes are centered at the mean of the data. This rotation makes it easier for us to measure the spread or the variance of the data. It also clearly shows that there are two different clusters of data present.

So, how do we go about changing basis? — Linear transformation.

Matrix multiplication is nothing but changing the basis of the current coordinate system to a new one defined by the matrix. The column vectors of the matrix give the positions of the new basis vectors.

Taking the earlier example of a linear transformation, where we multiplied vector a by matrix B, we are finding the position of vector a in the new coordinate system spanned by the basis vectors of matrix B. Refer to figures below.

Fig 1 (left) shows the vector a in the standard coordinate system. Fig 2 (right) shows vector a placed in the new coordinate system given by matrix B

If we want to ‘go back’ to the original coordinate system, we just have to multiply the ‘new vector’ by the inverse of the change-of-basis matrix B.

Therefore, multiplying the vector [4 2] by the inverse of B gives us back the vector [2 2].
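A quick sketch of this round trip, again assuming the same B as in the earlier sketch:

```python
import numpy as np

B = np.array([[1, 1],
              [0, 1]])          # assumed change-of-basis matrix from the earlier sketch
transformed = np.array([4, 2])  # vector a after the transformation

# Multiplying by the inverse of B undoes the change of basis.
print(np.linalg.inv(B) @ transformed)  # [2. 2.]
```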

Eigenvectors and Eigenvalues

How do Eigenvectors and Eigenvalues fit into all of this?

During a linear transformation, there may exist some vectors that remain on their original span and are only stretched or shrunk. In other words, their direction remains unchanged. Mathematically, this is expressed as:

Ax = λx, the expression for an eigenvector x given by a transformation A

λ is the eigenvalue associated with the eigenvector x, and the matrix A is the transformation applied to the vector x.

T(x) = λx, the eigenvector relation expressed as a transformation function

Geometrically speaking, we can visualize it in the following way:

Transformation on vector x results in stretching it by a factor of 2 (notice there is no change in direction or span)

Here, the transformation on vector x stretches it to twice its length, so the eigenvalue associated with this eigenvector is 2. A negative eigenvalue flips the vector, i.e. reverses its direction.
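A small sketch of this with NumPy; the matrix A below is an assumed example whose eigenvalues are 2 and 1:

```python
import numpy as np

# Assumed example matrix: it stretches vectors along the x-axis by a factor of 2.
A = np.array([[2, 0],
              [0, 1]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)  # [2. 1.]

# Verify A x = λ x for the first eigenvector: both sides print [2. 0.]
x = eigenvectors[:, 0]
print(A @ x, eigenvalues[0] * x)
```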

What makes them so useful?

A coordinate system given by eigenvectors is known as an eigenbasis. Expressed in an eigenbasis, the transformation becomes a diagonal matrix, since all it does is scale each basis vector by its eigenvalue.

Diagonal matrix with the n eigenvalues on its diagonal

Diagonal matrices make calculations really easy. Consider raising a matrix to the power of 100: for a non-diagonal matrix this is an arduous task, but for a diagonal matrix it amounts to raising each diagonal entry to the power of 100.

Diagonal matrix makes calculations easier
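As a sketch of why this matters, assume a non-diagonal example matrix A: writing A = P D P⁻¹, where D is the diagonal matrix of eigenvalues and the columns of P are the eigenvectors, lets us raise only the diagonal entries to the 100th power and then change basis back.

```python
import numpy as np

# Assumed non-diagonal example matrix (its eigenvalues are 3 and 1).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, P = np.linalg.eig(A)       # columns of P are the eigenvectors
D_power = np.diag(eigenvalues ** 100)   # raise each eigenvalue to the 100th power

# In the eigenbasis, A^100 is just D^100; change basis back with P and P^-1.
A_power = P @ D_power @ np.linalg.inv(P)

print(np.allclose(A_power, np.linalg.matrix_power(A, 100)))  # True
```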

In light of PCA

The goal of PCA is to minimize redundancy and maximize variance to better express the data. It does so by finding the eigenvectors associated with the covariance matrix of the data points. The data is then projected onto the new coordinate system spanned by these eigenvectors. To read further on PCA, check out References [1] and [2].
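A minimal sketch of that pipeline on synthetic data (the data and variable names below are illustrative, not from the article):

```python
import numpy as np

# Synthetic 2-D data: 200 points stretched along one direction.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0],
                                             [1.0, 1.0]])

# 1. Center the data at its mean.
centered = data - data.mean(axis=0)

# 2. Eigendecompose the covariance matrix (eigh, since it is symmetric).
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 3. Sort components by variance (eigenvalue) and project the data onto them.
order = np.argsort(eigenvalues)[::-1]
components = eigenvectors[:, order]
projected = centered @ components

print(eigenvalues[order])  # variance captured by each principal component
```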

I hope this served as an intuitive introduction to Eigenvectors and that it helps you better understand the PCA algorithm.

References-

All diagrams were made with https://www.desmos.com/

  1. PCA tutorial — https://arxiv.org/pdf/1404.1100.pdf
  2. http://www.math.union.edu/~jaureguj/PCA.pdf
  3. https://www.khanacademy.org/math/linear-algebra/
  4. Brilliant series on linear algebra by 3blue1brown — https://www.youtube.com/watch?v=fNk_zzaMoSs&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab&index=1

