
Intuitive Understanding of Randomized Singular Value Decomposition

A Python Implementation of SVD with Randomized Linear Algebra

Matrix decomposition is a powerful tool for many machine learning problems and has been widely used in data compression, dimensionality reduction, and sparsity learning, to name but a few. In many cases, when approximating a data matrix by a low-rank structure, Singular Value Decomposition (SVD) is the best choice. However, computing an accurate SVD of a large data matrix (e.g., an 8k-by-10k matrix) efficiently is challenging. To resolve the SVD in this situation, many algorithms have been developed by applying randomized linear algebra methods. One of the most important is randomized SVD, which is competitively efficient for decomposing any large matrix with a relatively low rank.

Figure 1: A timeline of major SVD developments. (The picture is from [2])

This post introduces the essential idea of randomized SVD. To help readers gain a better understanding, we also provide the corresponding Python implementation. In addition, the Jupyter notebook for this post can be found here.


Preliminary

SVD Formula

We start by recalling the concept of SVD. As you may already know, SVD is one of the most important decomposition formulas in linear algebra. For any given matrix A, the SVD takes the form

A = UΣV^T

where the columns of the matrices U and V are the left and right singular vectors, respectively, and the diagonal entries of Σ are the singular values.

A Small Matrix Example

Taking a 3-by-3 matrix as an example, we can compute its SVD using numpy.linalg.svd() in Python. Let us have a look:

import numpy as np

A = np.array([[1, 3, 2],
              [5, 3, 1],
              [3, 4, 5]])

# Economy-size SVD; note that np.linalg.svd() returns the right
# singular vectors as rows, i.e., v here holds V^T.
u, s, v = np.linalg.svd(A, full_matrices=False)

print('Left singular vectors:')
print(u)
print()
print('Singular values:')
print(s)
print()
print('Right singular vectors:')
print(v)
print()

In this case, the singular values are 9.3427, 3.2450, and 1.0885.

Left singular vectors:
[[-0.37421754  0.28475648 -0.88253894]
 [-0.56470638 -0.82485997 -0.02669705]
 [-0.7355732   0.48838486  0.46948087]]
Singular values:
[9.34265841 3.24497827 1.08850813]
Right singular vectors:
[[-0.57847229 -0.61642675 -0.53421706]
 [-0.73171177  0.10269066  0.67383419]
 [ 0.36051032 -0.78068732  0.51045041]]
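
As a quick sanity check, we can verify that these three factors reconstruct the matrix A (a small sketch reusing the u, s, and v computed above):

A_rebuilt = u @ np.diag(s) @ v    # v already holds V^T
print(np.allclose(A_rebuilt, A))  # expected output: True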

Randomized SVD

Essential Idea

Randomized SVD can be broken into three main steps. For any given m-by-n matrix A, if we impose a target rank k with k < min(m, n), then the first step as shown in Figure 2 is to

  • 1) generate a Gaussian random matrix Ω of size n-by-k,
  • 2) compute the m-by-k sketch matrix Y = AΩ,
  • and 3) apply QR decomposition to the matrix Y to obtain Y = QR.

Note that the first step returns the m-by-k matrix Q, whose columns are orthonormal.

Figure 2: The first step of randomized SVD. (The picture is from [2])

Then, the second step as shown in Figure 3 is to

  • 4) derive the k-by-n matrix B = Q^T A by multiplying the transpose of the matrix Q with the matrix A,
  • and 5) compute the SVD of the matrix B. Here, instead of computing the SVD of the original matrix A, we only need to work with the much smaller matrix B.

Because A ≈ QQ^T A = QB, the singular values (i.e., Σ) and right singular vectors (i.e., V) of the matrix B are also approximately the singular values and right singular vectors of the original matrix A. We therefore preserve the Σ and V computed from the matrix B in this step.

Figure 3: The second and third steps of randomized SVD. (The picture is from [2])

As shown in Figure 3, the third step combines the matrix Q derived in the first step with the left singular vectors of B: multiplying them together gives the left singular vectors (i.e., U) of the matrix A.

A Small Matrix Example

Even though we have covered the essential idea of randomized SVD above, it may not be entirely clear without an intuitive example. To this end, we revisit the small matrix from the SVD example above.

First, let us write a Python function for randomized SVD. Here, we will use two NumPy functions, i.e., np.linalg.qr() and np.linalg.svd().

import numpy as np

def rsvd(A, Omega):
    # Step 1: sketch A with the random matrix Omega and orthonormalize.
    Y = A @ Omega                # m-by-k sketch matrix
    Q, _ = np.linalg.qr(Y)       # m-by-k matrix with orthonormal columns
    # Step 2: project A onto the subspace spanned by Q and decompose.
    B = Q.T @ A                  # k-by-n matrix
    u_tilde, s, v = np.linalg.svd(B, full_matrices=False)
    # Step 3: recover the left singular vectors of A.
    u = Q @ u_tilde
    return u, s, v

Now, let us test it on the 3-by-3 matrix with a target rank of 2 (so that k < min(m, n)):

np.random.seed(1000)
A = np.array([[1, 3, 2],
              [5, 3, 1],
              [3, 4, 5]])
rank = 2
Omega = np.random.randn(A.shape[1], rank)
u, s, v = rsvd(A, Omega)
print('Left singular vectors:')
print(u)
print()
print('Singular values:')
print(s)
print()
print('Right singular vectors:')
print(v)
print()

The result of this randomized SVD example is:

Left singular vectors:
[[ 0.38070859  0.60505354]
 [ 0.56830191 -0.74963644]
 [ 0.72944767  0.26824507]]
Singular values:
[9.34224023 3.02039888]
Right singular vectors:
[[ 0.57915029  0.61707064  0.53273704]
 [-0.77420021  0.21163814  0.59650929]]

Recall that the singular values of this matrix are 9.3427, 3.2450, and 1.0885. In this case, randomized SVD has the first two singular values as 9.3422 and 3.0204.

We can see that the first singular value computed by these two algorithms is extremely close. However, the second singular value of randomized SVD has a slight bias. Is there any method to improve this result? And how?

The answer is yes!

Randomized SVD with Power Iteration

To improve the quality of randomized SVD, the power iteration method can be applied directly. Intuitively, power iteration replaces the sketch AΩ by (AA^T)^q AΩ; the matrix (AA^T)^q A has the same singular vectors as A but singular values raised to the power 2q + 1, and this faster spectral decay helps the sketch capture the dominant subspace more accurately. For more detail about power iteration, please see page 39 of [1]; a MATLAB implementation is also given on page 40.

In the following Python code, power_iteration() is the function that computes the m-by-k matrix Y iteratively (the default power_iter is 3) and then derives the m-by-k matrix Q by QR decomposition.

import numpy as np

def power_iteration(A, Omega, power_iter=3):
    # Build the sketch Y = (A A^T)^q A Omega; the matrix (A A^T)^q A has
    # the same singular vectors as A but a faster-decaying spectrum.
    Y = A @ Omega
    for q in range(power_iter):
        Y = A @ (A.T @ Y)
    # Orthonormalize the sketch to obtain Q.
    Q, _ = np.linalg.qr(Y)
    return Q

def rsvd(A, Omega):
    Q = power_iteration(A, Omega)
    B = Q.T @ A                  # project A onto the smaller subspace
    u_tilde, s, v = np.linalg.svd(B, full_matrices=False)
    u = Q @ u_tilde              # recover the left singular vectors of A
    return u, s, v

Let us test our new rsvd() function:

np.random.seed(1000)
A = np.array([[1, 3, 2],
              [5, 3, 1],
              [3, 4, 5]])
rank = 2
Omega = np.random.randn(A.shape[1], rank)
u, s, v = rsvd(A, Omega)
print('Left singular vectors:')
print(u)
print()
print('Singular values:')
print(s)
print()
print('Right singular vectors:')
print(v)
print()

The result is:

Left singular vectors:
[[ 0.37421757  0.28528579]
 [ 0.56470638 -0.82484381]
 [ 0.73557319  0.48810317]]
Singular values:
[9.34265841 3.24497775]
Right singular vectors:
[[ 0.57847229  0.61642675  0.53421706]
 [-0.73178429  0.10284774  0.67373147]]

Recall that:

  • Singular values of SVD are: 9.3427, 3.2450, and 1.0885.
  • Singular values of randomized SVD without power iteration are: 9.3422 and 3.0204.
  • Singular values of randomized SVD with power iteration are: 9.3427 and 3.2450.

As you can see, the randomized SVD with power iteration provides extremely accurate singular values.
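
To quantify this improvement, we can also compare the rank-2 reconstruction error of randomized SVD against the optimal rank-2 truncated SVD. Below is a small sketch (reusing the rsvd() function with power iteration defined above; the variable names here are chosen for illustration):

import numpy as np

np.random.seed(1000)
A = np.array([[1, 3, 2],
              [5, 3, 1],
              [3, 4, 5]], dtype=float)
rank = 2
Omega = np.random.randn(A.shape[1], rank)

# Optimal rank-2 approximation from the full SVD (Eckart-Young).
u, s, v = np.linalg.svd(A, full_matrices=False)
A_hat_opt = u[:, :rank] @ np.diag(s[:rank]) @ v[:rank, :]

# Rank-2 approximation from randomized SVD with power iteration.
u_r, s_r, v_r = rsvd(A, Omega)
A_hat_rsvd = u_r @ np.diag(s_r) @ v_r

print('Truncated SVD error: ', np.linalg.norm(A - A_hat_opt, 'fro'))
print('Randomized SVD error:', np.linalg.norm(A - A_hat_rsvd, 'fro'))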

Image Compression

As mentioned above, it is possible to compress a (low-rank) signal matrix using SVD or randomized SVD. In fact, the way to compress an image using SVD is rather simple: take the SVD of the image directly and keep only the dominant singular values and left/right singular vectors. In terms of randomized SVD, we can predefine the number of dominant singular values first, and then obtain the singular values and left/right singular vectors by randomized SVD.

For our evaluation, we choose the color image of Lena as our data. The size of this image is 256×256×3. Here, we build a matrix A of size 256×256 by selecting only the green channel. We compare two approaches, both sketched below:

  • Using SVD directly
  • Using randomized SVD instead
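
A minimal sketch of both approaches follows. Since the image file itself is not included in this post, the placeholder array A below stands in for the 256×256 green channel (an assumption for illustration); rsvd() is the power-iteration version defined above:

import numpy as np

# Placeholder for the 256-by-256 green channel of the Lena image;
# replace this with the actual image data.
A = np.random.randn(256, 256)
rank = 50  # number of dominant singular values to keep

# 1) Using SVD directly: compute the full SVD, then truncate.
u, s, v = np.linalg.svd(A, full_matrices=False)
A_hat_svd = u[:, :rank] @ np.diag(s[:rank]) @ v[:rank, :]

# 2) Using randomized SVD instead: the target rank is fixed in advance
#    by the width of the random matrix Omega.
np.random.seed(1000)
Omega = np.random.randn(A.shape[1], rank)
u_r, s_r, v_r = rsvd(A, Omega)
A_hat_rsvd = u_r @ np.diag(s_r) @ v_r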

We can see from this image compression experiment that:

  • 1) Compared to the SVD, the randomized SVD can also produce an accurate compression with a prescribed low rank (here, we set rank = 50).
  • 2) The randomized SVD is computationally friendly. With rank = 50, the total CPU time of the randomized SVD is about 11.6 ms, while the total CPU time of the SVD is 31.5 ms.
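
For reference, such timings can be measured along the following lines (a rough sketch reusing A, Omega, and rsvd() from the snippet above; absolute numbers depend on hardware and will differ from those reported here):

import time

t0 = time.perf_counter()
np.linalg.svd(A, full_matrices=False)   # full SVD
t1 = time.perf_counter()
rsvd(A, Omega)                          # randomized SVD with rank = 50
t2 = time.perf_counter()

print('SVD:  %.1f ms' % (1000 * (t1 - t0)))
print('rSVD: %.1f ms' % (1000 * (t2 - t1)))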

Summary

In this post, you discovered the randomized linear algebra method for SVD. Specifically, you learned:

  • The essential idea of randomized SVD.
  • How to implement randomized SVD in Python.

Do you have any questions?

References

[1] Steven L. Brunton, J. Nathan Kutz (2019). Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Pages 37–41.

[2] N. Benjamin Erichson, Sergey Voronin, Steven L. Brunton, J. Nathan Kutz (2016). Randomized Matrix Decompositions Using R. arXiv:1608.02148.

