
Linear Algebra Essentials with Numpy (part 2)

Learn the Essential Linear Algebra skills for Data Science – Part 2/2

Photo by Markus Spiske temporausch.com from Pexels

A couple of days back, I published the first of the two parts in this linear algebra series for Data Science. It covers vectors through 6 key ideas and terms, so I would strongly advise reading that part first if you haven't already:

Linear Algebra Essentials with Numpy (part 1)

As the previous part covered the whys and whats of the story (e.g., why linear algebra matters), this part jumps straight to the point. The structure of the article is the same. Each of the topics is divided into 3 parts:

  1. Theoretical explanation
  2. Example (by hand calculations)
  3. Implementation in Python (with Numpy)

As for the topics, here's what I want to cover:

  1. Matrix Addition
  2. Scalar Multiplication
  3. Matrix Multiplication
  4. Transpose of a Matrix
  5. Identity Matrix
  6. Determinant
  7. Matrix Inverse

Yeah, I know, it will be a lot of work once again. There's nothing you can do about it – grab a cup of coffee (or scotch) and let's dive right in!


What’s a Matrix?

According to Wikipedia:

A matrix is a rectangular array of numbers or other mathematical objects for which operations such as addition and multiplication are defined.[1]

As a data scientist, you use matrices all the time, but you probably don't think of them that way (just yet). Any dataset you've used in the past can be thought of as a matrix – a rectangular array of numbers arranged in rows and columns.

You might be wondering how a matrix differs from a vector from a data scientist's perspective. Simply put, a vector is a single column (attribute) of your dataset, while a matrix is the collection of all columns.

Matrices are usually denoted with a bold capital letter.
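In general, a matrix A with m rows and n columns, where a_ij denotes the entry in row i and column j, can be written out like this:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$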

As with vectors, it’s not difficult to grasp the key concepts.

Let’s dive into some examples.


1. Matrix Addition

Matrix addition (or subtraction) is really similar to what you did with vectors earlier. The only difference is that there are multiple columns instead of just one.

The whole idea remains the same: you only need to add up the corresponding components. In the general formula, I've used a and b as placeholders, and you can see how each component is added up:
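Written out for the 2×2 case:

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{bmatrix}$$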

Although this is fairly simple to grasp, here's a simple example of adding two matrices:
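Using the same two matrices as in the Numpy snippet below:

$$\begin{bmatrix} 3 & 5 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 2 & -3 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 3+2 & 5-3 \\ 1+1 & 0+2 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 2 & 2 \end{bmatrix}$$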

Matrix addition is really simple to implement in Numpy. Once again, as with vectors, you can use the addition sign:

import numpy as np

A = np.matrix([
    [3, 5],
    [1, 0]
])
B = np.matrix([
    [2, -3],
    [1, 2]
])
print(A + B)

2. Scalar Multiplication

The concepts are more or less the same as with vectors. Every number in the matrix gets multiplied by some scalar n.

The formula is also very similar:
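Again shown for the 2×2 case:

$$n \cdot \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} n \cdot a_{11} & n \cdot a_{12} \\ n \cdot a_{21} & n \cdot a_{22} \end{bmatrix}$$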

For the example, I've chosen an arbitrary matrix and set the scalar n to 2:
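With the same matrix as in the snippet below:

$$2 \cdot \begin{bmatrix} 3 & 5 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 \cdot 3 & 2 \cdot 5 \\ 2 \cdot 1 & 2 \cdot 0 \end{bmatrix} = \begin{bmatrix} 6 & 10 \\ 2 & 0 \end{bmatrix}$$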

Implementation in Python:

A = np.matrix([
    [3, 5],
    [1, 0]
])
print(2 * A)

Everything is pretty much identical as with vectors, right?

Hold that thought.


3. Matrix Multiplication

Here comes a topic that I would say is slightly more complex to grasp than the others encountered so far. It isn't as hard as it might seem at first, but you'll need to solve a couple of examples to fully get the gist.

For the following examples in the matrix multiplication section, two matrices are declared:

  1. Matrix A – has dimensions of m by n (m rows, n columns)
  2. Matrix B – has dimensions of n by p (n rows, p columns)

Multiplying A and B will yield a new matrix with dimensions of m by p (m rows by p columns). In plain English, the resulting matrix has the number of rows of matrix A and the number of columns of matrix B.

You will probably need to read the last paragraph a couple of times before you understand it fully, and that's okay. To help you out, here's everything I've said so far condensed into a single line:

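$$A_{(m \times n)} \cdot B_{(n \times p)} = C_{(m \times p)}$$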

As you can see, the two n's in the middle need to match. If they are not equal, matrix multiplication cannot be performed, and most programming languages will throw an error about a dimension mismatch.

Okay, now that you understand the basic rule of matrix multiplication, you are ready for the general formula:
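Writing the result as C = A · B, each entry of C is a sum of products:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} \, b_{kj}, \qquad i = 1, \dots, m, \quad j = 1, \dots, p$$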

To state it in the most general way (please embed this in your brain):

Matrix multiplication is performed by taking dot products: the entry in row i and column j of the result is the dot product of row i of matrix A and column j of matrix B.

If you understand that sentence, you understand matrix multiplication. If not, let’s drive the point home with a simple example:
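Taking the same two matrices as in the Numpy snippet below:

$$\begin{bmatrix} 3 & 4 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 2 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 3 \cdot 2 + 4 \cdot 1 & 3 \cdot 2 + 4 \cdot 2 \\ 1 \cdot 2 + 0 \cdot 1 & 1 \cdot 2 + 0 \cdot 2 \end{bmatrix} = \begin{bmatrix} 10 & 14 \\ 2 & 2 \end{bmatrix}$$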

As with vectors, you can use the dot function to perform multiplication with Numpy:

A = np.matrix([
    [3, 4],
    [1, 0]
])
B = np.matrix([
    [2, 2],
    [1, 2]
])
print(A.dot(B))

Don't worry if this was hard to grasp after the first reading. Matrix multiplication was a hard concept for me to grasp too, but what really helped was doing it on paper by hand. There are tons of examples online.

If you don’t want to look for examples, make up your own, and then use Numpy for verification – like a boss.


4. Matrix Transpose

And now something simple, to rest your brain for a minute. But just for a minute.

Matrix transpose is one of those topics that sounds super fancy, particularly if you’re not a native English speaker and you don’t know what ‘Transpose’ means.

The idea is really simple – you only need to exchange the rows and columns of the matrix. The transpose operator is in most cases denoted with a capital letter T, which can be placed either before the matrix or as an exponent. Either way, here's the general formula:
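For a 2×2 matrix, with T written as an exponent:

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}^T = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{bmatrix}$$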

As you can see, the diagonal elements stay the same, while the off-diagonal elements switch positions.

Here’s a simple example with a 2×2 matrix:
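With the same matrix as in the snippet below:

$$\begin{bmatrix} 3 & 4 \\ 1 & 0 \end{bmatrix}^T = \begin{bmatrix} 3 & 1 \\ 4 & 0 \end{bmatrix}$$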

Implementation in Python really can’t be any simpler:

A = np.matrix([
    [3, 4],
    [1, 0]
])
print(A.T)

5. Identity Matrix

Just as with the transpose, the identity matrix is really simple to grasp. It is a matrix where:

  1. Every diagonal element is 1
  2. All the other elements are 0

And that's it! It's usually denoted with a capital letter I, with a subscript representing its size.

Here's what the identity matrix of size 3 looks like:
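$$I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$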

There won't be an example for the identity matrix (for now); I'll just show you how to create one in Python:

A = np.eye(3)
print(A)

Now back to harder stuff (kind of).


6. Determinant

According to Wikipedia:

The determinant is a scalar value that can be computed from the elements of a square matrix and encodes certain properties of the linear transformation described by the matrix. The determinant of a matrix A is denoted det(A), det A, or |A|. Geometrically, it can be viewed as the volume scaling factor of the linear transformation described by the matrix.[2]

To develop a more intuitive sense of what the determinant is and what it is used for, please refer to the video playlist linked in the conclusion section of the article.

The calculation process is simple for a 2×2 matrix, gets a little more difficult for 3×3 matrices, and shouldn't be done by hand for larger ones. I mean, you can if you want to, but why? The goal here is to develop intuition; computers were made to do the calculations.

Here's the general formula for calculating the determinant of a 2×2 matrix:
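Labeling the four entries a, b, c, and d:

$$\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$$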

And to drive the point home, here's the most basic example of a calculation by hand:
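Using the same matrix as in the snippet below:

$$\det \begin{bmatrix} 3 & 2 \\ 1 & 6 \end{bmatrix} = 3 \cdot 6 - 2 \cdot 1 = 16$$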

Implementation in Python:

A = np.matrix([
    [3, 2],
    [1, 6]
])
# np.linalg.det works in floating point, so expect a value very close to 16.0
print(np.linalg.det(A))

You are doing great. The last section follows, and then you are done!


7. Matrix Inverse

A square matrix is called invertible (or nonsingular) if multiplication of the original matrix by its inverse results in the identity matrix.

From that statement, you can conclude that not all matrices have inverses. For a matrix to be invertible, it has to satisfy the following conditions:

  • Must be square
  • The determinant cannot be 0

A matrix that isn't invertible is called a singular matrix. Logically, for a square matrix to be singular, its determinant must be equal to 0. Let's see why by exploring the general formula:
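For a 2×2 matrix with entries a, b, c, and d (the same labeling as in the determinant section):

$$A^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$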

As you can see, the matrix inverse is denoted by the -1 term in the superscript. The formula might already look familiar to you – there's the previously seen ad - bc term (the determinant). You can see here why the determinant cannot be 0 – division by 0 is undefined.

This term is then multiplied by a slightly rearranged version of the original matrix: the diagonal elements are swapped, and the off-diagonal elements are multiplied by negative one (-1).

Here's a simple example of calculating the inverse of a 2×2 matrix:
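With the matrix from the snippet below, the determinant is 4 · 4 - 3 · 5 = 1, so:

$$\begin{bmatrix} 4 & 3 \\ 5 & 4 \end{bmatrix}^{-1} = \frac{1}{1} \begin{bmatrix} 4 & -3 \\ -5 & 4 \end{bmatrix} = \begin{bmatrix} 4 & -3 \\ -5 & 4 \end{bmatrix}$$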

Implementation in Python:

A = np.matrix([
    [4, 3],
    [5, 4]
])
print(np.linalg.inv(A))

Now let's verify the claim stated earlier – that multiplying the original matrix by its inverse yields the identity matrix:
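Multiplying the matrix from the previous example by the inverse we just found:

$$\begin{bmatrix} 4 & 3 \\ 5 & 4 \end{bmatrix} \begin{bmatrix} 4 & -3 \\ -5 & 4 \end{bmatrix} = \begin{bmatrix} 4 \cdot 4 + 3 \cdot (-5) & 4 \cdot (-3) + 3 \cdot 4 \\ 5 \cdot 4 + 4 \cdot (-5) & 5 \cdot (-3) + 4 \cdot 4 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$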

As you can see from the calculation by hand, the statement holds true!

Implementation in Python:

print(A.dot(np.linalg.inv(A)))
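If you'd rather not eyeball the printout (floating-point arithmetic can leave tiny rounding errors in the result), a quick extra check with np.allclose confirms that the product matches the identity matrix:

print(np.allclose(A.dot(np.linalg.inv(A)), np.eye(2)))  # True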

Conclusion

Take a moment to congratulate yourself on making it to the end. I hope you’ve read the first part of the article, and if you did, thank you.

Maybe not all of the discussed terms seem directly applicable to data science just yet (from your perspective), but linear algebra is worth knowing in general – it's something that will probably come up in your upcoming data science interviews, so knowing the basics is a must.

Now relax, watch a movie, grab a couple of beers and let everything sink in. After a week or so I would advise exploring linear algebra further on your own, and of course, make sure to watch this playlist:

I can't stress enough how much it will help you in developing an intuitive approach to linear algebra.

Thanks for reading…


Loved the article? Become a Medium member to continue learning without limits. I’ll receive a portion of your membership fee if you use the following link, with no extra cost to you.

Join Medium with my referral link – Dario Radečić


References

[1] https://en.wikipedia.org/wiki/Matrix_(mathematics)

[2] https://en.wikipedia.org/wiki/Determinant

