
Must-Read Book for Data Science

Boost your knowledge and career with this gem

Picture from Unsplash

Introduction

In this post, I review one of the best books you can add to your repertoire to broaden your skill set and boost your Data Science career.

As you already know (and as I have discussed in other posts), Data Science is an exciting, multidisciplinary field that draws on some of the most demanding sciences out there: from Mathematics and Statistics to Computer Science and Programming, and on to Machine Learning, Data Analysis, and Visualization.

Picture from Unsplash

The path of the data scientist crosses complex terrain, and new challenges and opportunities arise at every turn. That’s why it’s crucial to keep up with the times and keep your sword sharp. In practice, this means studying and learning constantly: keeping up with the new papers and libraries that emerge, and regularly reviewing the foundations.

I cannot emphasize this last point enough: fundamentals are the most important thing for any Data Scientist to master. All the knowledge that allows us to train and deploy models capable of predicting with incredible accuracy is built on them. As soon as there is a gap in those fundamentals, we risk failing miserably.

As the famous phrase says:

"A chain is only as strong as its weakest link".

That is why I firmly believe that we must focus on strengthening these fundamental links as much as possible.

Picture from Unsplash

And if there is one discipline that is the fundamental basis of absolutely all the components that make up data science, it is mathematics. And this will be the topic we will focus on today.

I know I will lose most of the readers at this point. Everybody hates mathematics and runs away from it at the slightest chance. But I must tell you that if you are going to pursue a career in Data Science, and you want to be successful in it, you definitely need a good base of mathematical knowledge.

Mathematics is the single foundation on which every component and discipline that makes up Data Science is built: Statistics and Probability, Computer Science, Machine Learning, Deep Learning… All of them are applied mathematics, and your level should be decent enough to understand where all these disciplines come from. Besides, the main reason Data Science and big data have exploded in recent years is the drastic increase in computing power or, in other words, the capability to perform long and complex calculations in less time.

Machine learning is a direct and obvious motivation as well as a clear example of applied mathematics.

With this in mind, today I bring you a review of a great book that has had a significant impact on my career as a data scientist and on my life, and which I cannot but recommend.

Mathematics for Machine Learning

Mathematics for Machine Learning: from Amazon

Heads up: This article contains affiliate links so that you can comfortably buy this book without any extra charge while contributing to the creation of more posts like this one.

This great book was written by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong and published on the 23rd of April 2020. It lays out everything you need to understand and successfully apply most Machine Learning and Deep Learning algorithms, optimization mechanisms, cost functions… in short, all you will use throughout your Data Science career.

As software and programming libraries improve, most Machine Learning practitioners are not aware of the low-level technical details of the algorithms. The danger is that a practitioner becomes unaware of the design decisions behind machine learning algorithms and, therefore, of their limits.

The book is meant to be a guidebook to the mathematics that forms the foundation of Machine Learning. The authors propose an understanding of Machine Learning as a science that is built up as follows:

Figure by the Author

And so, the book is divided into two parts:

Part 1: Mathematical Foundations

This part focuses on laying a solid mathematical foundation on which the Machine Learning concepts are built later on. The specific topics covered are:

Linear Algebra: In Machine Learning, data is represented as vectors, and linear algebra is the study of those vectors and of matrices.

Analytic Geometry: Its central theme is the study of similarity between vectors.

Matrix Decomposition: It covers operations on matrices that are extremely useful in Machine Learning, as they enable the Data Scientist to build an intuitive representation of the data and its transformations and to perform efficient learning.

Vector Calculus: It underpins the optimization techniques, such as Gradient Descent, that are used to find the parameters that maximize (or minimize) some performance measure.
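To make two of these ideas a bit more concrete, here is a minimal sketch in plain Python. It is not taken from the book: the function names, the toy objective, the learning rate, and the number of steps are all my own assumptions, chosen only for illustration. It measures the similarity between two vectors through the cosine of the angle between them, and it runs a basic Gradient Descent on a simple quadratic function.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Cosine of the angle between u and v.

    1 means same direction, 0 means orthogonal.
    Assumes neither vector is the zero vector.
    """
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def gradient_descent(grad, x0, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, starting from x0.

    learning_rate and steps are illustrative defaults, not tuned values.
    """
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x


def toy_gradient(x):
    """Gradient of the toy objective f(x) = (x[0] - 3)^2 + (x[1] + 1)^2."""
    return [2 * (x[0] - 3.0), 2 * (x[1] + 1.0)]


# Two data points represented as vectors; they point in the same direction.
similarity = cosine_similarity([1.0, 2.0], [2.0, 4.0])

# The toy objective has its minimum at (3, -1).
minimum = gradient_descent(toy_gradient, [0.0, 0.0])
```

Here `similarity` comes out at (approximately) 1, and `minimum` ends up very close to the point (3, -1). On real problems the gradient comes from the model's loss function, and the learning rate usually needs tuning.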

Part 2: Machine Learning

The second part focuses on the four pillars of Machine Learning, after first defining the three components of machine learning (data, models, and parameter estimation) mathematically. These four pillars are:

Linear Regression: The goal is to find the function that maps each input to a corresponding target value, which is typically a real number. The topics covered are model fitting by parameter estimation (linear regression) and by integrating over the parameters (Bayesian regression).

Dimensionality Reduction: Using principal component analysis, the goal is to find a lower-dimensional representation of the input data, which makes the analysis easier. It is important to note that these methods have no target values: dimensionality reduction belongs to the set of techniques known as Unsupervised Learning.

Density Estimation: The objective is to find a probability distribution that describes the input data. The focus is on Gaussian Mixture Models, and this pillar also belongs to Unsupervised Learning.

Classification: Like regression, classification belongs to Supervised Learning, and it is studied through the lens of Support Vector Machines. Unlike in regression, the target values are typically integers instead of real values.
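As a small illustration of the first pillar, the sketch below fits a straight line to a handful of points by ordinary least squares, which is the simplest case of model fitting by parameter estimation. It is my own example rather than code from the book: the function name `fit_line` and the toy data are assumptions made for the sake of the illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope and intercept for y = slope * x + intercept.

    Assumes xs contains at least two distinct values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data generated from y = 2x + 1, with no noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 4.0 + intercept  # predicted target for a new input x = 4
```

Because the toy data has no noise, the fit recovers a slope of 2 and an intercept of 1. With noisy data the same formulas return the line that minimizes the sum of squared errors.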

Conclusions and Final Words

If I had to sum up this book in one phrase, it would look something like:

"The best study investment of the past year."

I personally believe that the greatest value of this book is how well it links the mathematical concepts explained in the first part to the Machine Learning algorithms detailed in the second part. If you have ever struggled to understand concepts like Gradient Descent, you won’t need to worry about it anymore after studying Mathematics for Machine Learning.

I am not going to lie: this is a dense and detailed book. You will have to invest time and effort to go through it and deeply understand its topics. But I assure you that it will be worth it. Do not give up, take your time, and make sure that you internalize its lessons. It will definitely pay off greatly in your understanding and application of Machine Learning.

Once you have read and studied it for the first time, it is a great tool to keep at hand, and I encourage you to come back to it and refresh the related concepts every time you face a Machine Learning challenge. It will give you a great perspective on how to tackle blocking points and will definitely ease your path in the medium to long term.

As always, I hope that you have enjoyed the post and that you will give this amazing book a try. You can find it at the following link:

If you liked this post then you can take a look at my other posts on Data Science and Machine Learning here.

If you want to learn more about Machine Learning, Data Science, and Artificial Intelligence, follow me on Medium and stay tuned for my next posts.

