A Simple Guide to Gradient Descent
Illustrating the algorithm in a multiple linear regression example in Python
Open almost any machine learning book and you will encounter the notorious gradient descent within the very first pages. While the idea behind the algorithm requires a bit of mathematical intuition, its applications are remarkably useful and versatile. This article equips you with all the hands-on knowledge you need.
Whether you dig deeper into deep learning (backpropagation) or are simply interested in how the coefficients of a linear regression (ordinary least squares) can be derived, gradient descent (GD) is an integral part of these methodologies and should not remain a black box to the user. This explanation aims to link a few simple mathematical expressions with the related code.
This article introduces a straightforward four-step algorithm for implementing gradient descent. All you need!
Why Explore Gradient Descent?
GD is probably easiest to grasp when the algorithm is placed directly in the context of linear regression, so I will use regression as the main reference throughout.
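Before walking through the four steps, a minimal sketch may help fix ideas: gradient descent applied to an ordinary least squares objective in NumPy. The function name, learning rate, and iteration count below are illustrative choices for this sketch, not the implementation developed later in the article.

```python
import numpy as np

# A minimal sketch of gradient descent for linear regression with a
# mean squared error loss. X is the feature matrix (first column of
# ones for the intercept), y the target vector, beta the coefficients.
def gradient_descent(X, y, learning_rate=0.1, n_iterations=2000):
    n_samples = len(y)
    beta = np.zeros(X.shape[1])                    # start from all-zero coefficients
    for _ in range(n_iterations):
        residuals = X @ beta - y                   # prediction error per sample
        gradient = (2 / n_samples) * X.T @ residuals  # gradient of the MSE
        beta -= learning_rate * gradient           # step against the gradient
    return beta

# Toy usage: recover the coefficients of y = 1 + 2x from noisy data.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
X = np.column_stack([np.ones_like(x), x])
y = 1 + 2 * x + rng.normal(scale=0.05, size=100)
print(gradient_descent(X, y))                      # approximately [1., 2.]
```

Each iteration does the same thing: compute predictions, measure how wrong they are, and nudge the coefficients in the direction that reduces the error. The rest of the article unpacks exactly where that update rule comes from.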