Hands-on Tutorials
Animations of Gradient Descent and Loss Landscapes of Neural Networks in Python
During my journey of learning various machine learning algorithms, I came across the loss landscapes of neural networks, with their mountainous terrains, ridges and valleys. These loss landscapes looked very different from the convex, smooth loss landscapes I had encountered with linear and logistic regression. In the following article, we will create loss landscapes of neural networks and animate gradient descent using the MNIST dataset.
The image above exemplifies the highly non-convex loss landscape of a neural network. A loss landscape is a visual representation of the values a cost function takes on for a given range of parameter values, given our training data. Since our objective is to visualize the costs in three dimensions, we need to choose two particular parameters to vary in our plots, while all other model parameters are kept fixed. It is worth mentioning, however, that there are more advanced techniques (e.g. dimensionality reduction, filter normalization) that can be used to approximate neural network loss landscapes in a low-dimensional parameter subspace.¹ Figure 1 displays a three-dimensional representation of the loss landscape of a VGG neural network with 56 layers created with such techniques. These techniques, however, are beyond the scope of this article.
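To make the two-parameter idea concrete, here is a minimal sketch (not the code we will build later in this article) of how such a surface can be computed: a tiny NumPy network's cross-entropy loss is evaluated on a grid spanned by two chosen weights, while all other weights stay fixed, and the result is plotted with matplotlib. A small synthetic dataset stands in for MNIST so the sketch is self-contained; all names and shapes are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Small synthetic stand-in for MNIST: 200 samples, 10 features, 2 classes.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# One-hidden-layer network with fixed random weights; only two entries vary.
W1 = rng.normal(size=(10, 8)) * 0.5
W2 = rng.normal(size=(8, 1)) * 0.5

def loss(w_a, w_b):
    """Binary cross-entropy with two chosen weights overridden."""
    W1_mod, W2_mod = W1.copy(), W2.copy()
    W1_mod[0, 0] = w_a            # first varied parameter
    W2_mod[0, 0] = w_b            # second varied parameter
    h = np.tanh(X @ W1_mod)       # hidden activations
    p = 1.0 / (1.0 + np.exp(-(h @ W2_mod).ravel()))
    eps = 1e-9                    # avoid log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Evaluate the loss on a grid spanned by the two parameters.
a_vals = np.linspace(-4, 4, 60)
b_vals = np.linspace(-4, 4, 60)
A, B = np.meshgrid(a_vals, b_vals)
Z = np.vectorize(loss)(A, B)

# Plot the resulting loss surface in three dimensions.
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection="3d")
ax.plot_surface(A, B, Z, cmap="viridis", alpha=0.9)
ax.set_xlabel("parameter 1")
ax.set_ylabel("parameter 2")
ax.set_zlabel("loss")
plt.show()
```

Note that such a plot shows only a two-dimensional slice through a parameter space with many more dimensions, which is exactly why the more advanced projection techniques cited above exist.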