Hands-on Tutorials
Animations of Gradient Descent and Loss Landscapes of Neural Networks in Python
During my journey of learning various machine learning algorithms, I came across the loss landscapes of neural networks, with their mountainous terrains, ridges and valleys. These loss landscapes looked very different from the convex, smooth loss landscapes I had encountered with linear and logistic regression. In the following article, we will create loss landscapes of neural networks and animate gradient descent using the MNIST dataset.
The image above exemplifies the highly non-convex loss landscape of a neural network. A loss landscape is a visual representation of the values a cost function takes on for a given range of parameter values, given our training data. Since our objective is to visualize the costs in three dimensions, we need to choose two particular parameters to vary in our plots, while all other model parameters are kept fixed. It is worth mentioning, however, that there are more advanced techniques (e.g. dimensionality reduction, filter normalization) that can be used to approximate neural network loss landscapes in a low-dimensional parameter subspace.¹ Figure 1 displays a three-dimensional representation of the loss landscape of a VGG neural network with 56 layers created with such techniques. These techniques, however, are beyond the scope of this article.
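To make the two-parameter idea concrete, here is a minimal sketch (not the code we will build later in this article) of how such a surface can be computed: a tiny NumPy network's cross-entropy loss is evaluated on a grid spanned by two chosen weights, while all other weights stay fixed, and the result is plotted with matplotlib. A small synthetic dataset stands in for MNIST so the sketch is self-contained; all names and shapes are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Small synthetic stand-in for MNIST: 200 samples, 10 features, 2 classes.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# One-hidden-layer network with fixed random weights; only two entries vary.
W1 = rng.normal(size=(10, 8)) * 0.5
W2 = rng.normal(size=(8, 1)) * 0.5

def loss(w_a, w_b):
    """Binary cross-entropy with two chosen weights overridden."""
    W1_mod, W2_mod = W1.copy(), W2.copy()
    W1_mod[0, 0] = w_a            # first varied parameter
    W2_mod[0, 0] = w_b            # second varied parameter
    h = np.tanh(X @ W1_mod)       # hidden activations
    p = 1.0 / (1.0 + np.exp(-(h @ W2_mod).ravel()))
    eps = 1e-9                    # avoid log(0)
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Evaluate the loss on a grid spanned by the two parameters.
a_vals = np.linspace(-4, 4, 60)
b_vals = np.linspace(-4, 4, 60)
A, B = np.meshgrid(a_vals, b_vals)
Z = np.vectorize(loss)(A, B)

# Plot the resulting loss surface in three dimensions.
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection="3d")
ax.plot_surface(A, B, Z, cmap="viridis", alpha=0.9)
ax.set_xlabel("parameter 1")
ax.set_ylabel("parameter 2")
ax.set_zlabel("loss")
plt.show()
```

Note that such a plot shows only a two-dimensional slice through a parameter space with many more dimensions, which is exactly why the more advanced projection techniques cited above exist.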