Thoughts and Theory

The Dying ReLU Problem, Clearly Explained

Keep your neural network alive by understanding the downsides of ReLU

Kenneth Leung
Towards Data Science
6 min read · Mar 30, 2021


Contents

(1) What is ReLU, and what are its advantages?
(2) What is the Dying ReLU problem?
(3) What causes the Dying ReLU problem?
(4) How to solve the Dying ReLU problem?

Activation functions are mathematical functions that define how the weighted sum of a neural node's inputs is transformed into an output, and they are a key part of an artificial neural network (ANN) architecture.

Activation functions add non-linearity to a neural network, allowing the network to learn complex patterns in the data. The choice of activation function has a significant impact on an ANN’s performance, and one of the most popular choices is the Rectified Linear Unit (ReLU).

What is ReLU, and what are its advantages?

The Rectified Linear Unit (ReLU) activation function can be described as:

f(x) = max(0, x)

In other words:
(i) For negative input values, the output is 0
(ii) For positive input values, the output is the input value itself
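
To make the two cases concrete, here is a minimal NumPy sketch of ReLU and its gradient (the sample input values are purely illustrative). Note that the gradient is 0 for every negative input, which is the behaviour at the heart of the dying ReLU problem discussed later:

```python
import numpy as np

def relu(x):
    # ReLU: element-wise max(0, x)
    return np.maximum(0, x)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 for negative inputs
    return (x > 0).astype(float)

x = np.array([-3.0, -0.5, 0.0, 2.0, 5.0])
print(relu(x))       # [0. 0. 0. 2. 5.]
print(relu_grad(x))  # [0. 0. 0. 1. 1.]
```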


