The Dying ReLU Problem, Clearly Explained
Keep your neural network alive by understanding the downsides of ReLU
Contents
(1) What is ReLU and what are its advantages?
(2) What’s the Dying ReLU problem?
(3) What causes the Dying ReLU problem?
(4) How to solve the Dying ReLU problem?
Activation functions are mathematical equations that define how the weighted sum of a neural node's inputs is transformed into an output, and they are a key part of an artificial neural network (ANN) architecture.
Activation functions add non-linearity to a neural network, allowing the network to learn complex patterns in the data. The choice of activation function has a significant impact on an ANN’s performance, and one of the most popular choices is the Rectified Linear Unit (ReLU).
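To make this concrete, here is a minimal sketch of a single node: it computes a weighted sum of its inputs and passes that sum through an activation function. The input, weight, and bias values are purely illustrative, and tanh stands in for whatever activation the network uses:

```python
import numpy as np

x = np.array([0.5, -1.2, 2.0])   # inputs to the node (illustrative values)
w = np.array([0.8, 0.3, -0.5])   # weights (illustrative values)
b = 0.1                          # bias

z = np.dot(w, x) + b             # weighted sum of the inputs
output = np.tanh(z)              # activation function adds non-linearity
```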
What is ReLU, and what are its advantages?
The Rectified Linear Unit (ReLU) activation function can be described as:
f(x) = max(0, x)
What it does is:
(i) For negative input values, the output is 0
(ii) For positive input values, the output is the input value itself (see the sketch below)
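As a quick sketch in NumPy (the input values below are made up for illustration), the two cases look like this:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: max(0, x)."""
    return np.maximum(0.0, x)

z = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(relu(z))  # [0.  0.  0.  0.5 3. ] -- negatives are clipped to 0, positives pass through
```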