Why ‘Learn To Forget’ in Recurrent Neural Networks

Illustrated with a simple example

Arun Jagota
Towards Data Science
7 min read · Jan 2, 2021


Photo by Nicole Rodriguez on Unsplash

Consider the following binary classification problem. The input is a binary sequence of arbitrary length. We want the output to be 1 if and only if a 1 occurred in the input, but not too recently. Specifically, at least one bit must be 1 and the last n bits must all be 0.
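To make the target concrete, here is a minimal sketch of the labeling rule in plain Python. The function name `target_label` and the choice to require n ≥ 1 are assumptions made for illustration; they are not part of the problem statement above.

```python
def target_label(bits, n):
    """Return 1 iff some bit is 1 and the last n bits are all 0.

    `bits` is a list of 0/1 integers; `n` is the length of the
    all-zero tail we require (assumed here to be at least 1).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # A 1 must have occurred somewhere in the input...
    seen_one = any(b == 1 for b in bits)
    # ...but not within the last n positions.
    tail_is_zero = all(b == 0 for b in bits[-n:])
    return int(seen_one and tail_is_zero)


print(target_label([1, 0, 0, 0], 3))     # a 1 occurred, last 3 bits are 0
print(target_label([0, 0, 0, 0], 3))     # no 1 ever occurred
print(target_label([1, 0, 1, 0, 0], 3))  # a 1 sits inside the last 3 bits
```

Note that a sequence of length n or shorter can never be labeled 1 under this rule: if its last n bits are all 0, then it contains no 1 at all.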


PhD, Computer Science, neural nets. 14+ years in industry: data science algos developer. 24+ patents issued. 50 academic pubs. Blogs on ML/data science topics.