Why Reinforcement Learning Doesn’t Need Bellman’s Equation

A re-assessment of the famous Bellman equation in Reinforcement Learning and MDP formulations

Published in Towards Data Science

9 min read · Feb 20, 2022

Richard Bellman developed dynamic programming and his famous recursive equation while working at the RAND Corporation in Santa Monica. Photo by Sung Shin on Unsplash

In academic circles, it’s often boilerplate to pair a Reinforcement Learning (RL) algorithm with a Markov Decision Process (MDP) formulation and the famed Bellman equation. At first sight, this makes sense, as…

Written by Wouter van Heeswijk, PhD