Why Reinforcement Learning Doesn’t Need Bellman’s Equation

A re-assessment of the famous Bellman equation in Reinforcement Learning and MDP formulations

Wouter van Heeswijk, PhD
Towards Data Science
9 min read · Feb 20, 2022

Richard Bellman developed dynamic programming and his famous recursive equation while working at the RAND Corporation, based in Santa Monica. Photo by Sung Shin on Unsplash

In academic circles, it’s often boilerplate to pair a Reinforcement Learning (RL) algorithm with a Markov Decision Process (MDP) formulation and the famed Bellman equation. At first sight, this makes sense, as…

Assistant professor in Financial Engineering and Operations Research. Writing about reinforcement learning, optimization problems, and data science.