Breaking Down Richard Sutton’s Policy Gradient With PyTorch And Lunar Lander

Stepan Ulyanin
Towards Data Science
8 min read · Oct 16, 2019

Lunar Lander Artwork from https://70sscifiart.tumblr.com/post/157053475728/1969-nasa-concept-art-of-the-apollo-11-lunar

In the early 2000s, a few papers were published on policy gradient methods (in one form or another) in reinforcement learning. The most notable of them was “Policy Gradient Methods for Reinforcement Learning with Function Approximation” by Richard Sutton et al.
