Breaking Down Richard Sutton’s Policy Gradient With PyTorch And Lunar Lander

Published in Towards Data Science, 8 min read, Oct 16, 2019

Lunar Lander Artwork from https://70sscifiart.tumblr.com/post/157053475728/1969-nasa-concept-art-of-the-apollo-11-lunar

In the early 2000s, a few papers were published on policy gradient methods (in one form or another) for reinforcement learning. The most notable of them was “Policy Gradient Methods for Reinforcement Learning with Function Approximation” by Richard Sutton et al.

Written by Stepan Ulyanin