Reinforcement Learning — TD(λ) Introduction(3)

Extending TD(λ) to the Q function with Sarsa(λ)

Jeremy Zhang
Towards Data Science
Sep 14, 2019 (6 min read)

--

In the previous posts, we learned the idea of TD(λ) with eligibility traces, which combines n-step TD methods across different values of n, and applied it to the random walk example. In this post, let's extend the idea of λ to a more general use case: instead of learning a state-value function, we will learn a Q function over state-action pairs. In this article, we will:

  1. Learn the idea of Sarsa(λ)
  2. Apply it to the mountain car example
