Reinforcement Learning — TD(λ) Introduction(3)

Extend TD(λ) on Q function with Sarsa(λ)

Published in Towards Data Science

6 min read · Sep 14, 2019

In the previous posts, we learned the idea of TD(λ) with eligibility traces, which combines n-step TD methods, and applied it to the random walk example. In this post, let's extend the idea of λ to a more general use case: instead of learning a state-value function, we will learn a Q function, that is, the value of each state-action pair. In this article, we will:

Learn the idea of Sarsa(λ)
Apply it to the mountain car example

Written by Jeremy Zhang