Exercise 12: Eligibility traces and value-function approximations#
Note
The exercises material is divided into general information (found on this page) and the actual exercise instructions. You can download this weeks exercise instructions from here:
You are encouraged to prepare the homework problems 1 (indicated by a hand in the PDF file) at home and present your solution during the exercise session.
To get the newest version of the course material, please see Making sure your files are up to date
The main exercise today will be the tabular version of the \(\textrm{TD}(\lambda)\) algorithm described at http://incompleteideas.net/book/first/ebook/node77.html. The algorithm will be described in Todays lectures before the version which uses function approximators.
Solutions to selected exercises#
Problem 12.1: Tabular Sarsa(Lambda)
Problem 12.2: Tabular Sarsa(Lambda) and the open gridworld environment
Problem 12.3: Semi-gradient Sarsa(Lambda)