Examples
This page is a work in progress. It gives an overview of the documentation for selected models and environments used in the course.
- The Pacman Game
- Control models
- Week 1: The Pacman game
- Week 1: The Inventory-control game
- Week 2: Optimal planning in the Inventory-environment
- Week 2: Optimal planning with Pacman
- Week 3: Frozen lake and dynamic programming
- Week 3: Harmonic Oscillator
- Week 3: Pendulum with random actions
- Week 4: PID Control
- Week 8: Simple bandit
- Week 8: UCB bandit algorithm
- Week 9: Policy evaluation
- Week 9: Policy iteration
- Week 9: Value iteration
- Week 10: MC Control
- Week 10: TD-learning
- Week 10: MC value estimation
- Week 11: Sarsa
- Week 11: Q-learning
- Week 11: N-step Sarsa
- Week 11: Mountain-car with linear feature approximators