RL: thoughts, content for self-teaching and more

Policy Gradient and REINFORCE

Slides

The policy search problem

Policy Gradient Theorem

Policy Gradient Algorithms

Being Actor-Critic

Videos

The policy search problem

Policy gradient derivation (part 1)

Policy gradient derivation (part 2)

Policy gradient derivation (part 3)

From policy gradient with baseline to Actor-Critic

Note about the videos: the videos are quite outdated; the slides have been substantially reorganized since the videos were recorded.

Labs

https://master-dac.isir.upmc.fr/rld/rl/05-reinforce.student.ipynb

Additional material