Policy gradient derivation (part 1)
Policy gradient derivation (part 2)
Policy gradient derivation (part 3)
From policy gradient with baseline to Actor-Critic
Note about the videos: The videos are quite outdated; the slides have been substantially reorganized since the videos were recorded.
https://master-dac.isir.upmc.fr/rld/rl/05-reinforce.student.ipynb