Reinforcement Learning – Mastering the Gradient Bandit Algorithm

gradient descent, multiarmed-bandit, reinforcement learning

I read about the Gradient Bandit Algorithm as a possible solution to the multi-armed bandit problem, but I didn’t understand it.
I would be grateful if anyone could point me to a video, blog post, book, lecture, etc. that explains it in baby steps.
Thanks

Best Answer

Here is a nice post that explains it step by step: https://www.datahubbs.com/multi-armed-bandits-reinforcement-learning-2/.

If you want to go deeper, I would also suggest reading Section 2.2 of the book by Cesa-Bianchi and Bubeck: https://arxiv.org/abs/1204.5721. It is a very good book, and Bubeck is one of the living masters of convex optimization applied to MAB.
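
In case a concrete starting point helps before reading those, here is a minimal sketch of the gradient bandit idea as it is usually presented: keep a numeric preference per arm, turn the preferences into probabilities with a softmax, and after each pull nudge the preferences up or down depending on whether the reward beat a running-average baseline. Everything specific here is an assumption for illustration and is not taken from the linked post or the book: the function names `softmax` and `gradient_bandit`, the Gaussian rewards with unit variance, the step size `alpha=0.1`, and the choice to include the current reward in the baseline.

```python
import math
import random


def softmax(prefs):
    """Turn preferences into action probabilities (shifted by max for stability)."""
    m = max(prefs)
    exps = [math.exp(h - m) for h in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def gradient_bandit(true_means, steps=2000, alpha=0.1, seed=0):
    """Run a gradient bandit on a toy problem and return the final action probabilities.

    true_means: hypothetical mean reward of each arm (unknown to the learner).
    """
    rng = random.Random(seed)
    k = len(true_means)
    prefs = [0.0] * k   # one preference per arm, all equal at the start
    baseline = 0.0      # running average of observed rewards

    for t in range(1, steps + 1):
        probs = softmax(prefs)
        # Sample an arm according to the current softmax probabilities.
        action = rng.choices(range(k), weights=probs)[0]
        # Assumed reward model: Gaussian noise (std 1) around the arm's mean.
        reward = rng.gauss(true_means[action], 1.0)

        # Incremental average; here it includes the current reward (an assumption).
        baseline += (reward - baseline) / t
        advantage = reward - baseline

        # Raise the chosen arm's preference if the reward beat the baseline,
        # and move the other arms' preferences in the opposite direction.
        for a in range(k):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += alpha * advantage * (indicator - probs[a])

    return softmax(prefs)


if __name__ == "__main__":
    print(gradient_bandit([0.0, 1.0, 3.0]))
```

Running the script prints one probability per arm. On a toy problem like this, with one arm clearly better than the others, most of the probability mass should end up on that arm. Changing `alpha` or removing the baseline is a quick way to see how each piece affects learning.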