A Baby Robot’s Guide To Reinforcement Learning
Thompson Sampling using Conjugate Priors
Multi-Armed Bandits: Part 5b
18 min read · Mar 9, 2021
Recap
Baby Robot has entered a charging room containing 5 different power sockets. Each of these sockets returns a slightly different amount of charge. We want to get Baby Robot charged up in the minimum amount of…