Bandit algorithm

Concept

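An algorithm for the multi-armed bandit problem, a reinforcement learning setting in which an agent repeatedly chooses among several options (arms) whose reward probabilities are unknown, aiming to maximize cumulative reward. The agent must balance exploring arms it knows little about against exploiting the arm that has paid best so far.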
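As a rough illustration of the idea, here is a minimal sketch of epsilon-greedy, one well-known bandit strategy (others, such as UCB1, exist). It is not taken from the video; the function name `epsilon_greedy`, its parameters, and the choice of Bernoulli (0 or 1) rewards are all assumptions made for this example.

```python
import random


def epsilon_greedy(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (hypothetical helper).

    true_probs: per-arm success probabilities, hidden from the agent.
    Returns (pull counts per arm, estimated value per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms    # how many times each arm has been pulled
    values = [0.0] * n_arms  # running mean reward observed for each arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            # (ties resolve to the lowest index).
            arm = max(range(n_arms), key=lambda a: values[a])

        # The environment pays 1 with the arm's hidden probability, else 0.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


if __name__ == "__main__":
    counts, values, total = epsilon_greedy([0.2, 0.5, 0.7])
    print("pulls per arm:", counts)
    print("estimated values:", [round(v, 3) for v in values])
    print("total reward:", total)
```

With a small `epsilon`, most pulls go to whichever arm currently looks best, while the occasional random pull keeps the estimates for the other arms from going stale. A larger `epsilon` learns the arms faster but spends more pulls on arms already known to be worse.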

Mentioned in 1 video

Videos Mentioning Bandit algorithm

Monte Carlo Tree Search - Computerphile

Computerphile
