UCB algorithm

Concept

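Upper Confidence Bound (UCB) is a strategy for multi-armed bandit problems that balances exploration (trying less-sampled options) and exploitation (choosing the best known option). It selects the option whose estimated reward plus an uncertainty bonus is highest, so options that have been tried only a few times still get a chance.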
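As a rough illustration, here is a minimal sketch of UCB1, one common variant of the idea. The function names (`ucb1_select`, `run_bandit`), the Bernoulli reward model, and the exploration constant of 2 are assumptions made for this example and do not come from the source.

```python
import math
import random


def ucb1_select(counts, values, t):
    """Pick an arm index with the UCB1 rule (illustrative sketch).

    counts[a] is how often arm a was played, values[a] is its average
    reward so far, and t is the current round number (starting at 1).
    """
    # Play every arm once first so that no count is zero.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Score = average reward + a bonus that shrinks as an arm is sampled more.
    scores = [
        values[a] + math.sqrt(2 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=lambda a: scores[a])


def run_bandit(probs, steps, seed=0):
    """Simulate hypothetical Bernoulli arms with success probabilities probs."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)
    for t in range(1, steps + 1):
        arm = ucb1_select(counts, values, t)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running average reward.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


if __name__ == "__main__":
    counts, values = run_bandit([0.1, 0.9], steps=2000)
    print("plays per arm:", counts)
    print("estimated rewards:", [round(v, 2) for v in values])
```

In this sketch the bonus term is what drives exploration: an arm that has rarely been played gets a large bonus and is eventually retried, while a frequently played arm is judged almost entirely on its average reward.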
