Reinforcement Learning from Human Feedback (RLHF)

Concept

A training technique for aligning AI models with human preferences: human feedback on model outputs is used to train a reward model, which then guides optimization of the model's policy.
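
A minimal sketch of the reward-model step described above, assuming toy response embeddings stand in for real model outputs; the names (RewardModel, preference_loss) are illustrative, not from any specific library:

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Scores a response embedding with a single scalar reward."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def preference_loss(rm: RewardModel,
                    chosen: torch.Tensor,
                    rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style loss: the human-preferred response should score higher."""
    return -torch.nn.functional.logsigmoid(rm(chosen) - rm(rejected)).mean()

# Toy training loop on random "embeddings" standing in for model responses.
torch.manual_seed(0)
rm = RewardModel(dim=16)
opt = torch.optim.Adam(rm.parameters(), lr=1e-2)
chosen = torch.randn(32, 16) + 0.5   # responses humans preferred
rejected = torch.randn(32, 16)       # responses humans rejected

for step in range(100):
    loss = preference_loss(rm, chosen, rejected)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The trained reward model then scores new responses; those scores (typically
# combined with a KL penalty against a reference model) guide the policy update,
# for example via PPO.
```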
