Reinforcement Learning from Human Feedback (RLHF)


A training technique for aligning AI models with human preferences: human feedback, typically rankings of model outputs, is used to train a reward model, and that reward model then guides optimization of the model's policy, commonly with a reinforcement learning algorithm such as PPO.
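The reward-model step can be illustrated with a minimal sketch. The example below is hypothetical: responses are toy feature vectors rather than real model outputs, and the reward model is a simple linear scorer trained with the Bradley-Terry pairwise loss commonly used for preference data.

```python
import numpy as np

# Hypothetical toy data: each "response" is a feature vector, and in
# every pair the human preferred response A over response B.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])      # hidden preference direction
A = rng.normal(size=(200, 3))
B = rng.normal(size=(200, 3))
# Keep only pairs consistent with the hidden preference.
mask = A @ true_w > B @ true_w
A, B = A[mask], B[mask]

# Reward model r(x) = w . x, trained to minimize the Bradley-Terry
# loss -log sigmoid(r(A) - r(B)) over preference pairs.
w = np.zeros(3)
lr = 0.1
for _ in range(500):
    diff = (A - B) @ w
    p = 1.0 / (1.0 + np.exp(-diff))      # P(A preferred over B)
    grad = ((p - 1.0)[:, None] * (A - B)).mean(axis=0)
    w -= lr * grad

# The learned reward should rank preferred responses higher.
acc = np.mean(A @ w > B @ w)
print(f"pairwise accuracy: {acc:.2f}")
```

In full RLHF the learned reward would then score candidate outputs during policy optimization; this sketch covers only the reward-modeling stage.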