
RLHF (Reinforcement Learning from Human Feedback)

Concept · Mentioned in 1 video

A training method for aligning AI models with human preferences. Humans compare or rank pairs of model outputs, a reward model is trained to predict those preferences, and the model is then fine-tuned with reinforcement learning to maximize the learned reward. It requires collecting a significant amount of human preference labels.
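The reward-model step can be sketched with a toy example. The snippet below fits a linear reward model to hypothetical preference pairs using the Bradley-Terry loss (-log sigmoid of the score margin between the chosen and rejected response), which is a common choice for this step; the feature vectors and hyperparameters here are made up for illustration, not from any real RLHF pipeline.

```python
import math

# Hypothetical toy data: each pair is (features_chosen, features_rejected),
# where a human preferred the first response over the second.
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.8, 0.3], [0.1, 0.9]),
]

def reward(w, x):
    """Linear reward model: r(x) = w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(pairs, lr=0.5, steps=200):
    """Minimize the Bradley-Terry loss -log sigmoid(r(chosen) - r(rejected))
    over all preference pairs with plain gradient descent."""
    w = [0.0, 0.0]
    for _ in range(steps):
        for chosen, rejected in pairs:
            margin = reward(w, chosen) - reward(w, rejected)
            # Gradient of -log sigmoid(margin) w.r.t. margin.
            g = -(1.0 - 1.0 / (1.0 + math.exp(-margin)))
            for i in range(len(w)):
                w[i] -= lr * g * (chosen[i] - rejected[i])
    return w

w = train_reward_model(pairs)
# The trained model should score each chosen response above its rejected one.
assert all(reward(w, c) > reward(w, r) for c, r in pairs)
```

In a full RLHF pipeline this learned reward would then drive a reinforcement-learning fine-tuning stage on the model itself; this sketch covers only the reward-model fit.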