RLHF

Concept. Mentioned in 1 video.

Reinforcement Learning from Human Feedback. Paul Christiano is identified as its inventor.