RLAIF

Concept

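Reinforcement Learning from AI Feedback: a post-training method in which an AI model, rather than human annotators, supplies the preference feedback used to fine-tune a model with reinforcement learning.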

Mentioned in 1 video