RLAIF (Reinforcement Learning from AI Feedback)

Concept · Mentioned in 1 video

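A training method in which an AI model, rather than a human, verifies another model's outputs, and that feedback is used to improve them. It is considered distinct from RLHF (Reinforcement Learning from Human Feedback), where the feedback comes from human raters. The approach can potentially work when verifying an output is easier for the AI than generating it.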
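The idea that verification can be easier than generation can be illustrated with a toy sketch. Everything here is a hypothetical stand-in, not a real RLAIF pipeline: `generate_candidates` plays the model being trained (it guesses divisors at random), `ai_verifier` plays the AI feedback model (it checks a guess with a single modulo operation), and `build_preference_pairs` turns the verifier's scores into (chosen, rejected) pairs of the kind a preference-based training step might consume.

```python
import random
from typing import Callable, List, Tuple


def generate_candidates(prompt: int, n: int, rng: random.Random) -> List[int]:
    """Toy 'generator' (hypothetical): guess a nontrivial divisor of `prompt`.

    Guessing correctly is hard, so most candidates will be wrong.
    """
    return [rng.randint(2, prompt - 1) for _ in range(n)]


def ai_verifier(prompt: int, answer: int) -> float:
    """Toy 'AI feedback' model (hypothetical): score 1.0 if `answer` divides `prompt`.

    Checking a proposed divisor is one modulo operation, so verifying is
    much cheaper than generating a correct answer.
    """
    return 1.0 if prompt % answer == 0 else 0.0


def build_preference_pairs(
    prompt: int,
    candidates: List[int],
    score: Callable[[int, int], float],
) -> List[Tuple[int, int]]:
    """Pair every verifier-approved answer with every rejected one.

    Returns (chosen, rejected) tuples labeled by the AI verifier instead of
    by a human rater.
    """
    good = [c for c in candidates if score(prompt, c) > 0.5]
    bad = [c for c in candidates if score(prompt, c) <= 0.5]
    return [(g, b) for g in good for b in bad]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    prompt = 91  # 91 = 7 * 13
    candidates = generate_candidates(prompt, 20, rng)
    pairs = build_preference_pairs(prompt, candidates, ai_verifier)
    print(f"{len(pairs)} AI-labeled preference pairs from {len(candidates)} candidates")
```

In a real system the pairs would feed a reward model or a preference-optimization step, and the verifier would itself be a learned model that can make mistakes; neither is shown here.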