Alignment Tuning

Concept · Mentioned in 1 video

A tuning process intended to bring a model's behavior in line with human values such as helpfulness, honesty, and harmlessness.