Discovering Language Model Behaviors with Model-Written Evaluations
Book
A paper that investigates how the behaviors of large language models, including their political leanings and stated desires, change as the models scale.
Mentioned in 1 video