Discovering Language Model Behaviors with Model-Written Evaluations
A paper that investigates how the behaviors of large language models, including their political leanings and stated desires, change as the models scale.