Constitutional AI
An alignment technique in which a constitution, a written set of rules, guides the model's behavior. The constitution can be applied during training or supplied as a system prompt.
Videos Mentioning Constitutional AI

The Agent Reasoning Interface: Claude, ChatGPT Canvas, Tasks, Operator — with Karina Nguyen, OpenAI
Latent Space
A paper on methods for generating model completions that adhere to specific principles, relevant to designing and shaping model behavior.

Anthropic Head of Pretraining on Scaling Laws, Compute, and the Future of AI
Y Combinator
An alignment technique in which a constitution, a written set of rules, guides the model's behavior. The constitution can be applied during training or supplied as a system prompt.

Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit
Latent Space
A framework developed by Anthropic for training AI models, which Elicit implemented to build a better summarizer that stays faithful to the source text.

The Origin and Future of RLHF: the secret ingredient for ChatGPT - with Nathan Lambert
Latent Space
Anthropic's approach to alignment, where a second AI model evaluates a first model's outputs based on 'constitutional principles,' effectively modifying the RLHF setup with AI-provided critiques.

Why AI Agents Don't Work (yet) - with Kanjun Qiu of Imbue
Latent Space
An approach developed by Anthropic for training AI models using AI-generated feedback based on a set of principles.

⚡️Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect
Latent Space
A framework discussed in relation to Anthropic's safety efforts, where LLMs are trained to adhere to a set of principles, often through reward modeling.