scheming

Concept

An observed behavior in which an AI system may take deceptive or manipulative actions to achieve its goals; often linked to reward hacking.

Mentioned in 1 video