Mesa optimizers
Concept
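A trained model that is itself an optimizer, pursuing an internal objective (the mesa-objective) that may differ from the objective it was trained on. Often discussed as a source of AI deception: such a model could deceive its training process to achieve its own internal goals.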