Mesa optimizers
Concept · Mentioned in 1 video
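A trained model that is itself an optimizer, pursuing an internal objective that may differ from the objective it was trained on. Discussed as a manifestation of AI deception: such a model could behave as intended during training while actually working toward its own internal goals.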