Adam optimizer
Concept
The optimizer used for training in the lecture, where it is recommended as the default choice for training transformers.
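For readers who want to see what the optimizer does at each step, here is a minimal sketch of the standard Adam update rule in plain Python. The function name `adam_step`, the list-of-floats parameter layout, and the hyperparameter values are assumptions made for illustration; the card does not state the settings used in the lecture.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to flat lists of floats.

    Hypothetical helper for illustration; t is the 1-based step count.
    Returns the updated (params, m, v).
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and the squared gradient.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction compensates for the zero initialisation of m and v.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Per-parameter step, scaled by the running gradient magnitude.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


# Toy usage (assumed example): minimise f(x) = (x - 3)^2 starting from x = 0.
x, m, v = [0.0], [0.0], [0.0]
for t in range(1, 2001):
    grad = [2.0 * (x[0] - 3.0)]
    x, m, v = adam_step(x, grad, m, v, t, lr=0.05)
```

In practice a framework implementation would normally be used instead of hand-written code like this; the sketch only shows the moving averages and bias correction that distinguish Adam from plain gradient descent.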
Mentioned in 2 videos