Adam optimizer

Concept · Mentioned in 2 videos

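The optimization algorithm used for training in the lecture, where it is recommended as the default choice for training transformers. Adam (adaptive moment estimation) keeps running averages of each parameter's gradient and squared gradient and uses them to scale the update for that parameter.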
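To make the update rule concrete, here is a minimal sketch of a single Adam step for one scalar parameter, written in plain Python. It is not the lecture's code: the names `adam_step` and `minimize_quadratic` are hypothetical, the hyperparameter defaults are the commonly cited ones (`lr=1e-3`, `beta1=0.9`, `beta2=0.999`, `eps=1e-8`) rather than values taken from the lecture, and the toy objective `(x - 3)^2` is chosen purely for illustration.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (illustrative sketch).

    m and v are the running first and second moment estimates;
    t is the 1-based step count used for bias correction.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction: m and v start at zero, so early estimates are too small.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Step scaled by the inverse root of the second moment estimate.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def minimize_quadratic(steps=2000, lr=0.01):
    """Minimize the toy objective f(x) = (x - 3)^2 starting from x = 0."""
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (x - 3.0)  # derivative of (x - 3)^2
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x
```

In practice one would normally use a framework's built-in optimizer instead of hand-writing the update; the sketch is only meant to show what state Adam tracks per parameter and how the step size adapts to it.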