AdamW

Tool / Product · Mentioned in 2 videos

Optimizer chosen for training: AdamW (the Adam variant with decoupled weight decay), with betas and epsilon set following GPT-3 guidance.
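
To make the role of the betas and epsilon concrete, here is a minimal pure-Python sketch of a single AdamW update on one scalar parameter. The entry does not list the actual values, so the defaults below are assumptions: beta1 = 0.9, beta2 = 0.95 and eps = 1e-8 are the figures reported in the GPT-3 paper, while `lr` and `weight_decay` are illustrative placeholders. The function name `adamw_step` is hypothetical and is not taken from the videos.

```python
import math


def adamw_step(param, grad, m, v, t,
               lr=3e-4, beta1=0.9, beta2=0.95, eps=1e-8, weight_decay=0.1):
    """One AdamW update for a single scalar parameter.

    t is the 1-based step count. The beta/eps defaults are assumed from the
    GPT-3 paper; lr and weight_decay are placeholders for illustration.
    """
    # Decoupled weight decay: shrink the parameter directly instead of
    # adding an L2 term to the gradient.
    param = param * (1.0 - lr * weight_decay)

    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad

    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)

    # eps keeps the division stable when v_hat is close to zero.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
x, m, v = 1.0, 0.0, 0.0
for step in range(1, 6):
    x, m, v = adamw_step(x, 2.0 * x, m, v, step)
```

A lower beta2 such as 0.95 makes the squared-gradient average react faster to changes in gradient scale than the common default of 0.999. In practice a framework optimizer would normally be used instead of a hand-written step.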