AdaGrad

Concept

An optimization algorithm that adaptively scales the learning rate for each parameter based on the historical sum of squared gradients.

Mentioned in 1 video