LayerNorm

Concept

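Layer normalization as used in the Transformer: each token's feature vector is normalized to zero mean and unit variance, then rescaled by a learned gain and bias. The lecture implements LayerNorm in the pre-norm arrangement, where it is applied to the input of each sub-layer inside the residual branch, to stabilize the training of deep networks.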
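As a rough illustration of the idea, here is a minimal pure-Python sketch of LayerNorm over a single feature vector, followed by a pre-norm residual step. It is not the lecture's code: the names `layer_norm`, `pre_norm_block`, `gamma`, `beta`, and `eps`, and the toy doubling sub-layer, are assumptions made for this example.

```python
import math

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance, then scale and shift."""
    n = len(x)
    mean = sum(x) / n
    # Biased (population) variance over the feature dimension.
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]

def pre_norm_block(x, sublayer, gamma, beta):
    """Pre-norm residual step: x + sublayer(layer_norm(x))."""
    y = sublayer(layer_norm(x, gamma, beta))
    return [xi + yi for xi, yi in zip(x, y)]

# Toy usage: the sub-layer is a stand-in (hypothetical) for attention or an MLP.
x = [1.0, 2.0, 3.0, 4.0]
gamma = [1.0] * len(x)  # learned gain, here left at its usual initial value
beta = [0.0] * len(x)   # learned bias, here left at its usual initial value
normed = layer_norm(x, gamma, beta)
out = pre_norm_block(x, lambda h: [2.0 * v for v in h], gamma, beta)
```

In this arrangement the residual path carries `x` through unchanged and only the sub-layer's input is normalized. The original Transformer's post-norm variant instead normalizes after the residual addition.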

Mentioned in 1 video