Attention Mechanism

Concept

An addition to encoder-decoder architectures that lets the decoder look back at all of the encoder's hidden states at each decoding step, weighting the input positions by relevance instead of relying on a single fixed-length context vector. This improves translation accuracy, especially on long sentences.
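
A minimal sketch of the idea for one decoder step, using NumPy. The scaled dot-product scoring function, array shapes, and toy numbers here are illustrative assumptions (the earliest attention variants used an additive score); the point is only that the decoder forms a weighted sum of encoder states.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, keys, values):
    """Scaled dot-product attention for one decoder step.

    query:  (d,)   current decoder state
    keys:   (T, d) encoder hidden states
    values: (T, d) encoder hidden states (often identical to keys)
    Returns the context vector (d,) and the attention weights (T,).
    """
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)   # relevance of each input position
    weights = softmax(scores)            # normalised so the weights sum to 1
    context = weights @ values           # weighted sum of encoder states
    return context, weights

# Toy example: 5 input positions, hidden size 4 (hypothetical values).
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 4))
decoder_state = rng.normal(size=(4,))
context, weights = attention(decoder_state, encoder_states, encoder_states)
print("attention weights:", np.round(weights, 3))  # where the decoder "looks back"
print("context vector:  ", np.round(context, 3))
```

The attention weights show which parts of the input the model attends to at this step; the context vector is then combined with the decoder state to predict the next output token.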

Mentioned in 2 videos