Self-Attention

Concept

An attention mechanism in which a model relates different positions of a single sequence to one another in order to compute a representation of that sequence; it is central to the Transformer architecture.
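For illustration, here is a minimal NumPy sketch of single-head scaled dot-product self-attention, the form used in the Transformer. The projection matrices, dimensions, and variable names are illustrative assumptions, not taken from the videos.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention.

    X: (seq_len, d_model) input sequence.
    W_q, W_k, W_v: (d_model, d_k) projection matrices (illustrative).
    Returns: (seq_len, d_k) attended representations.
    """
    Q = X @ W_q          # queries, one per position
    K = X @ W_k          # keys, one per position
    V = X @ W_v          # values, one per position
    d_k = Q.shape[-1]
    # Every position scores every other position of the same sequence.
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)   # attention distribution per position
    return weights @ V                   # weighted sum of value vectors

# Toy usage: 4 tokens, model width 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 8)
```

Because the queries, keys, and values are all projected from the same input X, each output row is a mixture of information from every position in that one sequence, which is what distinguishes self-attention from attention over a separate source sequence.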

Mentioned in 2 videos