Self-Attention
Concept
A variant of attention in which a model attends to different positions of a single sequence to compute a representation of that same sequence; it is central to the Transformer architecture.
Mentioned in 2 videos
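For intuition, here is a minimal sketch of scaled dot-product self-attention in NumPy. The function name, shapes, and projection matrices are illustrative assumptions, not taken from either video:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a single sequence.

    x: (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices (random here)
    Returns: (seq_len, d_k) position-wise representations of the same sequence.
    """
    q = x @ w_q  # queries: what each position is looking for
    k = x @ w_k  # keys: what each position offers
    v = x @ w_v  # values: the content that gets mixed together
    # Pairwise relevance between every position and every other position,
    # scaled by sqrt(d_k) to keep the softmax well-behaved.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Softmax over positions: each row becomes a distribution over the sequence.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mix of all positions' values.
    return weights @ v

# Tiny usage example: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Because the queries, keys, and values are all projections of the same input sequence, every position can draw on every other position when forming its representation, which is the property both videos describe.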
Videos Mentioning Self-Attention

Deep Learning State of the Art (2019)
Lex Fridman
A mechanism that allows the encoder to selectively attend to other parts of the input sequence when forming hidden representations, improving the quality of the encoding.

Aravind Srinivas: Perplexity CEO on Future of AI, Search & the Internet | Lex Fridman Podcast #434
Lex Fridman