Mamba
A state space model (SSM) architecture proposed as a sub-quadratic alternative to the Transformer for sequence modeling.
Videos Mentioning Mamba

The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka
Latent Space
Hypothetical example of a Transformer alternative that may show good performance at small scales but whose implications for larger models are unknown.
2024 in Post-Transformer Architectures: State Space Models, RWKV [Latent Space LIVE! @ NeurIPS 2024]
Latent Space
A prominent example of a State Space Model; it adds a selection mechanism by making the SSM parameters (Δ, B, C) data-dependent rather than fixed.
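The selection mechanism described above can be sketched as a simple sequential scan. This is a minimal, illustrative NumPy sketch (not Mamba's optimized parallel-scan implementation); the weight names `W_B`, `W_C`, `W_dt` are assumptions chosen for clarity.

```python
import numpy as np

def selective_ssm(x, A, W_B, W_C, W_dt):
    # x: (T, d) input sequence; A: (d, n) state-decay parameters.
    # In a selective SSM, B, C and the step size Δ are functions of the
    # input x_t, unlike earlier SSMs (e.g. S4) where they are fixed.
    T, d = x.shape
    n = A.shape[1]
    h = np.zeros((d, n))                          # hidden state per channel
    ys = []
    for t in range(T):
        xt = x[t]
        dt = np.log1p(np.exp(xt @ W_dt))[:, None]  # Δ_t > 0 via softplus
        B = (xt @ W_B)[None, :]                    # (1, n) input-dependent
        C = xt @ W_C                               # (n,)  input-dependent
        Abar = np.exp(dt * A)                      # discretized decay
        h = Abar * h + dt * B * xt[:, None]        # selective state update
        ys.append(h @ C)                           # readout, shape (d,)
    return np.stack(ys)                            # (T, d)
```

Because Δ, B, and C vary with the input, the model can choose at each step what to write into and read out of its state, which is the "selection" that distinguishes Mamba from earlier fixed-parameter SSMs.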
Best of 2024: Open Models [LS LIVE! at NeurIPS 2024]
Latent Space
Mistral AI has a research model, Codestral Mamba, built on the Mamba architecture.

Gemini 1.5 and The Biggest Night in AI
AI Explained
An alternative architecture to the Transformer, initially speculated by the speaker to be the basis for Gemini 1.5 Pro's long context capabilities.

A Comprehensive Overview of Large Language Models - Latent Space Paper Club
Latent Space
A new class of deep learning models being considered as an alternative to Transformers for sequential data.

Building an open AI company - with Ce and Vipul of Together AI
Latent Space
A state space model architecture that changed the perception of sub-quadratic architectures, highlighting efficiency beyond just long context.

Information Theory for Language Models: Jack Morris
Latent Space
A model described as a more efficient alternative to Transformers, representing the kind of 'cute new method' researchers often seek.