Mamba
architecture
Common Themes
Videos Mentioning Mamba

The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka
Latent Space
Cited as an example of a Transformer alternative that may perform well at small scale but whose behavior at larger scale remains unproven.
![2024 in Post-Transformer Architectures: State Space Models, RWKV [Latent Space LIVE! @ NeurIPS 2024]](https://i.ytimg.com/vi/LPe6iC73lrc/maxresdefault.jpg)
2024 in Post-Transformer Architectures: State Space Models, RWKV [Latent Space LIVE! @ NeurIPS 2024]
Latent Space
A prominent State Space Model that adds a selection mechanism by making the state-space parameters (the Δ, B, and C matrices) input-dependent.
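The selection mechanism mentioned above can be sketched as a toy recurrence: each step computes B, C, and the step size Δ from the current input before discretizing and updating the state. This is a minimal illustrative sketch (all names and projection shapes are assumptions for the example; real Mamba uses learned projections per channel and a hardware-aware parallel scan):

```python
import numpy as np

def selective_ssm(x, A, W_B, W_C, W_dt):
    """Toy selective SSM scan (illustrative sketch, not the real Mamba kernel).

    x:    (T, d) input sequence
    A:    (n,) diagonal state matrix (entries should be negative for stability)
    W_B, W_C: (d, n) projections making B_t and C_t depend on the input
    W_dt: (d,) projection producing the per-step size Delta_t
    """
    T, d = x.shape
    n = A.shape[0]
    h = np.zeros((d, n))            # one state vector per channel
    ys = np.zeros((T, d))
    for t in range(T):
        B_t = x[t] @ W_B            # (n,) input-dependent B
        C_t = x[t] @ W_C            # (n,) input-dependent C
        dt = np.log1p(np.exp(x[t] @ W_dt))  # softplus keeps Delta_t > 0
        Abar = np.exp(dt * A)       # zero-order-hold discretization (diagonal A)
        h = Abar * h + (dt * B_t) * x[t][:, None]
        ys[t] = h @ C_t             # input-dependent readout
    return ys
```

Because B_t, C_t, and Δ_t are functions of x[t], the model can selectively retain or forget state depending on the input, which is the key difference from earlier SSMs with fixed parameters.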
![Best of 2024: Open Models [LS LIVE! at NeurIPS 2024]](https://i.ytimg.com/vi/jX1nuoTs2WU/maxresdefault.jpg)
Best of 2024: Open Models [LS LIVE! at NeurIPS 2024]
Latent Space
Mistral AI has a research model, Codestral Mamba, built on the Mamba architecture.

Gemini 1.5 and The Biggest Night in AI
AI Explained
An alternative architecture to the Transformer, which the speaker initially speculated might underlie Gemini 1.5 Pro's long-context capabilities.

A Comprehensive Overview of Large Language Models - Latent Space Paper Club
Latent Space
A new class of deep learning models being explored as alternatives to Transformers for modeling sequential data.

Building an open AI company - with Ce and Vipul of Together AI
Latent Space
A state space model architecture that changed perceptions of sub-quadratic architectures by demonstrating efficiency gains beyond just long context.

Information Theory for Language Models: Jack Morris
Latent Space
A model described as a more efficient alternative to Transformers, representing the kind of 'cute new method' researchers often seek.

Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 1: Overview, Tokenization
Stanford Online
A state space model or linear attention model that has become popular in recent years.

Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 4: Attention Alternatives
Stanford Online
A family of state space models derived from state space theory, with Mamba 2 framed as an elaboration of linear attention that adds a gating mechanism.
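The linear-attention view mentioned in the lecture description can be illustrated with a small recurrence: plain linear attention accumulates key-value outer products, and adding a data-dependent decay gate recovers an SSM-like update. This is a hypothetical toy sketch of that connection (function and variable names are assumptions; it is not the actual Mamba 2 / SSD implementation):

```python
import numpy as np

def gated_linear_attention(Q, K, V, g):
    """Toy gated linear attention recurrence (illustrative sketch).

    S_t = g_t * S_{t-1} + k_t v_t^T ;   y_t = S_t^T q_t

    With g_t == 1 this is plain (unnormalized) linear attention; a
    data-dependent gate g_t in (0, 1) plays the role of the SSM's
    discretized A matrix, decaying old state.
    """
    T, dk = Q.shape
    dv = V.shape[1]
    S = np.zeros((dk, dv))          # running key-value memory
    out = np.zeros((T, dv))
    for t in range(T):
        S = g[t] * S + np.outer(K[t], V[t])
        out[t] = S.T @ Q[t]
    return out
```

Setting `g` to all ones reproduces the cumulative-sum form of linear attention, y_t = Σ_{s≤t} (k_s·q_t) v_s, which makes the equivalence easy to check directly.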