Transformer Architecture
8 video summaries
Videos About Transformer Architecture

Anthropic Head of Pretraining on Scaling Laws, Compute, and the Future of AI
Y Combinator
[Paper Club] Molmo + Pixmo + Whisper 3 Turbo - with Vibhu Sapra, Nathan Lambert, Amgadoz
Latent Space

Oriol Vinyals: Deep Learning and Artificial General Intelligence | Lex Fridman Podcast #306
Lex Fridman
[Paper Club] Upcycling Large Language Models into Mixture of Experts
Latent Space
[Paper Club] BERT: Bidirectional Encoder Representations from Transformers
Latent Space

How to train a Million Context LLM — with Mark Huang of Gradient.ai
Latent Space

Breaking down the OG GPT Paper by Alec Radford
Latent Space

A Comprehensive Overview of Large Language Models - Latent Space Paper Club
Latent Space