Megatron LM
Software / App
A large-scale deep learning training library developed by NVIDIA for training large transformer language models.
Mentioned in 4 videos
Videos Mentioning Megatron LM

State of the Art: Training 70B LLMs on 10,000 H100 clusters
Latent Space
An open-source framework from NVIDIA for training large language models, providing useful components for distributed training.
[Paper Club] Upcycling Large Language Models into Mixture of Experts
Latent Space
An open-source library on GitHub that accelerates LLM training and inference, including MoE models.

Deep Learning State of the Art (2020)
Lex Fridman
A transformer language model from NVIDIA with 8.3 billion parameters, roughly 5.5 times the size of the largest GPT-2 (1.5 billion parameters), illustrating how quickly training scale was growing.

Cursor CEO: Going Beyond Code, Superintelligent AI Agents, And Why Taste Still Matters
Y Combinator
A large-scale deep learning training library developed by NVIDIA for training large transformer language models.