Megatron LM
Software / App
A large-scale deep learning training library developed by NVIDIA for training large transformer language models.
Mentioned in 4 videos
Videos Mentioning Megatron LM

State of the Art: Training 70B LLMs on 10,000 H100 clusters
Latent Space
An open-source framework from NVIDIA for training large language models, providing useful components for distributed training.
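The core distributed-training component Megatron-LM is known for is tensor (intra-layer) model parallelism, where a layer's weight matrix is sharded across GPUs. A minimal NumPy sketch of the column-parallel idea (illustrative only, not Megatron-LM's actual API):

```python
import numpy as np

# Illustrative sketch of column-parallel linear layers, the tensor-parallelism
# idea Megatron-LM implements (this is NOT Megatron-LM's API).
# The weight matrix is split by columns across "ranks"; each rank computes a
# partial output, and concatenating the partials (an all-gather in practice)
# reproduces the full matmul.

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))      # batch of activations
W = rng.standard_normal((8, 16))     # full weight matrix

ranks = 2
shards = np.split(W, ranks, axis=1)  # each "rank" holds an 8x8 column shard

# Each rank computes its partial output independently, with no communication.
partials = [x @ w for w in shards]

# All-gather step: concatenation recovers the unsharded result.
y_parallel = np.concatenate(partials, axis=1)
assert np.allclose(y_parallel, x @ W)
```

In the real library this split is done inside each transformer layer's attention and MLP blocks, with NCCL collectives replacing the concatenation.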
[Paper Club] Upcycling Large Language Models into Mixture of Experts
Latent Space
An open-source library on GitHub that accelerates LLM training and inference, including MoE models.

Deep Learning State of the Art (2020)
Lex Fridman
A large transformer language model from NVIDIA with 8.3 billion parameters, significantly larger than GPT-2, showing advancements in training scale.
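The 8.3B figure can be sanity-checked with the standard transformer parameter-count approximation, assuming the configuration reported in the Megatron-LM paper (72 layers, hidden size 3072, a GPT-2-style ~50k-token vocabulary):

```python
# Rough parameter count for the 8.3B Megatron model. Assumed configuration
# (from the Megatron-LM paper): 72 transformer layers, hidden size 3072,
# vocabulary of 50257 tokens. The usual approximation is
# ~12 * layers * hidden^2 for the transformer blocks, plus vocab * hidden
# for the embedding table.

layers, hidden, vocab = 72, 3072, 50257

block_params = 12 * layers * hidden ** 2   # attention + MLP weights
embed_params = vocab * hidden              # token embeddings
total = block_params + embed_params

print(f"{total / 1e9:.2f}B parameters")    # ~8.3B
```

For comparison, the largest GPT-2 has about 1.5B parameters, so this model is roughly 5.5x larger.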

Cursor CEO: Going Beyond Code, Superintelligent AI Agents, And Why Taste Still Matters
Y Combinator
A large-scale deep learning training library developed by NVIDIA for training large transformer language models.