ZeRO
Concept
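ZeRO (Zero Redundancy Optimizer) is a memory optimization technique for distributed training that partitions optimizer states, gradients, and, at its most aggressive stage, model parameters across data-parallel workers instead of replicating them on every device.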
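To make the effect of partitioning concrete, here is a minimal sketch of the per-GPU memory needed for model states at each ZeRO stage. It assumes mixed-precision Adam: 2 bytes per parameter for fp16 weights, 2 bytes for fp16 gradients, and 12 bytes for optimizer state (an fp32 copy of the weights plus momentum and variance). The function name `zero_memory_bytes` and its arguments are hypothetical, chosen for illustration only, and the estimate ignores activations, temporary buffers, and fragmentation.

```python
def zero_memory_bytes(num_params, num_gpus, stage, optimizer_bytes_per_param=12):
    """Rough per-GPU memory (bytes) for model states under a given ZeRO stage.

    Assumption: fp16 parameters and gradients (2 bytes each) and Adam-style
    optimizer state of `optimizer_bytes_per_param` bytes per parameter.
    stage 0 = plain data parallelism (everything replicated).
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")

    params = 2 * num_params                      # fp16 weights
    grads = 2 * num_params                       # fp16 gradients
    optim = optimizer_bytes_per_param * num_params

    if stage >= 1:   # stage 1: partition optimizer states
        optim /= num_gpus
    if stage >= 2:   # stage 2: also partition gradients
        grads /= num_gpus
    if stage >= 3:   # stage 3: also partition parameters
        params /= num_gpus

    return params + grads + optim


if __name__ == "__main__":
    # Illustrative sizes only: a 7.5B-parameter model on 64 GPUs.
    for stage in range(4):
        gb = zero_memory_bytes(7_500_000_000, 64, stage) / 1e9
        print(f"stage {stage}: {gb:.1f} GB per GPU")
```

Under these assumptions, most of the saving at stage 1 comes from the optimizer state, since it is the largest of the three components; stages 2 and 3 then remove the remaining replicated gradients and parameters.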
Mentioned in 2 videos
Videos Mentioning ZeRO

Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 7: Parallelism
Stanford Online
ZeRO (Zero Redundancy Optimizer), a memory optimization technique for distributed training that partitions optimizer states and gradients.

Stanford CS25: Transformers United V6 I The Ultra-Scale Talk: Scaling Training to Thousands of GPUs
Stanford Online
Zero-Redundancy Optimizer, a memory optimization technique for distributed training that shards optimizer states, gradients, and parameters.