DistilBERT
Software / App
A smaller, faster, and lighter version of BERT, developed by Hugging Face, used in a practical demonstration.
Mentioned in 2 videos
Videos Mentioning DistilBERT
![[Paper Club] BERT: Bidirectional Encoder Representations from Transformers](https://i.ytimg.com/vi/V64q3p7DNjc/maxresdefault.jpg)
[Paper Club] BERT: Bidirectional Encoder Representations from Transformers
Latent Space
A smaller, faster, and lighter version of BERT, developed by Hugging Face, used in a practical demonstration.

Stanford CME296 Diffusion & Large Vision Models | Spring 2026 | Lecture 6 - Model Training
Stanford Online
A distilled version of BERT with approximately 40% fewer parameters and roughly 60% faster inference, while retaining 97% of BERT's language-understanding performance; used as an example of successful distillation.