FineWeb

Concept

A filtered web-text dataset curated by Hugging Face, used as an example pretraining corpus (~44 TB).

Mentioned in 3 videos
