C4
Software / App
C4 (the Colossal Clean Crawled Corpus) is a web-text dataset from which samples were rewritten in the Pratus paper to improve format and quality.
Mentioned in 2 videos
Videos Mentioning C4
Best of 2024: Synthetic Data / Smol Models, Loubna Ben Allal, HuggingFace [LS Live! @ NeurIPS 2024]
Latent Space
A dataset from which samples were rewritten in the Pratus paper to improve format and quality.

Building an open AI company - with Ce and Vipul of Together AI
Latent Space
A large dataset from Google, mentioned as an inspiration for the RedPajama dataset.