Nemotron
Software / App
An NVIDIA dataset whose documents were scored for educational value by a prompt-based model and supplemented with synthetic data, yielding about 6 trillion tokens.
Mentioned in 1 video