RedPajama V2
An updated version of the RedPajama dataset with 30 trillion tokens and an emphasis on data quality through modular filtering.