OpenWebText

Software / App

A community-built, open-source reproduction of WebText, the web-scraped dataset OpenAI used to train GPT-2 (referenced when discussing GPT-2 training data).
