Byte Pair Encoding (BPE)
Concept
Tokenization algorithm that starts from individual bytes and repeatedly merges the most frequent adjacent pair of tokens into a single new token; used to build large vocabularies (~100k tokens).
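The merge loop can be illustrated with a minimal sketch. This is not taken from the source: the function names `train_bpe`, `merge`, and `decode` are hypothetical, ties between equally frequent pairs are broken by whatever `Counter` returns first, and production tokenizers add steps (such as pre-splitting text with a regex) that are omitted here.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges over the UTF-8 bytes of `text` (hypothetical helper)."""
    ids = list(text.encode("utf-8"))  # base vocabulary: the 256 byte values
    merges = {}  # (left_id, right_id) -> new token id, in the order learned
    for step in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))  # count adjacent token pairs
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so further merges would not compress anything
        new_id = 256 + step
        merges[pair] = new_id
        ids = merge(ids, pair, new_id)
    return ids, merges


def decode(ids, merges):
    """Map token ids back to text by expanding each merge into its bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():  # dicts preserve insertion order
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8")


if __name__ == "__main__":
    ids, merges = train_bpe("aaabdaaabac", num_merges=3)
    print(ids, merges)
    print(decode(ids, merges))
```

Each merge adds one entry to the vocabulary, so under these assumptions a vocabulary of roughly 100k tokens corresponds to roughly 100k merges learned over a much larger training text.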
Mentioned in 1 video