GPT-4 tokenizer whitespace improvements (design choice)

Software / App

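Design choice in the cl100k_base tokenizer used by GPT-4: runs of repeated spaces are grouped into single tokens instead of being tokenized one space at a time. This lowers the token count of indented code (higher code density) and is shown explicitly with Python indentation examples.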
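
The effect on indentation can be sketched with a toy count. This is not the real tokenizer: it assumes, for illustration only, that an ungrouped scheme spends one token per leading space and a grouped scheme spends one token per run of leading spaces. The names `SNIPPET` and `indentation_tokens` are hypothetical, and actual cl100k_base counts depend on its learned merges.

```python
import re

# A small Python snippet with 4, 8 and 12 spaces of indentation.
SNIPPET = (
    "def fizz(n):\n"
    "    for i in range(n):\n"
    "        if i % 3 == 0:\n"
    "            print(i)\n"
)


def indentation_tokens(text: str, group_runs: bool) -> int:
    """Toy count of tokens spent on leading spaces only.

    group_runs=False: assume one token per space.
    group_runs=True:  assume one token per run of spaces.
    """
    runs = re.findall(r"^ +", text, flags=re.MULTILINE)
    if group_runs:
        return len(runs)
    return sum(len(run) for run in runs)


per_space = indentation_tokens(SNIPPET, group_runs=False)
grouped = indentation_tokens(SNIPPET, group_runs=True)
print(f"one token per space: {per_space}")
print(f"one token per run:   {grouped}")
```

Under these assumptions the indentation in the snippet costs 24 tokens ungrouped and 3 tokens grouped, which is the kind of saving the design choice targets for deeply nested code.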

Mentioned in 1 video