MiniCPM

Book

A high-performance small language model from the Chinese open-source community, used as a case study for scaling laws and for special initialization schemes such as µP (maximal update parametrization).

Mentioned in 1 video