MiniCPM
Book
A high-performance small language model from the Chinese open-source community, used as a case study for scaling laws and special initialization schemes such as μP (maximal update parametrization).
Mentioned in 1 video