RegMix

Concept

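A regression-based data-mixing method: it trains many small proxy models on different data mixtures, fits a regression model that maps mixture weights to the resulting loss, and then uses that model to select the mixture predicted to give the lowest loss for large-scale model training.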
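The three steps can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the three domains, the synthetic `proxy_loss` function (which stands in for actually training a small proxy model), the quadratic least-squares regressor, and the random-search step are hypothetical choices, not the method's reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" best mixture over 3 domains; used only to fake proxy losses.
TARGET = np.array([0.6, 0.3, 0.1])

def proxy_loss(w):
    """Stand-in for training a small proxy model on mixture w
    and measuring its validation loss (synthetic, noise-free)."""
    return float(np.sum((w - TARGET) ** 2) + 2.0)

def features(W):
    """Regression features: the weights and their squares.
    No intercept column is needed because each row of W sums to 1."""
    return np.hstack([W, W ** 2])

# Step 1: sample mixtures on the simplex and "train" one proxy per mixture.
proxy_mixtures = rng.dirichlet(np.ones(3), size=32)
proxy_losses = np.array([proxy_loss(w) for w in proxy_mixtures])

# Step 2: fit a regression from mixture weights to observed loss.
coef, *_ = np.linalg.lstsq(features(proxy_mixtures), proxy_losses, rcond=None)

# Step 3: score many candidate mixtures with the fitted model
# and keep the one with the lowest predicted loss.
candidates = rng.dirichlet(np.ones(3), size=20000)
predicted = features(candidates) @ coef
best_mixture = candidates[np.argmin(predicted)]
best_predicted_loss = float(predicted.min())

print("selected mixture:", np.round(best_mixture, 3))
print("predicted loss:", round(best_predicted_loss, 4))
```

In this sketch the selected mixture lands close to the hypothetical `TARGET` because the synthetic loss is exactly quadratic. With real proxy runs the losses are noisy, so the choice of regressor and the number of proxy models matter much more.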

Mentioned in 1 video