KL Divergence

Concept · Mentioned in 2 videos

A measure of how one probability distribution diverges from another. In knowledge distillation, it is used as a training loss: a smaller student model is optimized to match the output distribution of a larger teacher model. Note that it is not a true metric, since it is asymmetric and does not satisfy the triangle inequality.
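For discrete distributions P and Q over the same support, the standard definition is:

$$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}$$

Below is a minimal sketch of how this loss is commonly computed in PyTorch for distillation. The function name, the temperature parameter, and the temperature-squared scaling (a common convention from the distillation literature) are illustrative assumptions, not taken from the videos:

```python
import torch
import torch.nn.functional as F

def distillation_kl_loss(student_logits, teacher_logits, temperature=2.0):
    # Hypothetical helper: soften both distributions with a temperature,
    # then compute KL(teacher || student) over the class dimension.
    # F.kl_div expects log-probabilities for the input and probabilities
    # for the target.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # "batchmean" sums over classes and averages over the batch,
    # matching the summation in the definition above. The temperature**2
    # factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Usage with dummy logits: batch of 4 examples, 10 classes.
student_logits = torch.randn(4, 10)
teacher_logits = torch.randn(4, 10)
loss = distillation_kl_loss(student_logits, teacher_logits)
```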