KL Divergence

Concept

A measure of how one probability distribution differs from a reference distribution (not a true metric, since it is asymmetric). In knowledge distillation it is used as a loss term that trains a smaller student model to approximate the output distribution of a larger teacher model.
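For discrete distributions P and Q it is defined as D_KL(P || Q) = Σ_x P(x) log(P(x) / Q(x)). Below is a minimal sketch of how this is typically applied as a distillation loss, assuming PyTorch and a hypothetical helper named distillation_loss; the temperature value and the T² scaling follow common distillation practice and are not taken from the videos themselves.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Both arguments are raw logits of shape (batch, num_classes).
    """
    # Soften both distributions with the same temperature.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)

    # F.kl_div expects log-probabilities as the input and probabilities
    # as the target; 'batchmean' averages the divergence over the batch.
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return kl * temperature ** 2

# Example usage with random logits (8 samples, 10 classes).
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
loss = distillation_loss(student_logits, teacher_logits)
```

In practice this term is usually combined with a standard cross-entropy loss on the ground-truth labels, weighted by a mixing coefficient.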

Mentioned in 3 videos