T5

Software / App

A base model with 250 million parameters, used as an example to illustrate how model capacity can affect the performance of different distillation methods.
