T5

Software / App

Text-to-Text Transfer Transformer (T5), a family of encoder-decoder language models from Google Research that casts every NLP task as text generation.

Mentioned in 8 videos

Videos Mentioning T5

Answer.ai & AI Magic with Jeremy Howard

Latent Space

A pre-trained encoder backbone suggested for fine-tuning, part of the discussion on encoder-decoder architectures.

The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka

Latent Space

An early general-purpose model from Google Brain, mentioned by Yi Tay as an example of the shift towards universal foundation models, even before the general public caught up.

Breaking down the OG GPT Paper by Alec Radford

Latent Space

Mentioned as an example of a model that uses token-based input transformations for multitask learning.

Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit

Latent Space

A text-to-text transfer transformer model, mentioned as one of the models Elicit used in its early development and continues to use.

[Paper Club] BERT: Bidirectional Encoder Representations from Transformers

Latent Space

A text-to-text transfer transformer model. Mentioned briefly at the start regarding routing.

A Comprehensive Overview of Large Language Models - Latent Space Paper Club

Latent Space

A text-to-text transfer transformer model that frames all NLP tasks as text generation.
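T5's text-to-text framing works by prepending a short task prefix to the input string, so one model serves translation, summarization, classification, and more, always producing text as output. The prefixes below are the ones used in the original T5 paper; the formatting helper itself is a hypothetical illustration, not part of any library.

```python
# T5 casts every NLP task as text-to-text: the task is named by a short
# prefix on the input, and the answer is always generated as a string.
# Hypothetical helper for illustration only.

def to_text_to_text(task: str, text: str) -> str:
    """Prepend a T5-style task prefix so one model can serve many tasks."""
    prefixes = {
        "translation": "translate English to German: ",
        "summarization": "summarize: ",
        "sentiment": "sst2 sentence: ",
    }
    return prefixes[task] + text

# Even classification becomes string-in, string-out:
print(to_text_to_text("summarization", "The quick brown fox jumped."))
# summarize: The quick brown fox jumped.
```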

The Magic of LLM Distillation — Rishabh Agarwal, Google DeepMind

Latent Space

A base model of 250 million parameters used in an example to illustrate how model capacity can affect the performance of different distillation methods.

Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 3: Architectures

Stanford Online

Used as an example for various architectural and hyperparameter choices, including GLUs, parallel blocks, and large feed-forward multipliers.
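One of the choices the lecture attributes to T5, the gated feed-forward layer (T5 v1.1 uses GEGLU), can be sketched in a few lines. This is a minimal dependency-free illustration of the GLU idea, not the lecture's or Google's implementation; all function names and dimensions here are made up for the example.

```python
import math

def gelu(x: float) -> float:
    # tanh approximation of GELU
    return 0.5 * x * (1 + math.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * x**3)))

def matvec(W: list[list[float]], x: list[float]) -> list[float]:
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def geglu_ffn(x, W, V, Wo):
    """Gated feed-forward block: GELU(xW) elementwise-gates xV, then Wo projects back.

    Compared with a plain FFN (GELU(xW)Wo), the GLU variant adds a second
    up-projection V whose output is multiplied by the gate.
    """
    gate = [gelu(h) for h in matvec(W, x)]   # gating branch
    up = matvec(V, x)                        # linear branch
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(Wo, hidden)
```

With 1x1 identity weights, the block reduces to `gelu(x) * x` projected by 1, which makes the gating behavior easy to check by hand.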