BERT

Software / App

Deep learning / artificial neural network / language model

Mentioned in 17 videos

Videos Mentioning BERT

The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka

Latent Space

Mentioned as an example of the models researchers were fine-tuning in late 2019, before the focus shifted entirely to foundation models.

Breaking down the OG GPT Paper by Alec Radford

Latent Space

Mentioned as a model that emerged after the GPT-1 paper.

[Paper Club] Embeddings in 2024: OpenAI, Nomic Embed, Jina Embed, cde-small-v1 - with swyx

Latent Space

The architecture used by Nomic Embed, noted as a standard choice in embedding-model training. The speaker expressed surprise that embedding models are still largely updated versions of BERT.

Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit

Latent Space

A transformer-based language model, mentioned alongside models like T5 as used in the early stages of NLP.

Production AI Engineering starts with Evals

Latent Space

A neural network-based technique for natural language processing pre-training, which significantly accelerated text-based information extraction and began to cannibalize Impira's computer vision-based approach.

Is finetuning GPT4o worth it?

Latent Space

A language model mentioned in the context of OpenAI's progress and early AI models.

Information Theory for Language Models: Jack Morris

Latent Space

A language model popular in 2019 that Jack Morris experimented with.

Beating GPT-4 with Open Source Models - with Michael Royzen of Phind

Latent Space

A foundational transformer-based language model developed by Google, mentioned as an encoder model used by Michael Royzen before Longformer.

Yann LeCun: Deep Learning, ConvNets, and Self-Supervised Learning | Lex Fridman Podcast #36

Lex Fridman

A language model that utilizes self-supervised learning, cited as an example of successful NLP models.

Jeremy Howard: fast.ai Deep Learning Courses and Research | Lex Fridman Podcast #35

Lex Fridman

A transformer-based machine learning technique for natural language processing pre-training developed by Google, demonstrating transfer learning.

Rajat Monga: TensorFlow | Lex Fridman Podcast #22

Lex Fridman

A language model developed by Google, representing the kind of cutting-edge research enabled by TensorFlow.

Oriol Vinyals: Deep Learning and Artificial General Intelligence | Lex Fridman Podcast #306

Lex Fridman

A neural network-based technique for natural language processing pre-training, mentioned as an idea coming from NLP.

Deep Learning Basics: Introduction and Overview

Lex Fridman

Google's BERT (Bidirectional Encoder Representations from Transformers) model, a breakthrough in natural language processing.

Anthropic Head of Pretraining on Scaling Laws, Compute, and the Future of AI

Y Combinator

A model mentioned as an example of pre-training objectives considered before auto-regressive modeling became dominant.

The Utility of Interpretability — Emmanuel Amiesen

Latent Space

An early Transformer encoder-only model used as an example to illustrate how the top layers of a model can overfit on its training objective, necessitating the use of deeper layers for more general language understanding.

Transformers Explained: The Discovery That Changed AI Forever

Y Combinator

A series of models developed using only the encoder part of the transformer architecture for tasks like masked language modeling.
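The masked language modeling objective mentioned above can be sketched as a simple input-masking step. This is a minimal illustration in plain Python; the function name, 15% masking rate, and seed are illustrative choices, not drawn from any of the videos, and real BERT pre-training additionally keeps or randomizes some of the selected tokens rather than always substituting `[MASK]`:

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=1):
    """Simplified BERT-style masked-LM input preparation: randomly
    replace a fraction of tokens with [MASK]; during pre-training the
    encoder must recover the originals from bidirectional context."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(tok)    # prediction target at this position
        else:
            masked.append(tok)
            labels.append(None)   # no loss computed on unmasked positions
    return masked, labels

masked, labels = mask_tokens("the cat sat on the mat".split())
```

Because the encoder attends to tokens on both sides of each `[MASK]`, this objective yields bidirectional representations, in contrast to the left-to-right prediction used by auto-regressive models like GPT.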

AI and the Future of Law: The 10 Year "Overnight" Success Story

Y Combinator

A natural language processing model whose paper inspired CaseText to explore large language models for legal applications early on.