GPT-2

Software / App

A text-generating large language model released by OpenAI in 2019

Mentioned in 26 videos

Videos Mentioning GPT-2

Is RL a dead end? – Dario Amodei

Dwarkesh Clips

OpenAI language model demonstrating broader generalization from internet-scale data

Supervise the Process of AI Research — with Jungwon Byun and Andreas Stuhlmüller of Elicit

Latent Space

An early generative language model used by Elicit in its initial stages of research and product development.

Agents @ Work: Dust.tt — with Stanislas Polu

Latent Space

Mentioned as a previous generation of models in the research path at OpenAI.

Why Google failed to make GPT-3 — with David Luan of Adept

Latent Space

An earlier version of OpenAI's language model, discussed as a pivotal paper and a demonstration of focused research efforts.

A Comprehensive Overview of Large Language Models - Latent Space Paper Club

Latent Space

An earlier version of OpenAI's GPT models, known for its generative capabilities.

Deep Dive into LLMs like ChatGPT

Andrej Karpathy

Building AGI in Real Time (OpenAI Dev Day 2024)

Latent Space

Mentioned as a point of reference for the exponential growth in AI capabilities, suggesting that o1 represents a similar 'scale moment' in AI development.

llm.c's Origin and the Future of LLM Compilers - Andrej Karpathy at CUDA MODE

Latent Space

A transformer-based language model developed by OpenAI. llm.c was able to train GPT-2 weights, achieving results competitive with PyTorch.

Language Agents: From Reasoning to Acting — with Shunyu Yao of OpenAI, Harrison Chase of LangGraph

Latent Space

An earlier version of OpenAI's language models, noted for its size and potential risks at the time.

Is finetuning GPT4o worth it?

Latent Space

An earlier language model from OpenAI, mentioned in the context of the evolution of AI models.

The Origin and Future of RLHF: the secret ingredient for ChatGPT - with Nathan Lambert

Latent Space

An early OpenAI language model, mentioned in the context of the 'Learning to Summarize' experiment where initial RLHF techniques were applied.

Scaling Test Time Compute to Multi-Agent Civilizations — Noam Brown, OpenAI

Latent Space

Mentioned as a model that likely would not have benefited from reasoning paradigms, highlighting the need for a certain baseline capability in models before applying techniques like chain-of-thought.

FlashAttention-2: Making Transformers 800% faster AND exact

Latent Space

An earlier language model from OpenAI, cited as a point where the company recognized the potential of scaling.

Ilya Sutskever: Deep Learning | Lex Fridman Podcast #94

Lex Fridman

A large language model developed by OpenAI, based on the Transformer architecture, with 1.5 billion parameters, trained on a massive dataset of web text, capable of generating highly realistic and coherent text.

Jeremy Howard: fast.ai Deep Learning Courses and Research | Lex Fridman Podcast #35

Lex Fridman

A large language model developed by OpenAI, following GPT, also exemplifying transfer learning in NLP.

Deep Learning State of the Art (2020)

Lex Fridman

An OpenAI language model discussed for its potential dangers, release strategy, and its limitations in common sense reasoning, though noted for impressive text generation.

Oriol Vinyals: DeepMind AlphaStar, StarCraft, and Language | Lex Fridman Podcast #20

Lex Fridman

A large language model by OpenAI, cited as an impressive example of current deep learning capabilities in language generation.

Greg Brockman: OpenAI and AGI | Lex Fridman Podcast #17

Lex Fridman

An advanced language model whose non-release of the full version is discussed as a test case for responsible disclosure in AI, highlighting concerns about fake news and biased content generation.

Imagining Education with Generative AI

MIT Open Learning

Information Theory for Language Models: Jack Morris

Latent Space

An OpenAI model that Jack Morris found interesting, though BERT was more popular at the time.

Ray Kurzweil: Singularity, Superintelligence, and Immortality | Lex Fridman Podcast #321

Lex Fridman

An earlier large-scale language model that did not work as well as its successor, GPT-3, highlighting the exponential growth needed for such models to be effective.

Personal benchmarks vs HumanEval - with Nicholas Carlini of DeepMind

Latent Space

An earlier language model that Carlini initially viewed as a toy before recognizing the practical utility of later models.

Noam Brown: AI vs Humans in Poker and Games of Strategic Negotiation | Lex Fridman Podcast #344

Lex Fridman

A large language model that, along with GPT-3, highlighted the rapid progress in AI and inspired the ambitious Diplomacy project.

Scott Aaronson: Computational Complexity and Consciousness | Lex Fridman Podcast #130

Lex Fridman

Predecessor to GPT-3, mentioned to illustrate the significant advancement in capability of GPT-3 with increased network size and training data.
