Gemini Exponential, Demis Hassabis' ‘Proto-AGI’ coming, but …

AI Explained
Science & Technology · 5 min read · 20 min video
Dec 19, 2025 · 89,756 views

Key Moments

TL;DR

Gemini 3 Flash is fast and capable, but progress toward proto-AGI still hinges on data, compute, and alignment.

Key Insights

1. Gemini 3 Flash achieves strong performance at near-instant speeds, outperforming prior models on many benchmarks and domains while remaining significantly faster than the larger Pro variants.

2. A core tension in AI releases is the incentive to produce answers quickly; models are rarely penalized for being wrong, which fuels hallucinations and underscores the need for uncertainty handling.

3. Google DeepMind is weaving together multiple systems (Genie world models, SIMA agents, Nano Banana Pro imaging) toward a unified, scalable prototype for AGI, with a medium-term goal of convergence.

4. Expect a shift from pure scale toward data quality, data access, and compute cost; a data-limited regime and data-acquisition bottlenecks will shape research and deployment.

5. The timeline is debated but explicit: a 50/50 chance of minimal AGI by 2028, with full AGI years beyond, while compute and data dynamics drive feasibility and risk management.

GEMINI 3 FLASH: PERFORMANCE, SPEED, AND LIMITS

Gemini 3 Flash is presented as a dramatically faster variant of Google's Gemini line, designed to answer nearly instantly while maintaining high cognitive performance. The transcript compares it to the summer's Gemini 2.5 Pro across academic reasoning, visual reasoning, coding, mathematics, and general problem solving, showing that the new model not only closes the gap but often exceeds the prior heavyweight across many domains. A notable example is the AIME mathematics benchmark, where 3 Flash scores around 95.2% versus 2.5 Pro's 88%, a substantial gain even without tool use. Google reportedly applied a post-training optimization targeting software engineering, which helps explain why 3 Flash can outperform the heavier 3 Pro on certain tasks. The speaker cautions, however, that performance is domain-dependent; benchmarks can be optimized for specific tasks, so the real-world picture remains nuanced. Beyond raw scores, the rapid, cost-efficient reasoning of Gemini 3 Flash signals a shift toward capable, deployable AI that can serve a broad user base with minimal latency.
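To put the AIME jump in perspective: the headline accuracy gain understates the change in error rate. A quick calculation from the scores quoted above makes the point:

```python
# Relative error reduction implied by the quoted AIME scores.
pro_acc, flash_acc = 0.88, 0.952  # Gemini 2.5 Pro vs Gemini 3 Flash

pro_err = 1 - pro_acc      # ~12% error rate
flash_err = 1 - flash_acc  # ~4.8% error rate

relative_reduction = (pro_err - flash_err) / pro_err
print(f"Error rate: {pro_err:.1%} -> {flash_err:.1%}")
print(f"Relative error reduction: {relative_reduction:.0%}")  # 60%
```

In other words, a 7.2-point accuracy gain at this level means the model makes roughly 60% fewer mistakes, which is why the gap reads as larger in practice than the raw percentages suggest.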

THE SECRET OF MODEL RELEASES: WHY INCENTIVES SHAPE ANSWERS

A recurring theme is the incentive structure behind model outputs: models are rarely penalized for incorrect answers, and there is strong pressure to keep producing answers, think longer, and self-correct rather than admit uncertainty. The transcript highlights a 6,000-question benchmark on which Gemini 3 Flash outperformed rivals in questions answered correctly, yet 91% of its mistakes were confidently incorrect answers rather than "I don't know," with only 9% being partial or abstaining. This contrasts with systems like GPT-5.1, which showed a higher tendency to say "I don't know." The host notes OpenAI's public stance on the "epidemic" of benchmarks that penalize uncertain responses and discusses the potential benefits of rewarding expressed uncertainty. This underscores a fundamental tradeoff in current AI deployment: speed and apparent accuracy versus honesty about uncertainty, a key factor in risk management, guidance reliability, and future training approaches.
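The incentive argument can be made concrete with a toy scoring rule. Under the usual benchmark scoring (1 point for a correct answer, 0 for anything else), guessing never scores worse than abstaining, so a model optimized against it learns to always answer; only a penalty for wrong answers makes abstention rational below some confidence threshold. The numbers here are illustrative, not from the video:

```python
def expected_score(p_correct, wrong_penalty=0.0, abstain_score=0.0):
    """Expected score for answering at confidence p_correct, vs. abstaining."""
    answer = p_correct * 1.0 + (1 - p_correct) * (-wrong_penalty)
    return answer, abstain_score

# Standard benchmark: wrong answers cost nothing, so even a 30%-confident
# guess has positive expected value and abstaining is never optimal.
ans, abstain = expected_score(p_correct=0.3)
print(ans > abstain)  # True

# Penalized scheme (-1 for wrong): abstaining wins below 50% confidence.
ans, abstain = expected_score(p_correct=0.3, wrong_penalty=1.0)
print(ans > abstain)  # False
```

With a symmetric +1/-1 rule the break-even point sits at 50% confidence; asymmetric penalties move that threshold, which is one lever for training models that say "I don't know" more often.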

FROM LAB TO PROTO-AGI: INTEGRATING WORLDS, AGENTS, AND IMAGING

The conversation sketches a convergence path where several specialized Google DeepMind systems begin to cohere into a single, more capable agent. Genie 3 aims to simulate physics and environments with higher fidelity, including game-like worlds in which the model can imagine, manipulate, and reason about physical interactions. SIMA 2 acts as a learning agent that can operate within those virtual worlds, planning long-term actions and executing commands. Nano Banana Pro represents a top-tier image generation model, now with Gemini under the hood to understand and render mechanics and materials more semantically. Hassabis envisions converging these components (language, world models, and vision) into one unified model. This synthesis would move toward a proto-AGI, a stepping stone rather than a finished AGI, with the timing tied to continued scaling and further architectural integration over the next couple of years.

COMPUTE, DATA, AND THE EXPONENTIAL PATHWAY: CHALLENGES AHEAD

The discussion acknowledges the heavy compute and data demands of modern AI, with OpenAI and Google facing ongoing tension between deploying powerful models and funding the research that underpins future capabilities. The transcript notes OpenAI's planned compute spend (doubling roughly every year until 2027 or 2028, then flattening to more linear growth) and similar concerns at Google about balancing serving users with advancing research. It also covers data-access constraints: many firms refuse to sell proprietary datasets, potentially slowing the exponential growth curve. A forward-looking idea is to simulate data-rich worlds to generate training data for proto-AGI, which could mitigate some real-world data scarcity. In short, while scaling laws incentivize larger models, researchers are increasingly factoring in data quality, accessibility, and compute distribution as limiting and shaping factors.
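The compute trajectory described (spend doubling until 2027-28, then flattening to linear growth) can be sketched numerically. The starting value, year range, and linear increment below are arbitrary placeholders for illustration, not figures from the video:

```python
def project_compute(start=1.0, double_until=2028, years=range(2025, 2032)):
    """Project a compute index: annual doubling, then constant linear growth.

    All parameters are illustrative assumptions, not reported figures.
    """
    out, value, linear_step = {}, start, None
    for year in years:
        out[year] = value
        if year < double_until:
            value *= 2                # exponential (doubling) phase
        else:
            if linear_step is None:
                linear_step = value   # fix the annual increment at the bend
            value += linear_step      # linear phase
    return out

for year, v in project_compute().items():
    print(year, v)
```

The qualitative point survives any choice of parameters: once growth turns linear, each year adds a constant increment, so the year-over-year ratio falls toward 1, which is why the "flattening" matters for how fast capabilities can keep compounding.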

TIMELINES, TIMESTEPS, AND THE 2028 PROBABILITY: WHAT THE LEADERS EXPECT

A central timeline theme is Demis Hassabis’s well-known prediction of a 50/50 chance of achieving minimal AGI by 2028, defined as an artificial agent capable of performing cognitive tasks without being surprised by them. The conversation clarifies this minimal bar as a practical floor, not a statement of complete human-like intelligence. Beyond minimal AGI, the path to full AGI is framed as years later, with estimates ranging from three to six years after the minimal threshold. The interview also touches on strategic concerns about data access and compute capacity, with Greg Brockman emphasizing the cost and bottlenecks of serving users and the need to push research forward rather than divert resources solely to deployment. Taken together, the timeline is optimistic about progress yet grounded in the realities of compute growth, data availability, and alignment challenges that will define when and how AGI becomes a practical reality.

Benchmark results summary

Data extracted from this episode

Benchmark | Model/Comparison | Result
AIME (math benchmark) | Gemini 2.5 Pro vs Gemini 3 Flash | 88% vs 95.2% accuracy
SimpleBench (spatial reasoning) | Gemini 3 Flash vs Claude Opus 4.5 / GPT-5 Pro | Gemini 3 Flash 61.1%
6,000-question knowledge benchmark | Gemini 3 Flash vs GPT-5.1 | Gemini 3 Flash outperforms; 91% of errors were incorrect answers, 9% partial or abstaining
Codebench (coding) | GPT-5.2 Codex vs GPT-5.1 Codex Max | 10% vs 17%

Common Questions

What is "minimal AGI," and when might it arrive?

Minimal AGI is an artificial agent capable of performing all cognitive tasks typical of humans, though not yet at full human-level intelligence. Hassabis estimates it could arrive in roughly two years, with a possible range of one to five years.


Mentioned in this video

Tool: Gemini 3 Flash

Google's fast Gemini model; shown to outperform prior models across multiple domains and benchmarks, with post-training optimizations noted for software engineering.

Tool: Gemini 2.5 Pro

State-of-the-art model as of June this year; the heavier, slower variant used for comparison.

Study: AIME

A difficult mathematics benchmark used to compare model accuracy, showing Gemini 3 Flash's performance against others.

Study: GBC 5.2

OpenAI model variant referenced in coding/sciences benchmarks; compared against Gemini in discussions.

Study: GPT-5.1

OpenAI model version cited in benchmark comparisons (e.g., Codex/Max contexts).

Study: GPT-5.2 Codex

OpenAI model variant focused on coding; internal benchmarks discussed in the video.

Study: GPT-5.1 Codex Max

Previous Codex variant used for comparison in the benchmarks.

Study: SimpleBench

External benchmark of trick questions involving spatial reasoning; cited against Gemini 3 Flash.

Tool: GPT-5 Pro

Another model used for comparison in SimpleBench contexts.

Person: Sebastian Borgeaud

Pre-training lead for Gemini 3; discussed data and scaling perspectives.

Person: Demis Hassabis

Co-founder of DeepMind; discusses the proto-AGI vision and world-model convergence.

Person: Shane Legg

DeepMind co-founder discussing minimal AGI and scaling timelines.

Person: Greg Brockman

OpenAI co-founder; discusses compute constraints and deployment tradeoffs.

Tool: Nano Banana Pro

State-of-the-art image generation model; noted as having Gemini under the hood for semantic understanding.

Tool: Veo 3.1

Google's image-to-video system, mentioned as part of their simulation stack.

Tool: Genie 3

Google DeepMind's environment-simulating model that imagines worlds.

Tool: SIMA 2

Gaming companion/agent that reasons and acts within virtual 3D worlds.
