François Chollet: How We Get To AGI

Y Combinator
Science & Technology | 4 min read | 35 min video
Jul 3, 2025 | 136,352 views

Key Moments

TL;DR

Current AI scaling is insufficient for AGI; focus shifts to test-time adaptation and combining abstraction types.

Key Insights

1

The cost of compute has been a primary driver of AI progress, but scaling current models alone is not enough for AGI.

2

There's a critical distinction between memorized skills and fluid general intelligence, which involves adapting to novel situations.

3

Test-time adaptation (TTA) represents a significant shift, enabling models to learn and adapt during inference, showing promise for fluid intelligence.

4

Intelligence is best defined as the efficiency of operationalizing past information to deal with future novelty and uncertainty, not just skill acquisition.

5

Human intelligence relies on a combination of Type 1 (value-centric) and Type 2 (program-centric) abstractions, a synergy AI needs to replicate.

6

Future AGI development requires moving beyond deep learning's strength in Type 1 abstraction towards discrete program search for Type 2 capabilities and invention.

THE LIMITATIONS OF THE SCALING PARADIGM

For years, the dominant paradigm in AI has been scaling up deep learning models, particularly large language models, driven by the falling cost of compute and the availability of vast datasets. This approach, often referred to as 'pre-training scaling,' showed predictable improvements on benchmarks as models and data increased. However, this progress primarily reflected an enhancement of memorized skills and static inference, rather than true fluid general intelligence. The core issue was mistaking benchmark performance for genuine understanding and adaptability, a limitation highlighted by benchmarks like the Abstraction and Reasoning Corpus (ARC).

WHAT IS TRUE INTELLIGENCE?

François Chollet posits that intelligence is not merely the ability to perform tasks but rather the efficiency with which one operationalizes past information to navigate novelty and uncertainty. This contrasts with the traditional view of AI as achieving human-level task performance, often framed by corporate goals of automating economically valuable tasks. Chollet emphasizes that intelligence is a process of dealing with new situations and building new capabilities, akin to a road-building company rather than just a static road network. This definition moves beyond crystallized behavior and skills, focusing on the dynamic capacity to adapt and invent.

THE SHIFT TO TEST-TIME ADAPTATION

The AI research community has seen a significant pivot towards 'test-time adaptation' (TTA). This paradigm shift focuses on creating models capable of changing their own state and behavior dynamically during inference. Unlike pre-training, which loads knowledge statically, TTA involves learning and adapting on the fly. Techniques like test-time training and program synthesis fall under this umbrella, enabling AI systems to modify their responses based on specific encountered data. This approach has demonstrated significant progress on benchmarks like ARC, indicating a move towards more fluid and adaptive intelligence.
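The idea can be made concrete with a minimal, illustrative sketch: rather than answering from frozen pretrained knowledge, the model fits its own parameters to the few demonstration pairs it sees at inference time, then answers the query. The function name and the toy linear model below are invented for illustration, not taken from the talk.

```python
# Minimal sketch of test-time adaptation (TTA): the model updates its own
# parameters on the task's demonstration pairs during inference, instead
# of relying only on statically loaded pretrained knowledge.

def adapt_and_predict(demos, query, lr=0.1, steps=500):
    """Fit y = w*x + b to the demo pairs at inference time via gradient
    descent, then use the adapted parameters to answer the query."""
    w, b = 0.0, 0.0                      # start from an uninformed state
    for _ in range(steps):               # learn on the fly, from demos only
        gw = gb = 0.0
        for x, y in demos:
            err = (w * x + b) - y
            gw += 2 * err * x            # gradient of squared error w.r.t. w
            gb += 2 * err                # gradient of squared error w.r.t. b
        w -= lr * gw / len(demos)
        b -= lr * gb / len(demos)
    return w * query + b

# The "task" is defined entirely by its demonstrations, ARC-style:
demos = [(1, 3), (2, 5), (3, 7)]         # hidden rule: y = 2x + 1
print(round(adapt_and_predict(demos, 10)))  # → 21
```

The key point of the sketch is that nothing about the rule y = 2x + 1 is stored in advance; the system's state changes in response to the specific data it encounters at test time.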

REDEFINING AND MEASURING INTELLIGENCE: THE ARC BENCHMARK

To address the limitations of existing benchmarks, the Abstraction and Reasoning Corpus (ARC) was developed. Unlike traditional tests that can be 'gamed' through memorization, ARC tasks are unique and require on-the-fly problem-solving using core, implicit knowledge that even young children possess. ARC aims to measure fluid intelligence by presenting novel problems that cannot be solved by simply recalling stored patterns. While ARC1 initially served to highlight the inadequacy of scaling, ARC2 and the upcoming ARC3 are designed to be more sensitive, probing compositional generalization and even agency, providing a more nuanced measure of AI's progress towards AGI.
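To make the benchmark's shape concrete, here is a toy ARC-style task: grids of integers (colors), a few demonstration pairs, and a test input. The layout mimics ARC's published JSON format; the specific grids and the hidden rule (horizontal mirroring) are invented for illustration.

```python
# A toy ARC-style task. Each task supplies a handful of demonstration
# pairs and one or more test inputs; the solver must infer the rule from
# the demos alone -- memorized patterns cannot help on a novel rule.

task = {
    "train": [
        {"input": [[1, 0], [2, 3]], "output": [[0, 1], [3, 2]]},
        {"input": [[5, 6], [7, 8]], "output": [[6, 5], [8, 7]]},
    ],
    "test": [{"input": [[9, 4], [0, 2]]}],
}

def mirror(grid):
    """Candidate program: reverse every row of the grid."""
    return [row[::-1] for row in grid]

# A solver must verify its candidate rule against every demonstration
# before committing to an answer for the test input.
assert all(mirror(p["input"]) == p["output"] for p in task["train"])
print(mirror(task["test"][0]["input"]))  # → [[4, 9], [2, 0]]
```

Because every task carries its own novel rule, a system can only score well by synthesizing the rule on the fly, which is exactly the fluid-intelligence behavior ARC is designed to measure.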

THE DUAL NATURE OF ABSTRACTION IN INTELLIGENCE

Human intelligence is characterized by the interplay of two types of abstraction: Type 1 (value-centric) and Type 2 (program-centric). Type 1, driven by continuous functions and comparisons, underlies perception, intuition, and pattern recognition—areas where modern machine learning excels. Type 2, involving discrete program comparison and structural matching, is crucial for human reasoning, planning, and invention. While current AI systems, particularly transformers, are adept at Type 1 abstraction, they struggle with Type 2 tasks such as sorting or arithmetic, indicating a critical gap on the path to general intelligence.
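The contrast can be sketched in a few lines of code. Type 1 compares instances by a continuous distance (value-centric, approximate); Type 2 applies an exact, discrete program (program-centric, correct on all inputs). Both functions and their names are illustrative, not from the talk.

```python
# Contrasting the two abstraction styles: continuous similarity (Type 1)
# versus an exact discrete program (Type 2).

import math

def type1_classify(x, prototypes):
    """Type 1: pick the label of the nearest prototype. Perception-like:
    a continuous distance in a vector space, interpolation between
    known points -- the regime where deep learning excels."""
    return min(prototypes, key=lambda p: math.dist(x, p[0]))[1]

def type2_sort(xs):
    """Type 2: an exact discrete program (insertion sort). Its structure,
    not similarity to training data, guarantees correctness on every
    input -- the regime where transformers still struggle."""
    out = []
    for x in xs:
        i = 0
        while i < len(out) and out[i] < x:
            i += 1
        out.insert(i, x)
    return out

prototypes = [((0.0, 0.0), "cat"), ((1.0, 1.0), "dog")]
print(type1_classify((0.2, 0.1), prototypes))  # → cat  (approximate match)
print(type2_sort([3, 1, 2]))                   # → [1, 2, 3]  (exact rule)
```

The synergy the talk calls for is a system that uses Type 1 machinery to perceive and guess, and Type 2 machinery to reason exactly—rather than forcing one style to imitate the other.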

THE FUTURE: COMBINING ABSTRACTIONS AND SEARCH FOR INVENTION

Achieving AGI requires moving beyond current AI capabilities by effectively combining Type 1 and Type 2 abstractions. This involves leveraging discrete program search, guided by deep learning-driven intuition, to overcome the combinatorial explosion inherent in Type 2 reasoning. The goal is to create 'programmer-like' meta-learners that can synthesize novel programs by combining deep learning modules for perception (Type 1) and algorithmic modules for reasoning (Type 2). This approach, emphasizing reusability through a shared library of abstractions and efficient search, is the focus of new research labs aiming to build AI capable of independent invention and accelerating scientific discovery.
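A minimal sketch of that combination: enumerate discrete programs over a tiny DSL, but order candidates by a heuristic prior standing in for the deep-learning "intuition" that tames the combinatorial explosion. The primitives, the prior values, and the function names are all invented for illustration; a real system would learn the priors per task.

```python
# Discrete program search over a tiny DSL, with a heuristic prior
# ordering candidates -- a stand-in for learned, Type-1 intuition
# guiding Type-2 search.

from itertools import product

PRIMITIVES = {
    "inc":    lambda x: x + 1,
    "double": lambda x: x * 2,
    "neg":    lambda x: -x,
}
# Illustrative priors; in a guided searcher a neural model supplies these.
PRIOR = {"inc": 0.5, "double": 0.4, "neg": 0.1}

def search(examples, max_len=3):
    """Return the highest-prior program (a tuple of primitive names)
    consistent with every input/output example, or None."""
    candidates = []
    for n in range(1, max_len + 1):
        for prog in product(PRIMITIVES, repeat=n):
            score = 1.0
            for name in prog:            # product of per-primitive priors
                score *= PRIOR[name]
            candidates.append((score, prog))
    candidates.sort(reverse=True)        # try "intuitive" programs first
    for _, prog in candidates:
        if all(_run(prog, x) == y for x, y in examples):
            return prog
    return None

def _run(prog, x):
    for name in prog:
        x = PRIMITIVES[name](x)
    return x

# Target behaviour f(x) = 2x + 1, specified only through examples:
print(search([(1, 3), (2, 5), (0, 1)]))  # → ('double', 'inc')
```

The prior does not change what is findable—exhaustive search would reach the same program—but it changes how quickly consistent programs are reached, which is the practical difference between tractable and intractable Type 2 search.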

Common Questions

What is the difference between static skills and fluid intelligence?

Static skills are memorized, task-specific abilities, while fluid intelligence is the ability to understand and adapt to entirely new problems on the fly, without prior preparation.
