François Chollet: ARC-AGI-3, Beyond Deep Learning & A New Approach To ML
Key Moments
AI's current trajectory, focused on scaling LLMs, might be suboptimal. François Chollet proposes program synthesis as a more efficient path to AGI, building a new learning substrate for truly optimal AI.
Key Insights
François Chollet's lab, Ndea, is researching program synthesis as a new branch of machine learning, aiming for models closer to optimal than current deep learning approaches.
The ARC (Abstraction and Reasoning Corpus) benchmark has evolved: base LLMs scored below 10% on V1 until reasoning models broke through, V2 was saturated by agents using post-training RL loops, and V3 introduces interactive, agentic intelligence measured by exploration efficiency.
Current LLM-based coding agents achieve success due to verifiable reward signals (like unit tests) and the ability to embed execution models, not necessarily due to higher fluid intelligence.
Chollet predicts AGI by 2030, coinciding with ARC-AGI 6 or 7, and believes true AGI might be a codebase under 10,000 lines, operating on a knowledge base, reminiscent of foundational scientific principles.
Chollet advocates for exploring alternative AI approaches beyond current LLM scaling, suggesting that redirecting investment into areas like genetic algorithms or older research from the 70s/80s could yield significant breakthroughs.
For aspiring AI researchers or developers, focusing on usability, community building, and integrating AI into domain expertise is key, as AI progress is inevitable and best leveraged as an empowering tool.
The limitations of current deep learning and the search for optimality
François Chollet discusses the current AI landscape, dominated by scaling deep learning models and LLMs. He argues that while this path is yielding results and driving progress, it is not necessarily optimal. Deep learning primarily relies on fitting the parameters of a model to data using gradient descent. Chollet's new venture, Ndea, aims to build a fundamentally different branch of machine learning. Instead of parametric curves, it is developing symbolic models designed to be as concise and simple as possible while still explaining the data. This shift requires a new optimization method, dubbed 'symbolic descent,' to replace gradient descent. The goal is machine learning engines that yield extremely concise symbolic models, which should require far less training data, run far more efficiently at inference, and generalize and compose better, in line with the minimum description length principle.
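The minimum description length idea can be made concrete with a toy model-selection sketch. Everything below (the candidate expressions and the character-count complexity measure) is an illustrative assumption, not Ndea's actual method: among models that explain the data equally well, prefer the shorter one.

```python
import math

def description_length(expr, fn, data):
    """Crude MDL score: model size (characters) plus bits to encode residuals."""
    residual = sum((fn(x) - y) ** 2 for x, y in data)
    # log-encoding of residual error; the +1 avoids log(0) for a perfect fit
    return len(expr) + math.log2(1 + residual)

# Observations generated by f(x) = 2x + 1
data = [(x, 2 * x + 1) for x in range(10)]

# Two candidate symbolic models; both fit the data exactly,
# but one is a longer description of the same function.
candidates = {
    "2*x+1": lambda x: 2 * x + 1,
    "x**2-x*(x-2)+1": lambda x: x**2 - x * (x - 2) + 1,
}

best = min(candidates, key=lambda e: description_length(e, candidates[e], data))
print(best)  # the shorter exact model wins: 2*x+1
```

With zero residual on both candidates, the score reduces to description length alone, so the concise model is selected, which is the intuition behind preferring short symbolic models for generalization.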
Program synthesis as an alternative foundation
Ndea's core research lies in program synthesis, which Chollet clarifies is not about code generation or coding agents. Instead, it is about rebuilding the entire machine learning stack on different foundations. Rather than adding layers on top of the existing LLM stack, Ndea is creating a new 'learning substrate' distinct from parametric deep learning. This involves finding the shortest symbolic model that explains the data, a search that cannot use gradient descent; 'symbolic descent' is their proposed solution, the symbolic equivalent of gradient descent. The promise is that this approach, while radically different and with a lower perceived chance of near-term success (around 10-15%), could lead to AI that is much closer to optimality, requiring less data and generalizing better, unlike the current 'more compute, more data' scaling paradigm.
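In its simplest form, program synthesis can be sketched as shortest-first enumeration over a small domain-specific language: try all programs of length 1, then length 2, and so on, returning the first one consistent with the examples. The primitives below are invented for illustration; Ndea's actual substrate is not public.

```python
from itertools import product

# A tiny DSL of composable integer functions (illustrative, not Ndea's).
PRIMS = {
    "inc": lambda x: x + 1,
    "dbl": lambda x: x * 2,
    "neg": lambda x: -x,
}

def synthesize(examples, max_len=3):
    """Return the shortest primitive sequence consistent with all examples."""
    for length in range(1, max_len + 1):
        for prog in product(PRIMS, repeat=length):
            def run(x, prog=prog):
                for name in prog:
                    x = PRIMS[name](x)
                return x
            if all(run(x) == y for x, y in examples):
                return prog  # shortest-first enumeration => minimal program
    return None  # no program of length <= max_len explains the data

# Input-output pairs generated by f(x) = 2x + 1
print(synthesize([(1, 3), (2, 5), (5, 11)]))  # ('dbl', 'inc')
```

Because enumeration proceeds shortest-first, the result is automatically the most concise program in the DSL that explains the examples, tying program synthesis directly to the minimum-description-length preference discussed above. Real systems replace this brute-force loop with guided search, which is where the 'symbolic descent' idea comes in.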
The ARC benchmark: a barometer for evolving AI capabilities
Chollet details the evolution of the ARC (Abstraction and Reasoning Corpus) benchmark, designed to measure fundamental intelligence rather than performance on scaled-up data. ARC V1 initially showed very low scores for LLMs (sub-10%), even as models scaled massively, indicating that scale alone was not sufficient for fluid intelligence. The breakthrough came with reasoning models (like OpenAI's o1 and o3), which demonstrated a step-function improvement on V1, signaling the emergence of new capabilities. ARC V2 was then saturated by agentic approaches, particularly those employing reinforcement learning loops and post-training verification mechanisms (similar to coding agents). This showed that models can become more useful through refined training paradigms and verifiable reward signals, without necessarily becoming 'smarter' in a fluid intelligence sense.
Introducing ARC-AGI V3: measuring agentic intelligence
ARC-AGI V3 represents a significant shift, moving beyond static pattern modeling to measure 'agentic intelligence.' In V3, AI agents are placed in interactive, mini-video game-like environments without any initial instructions. They must explore, set their own goals, build a model of the environment through trial and error, and then execute plans to achieve those goals. The evaluation focuses on action efficiency, aiming for AI agents to perform with the same efficiency as humans, who can typically master these novel environments within hundreds to thousands of actions. V3 is designed to be more resistant to the 'harness' strategies used to saturate V2, featuring a private set of significantly different games that are not directly representative of performance on the public set, thus better testing fluid intelligence.
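The action-efficiency evaluation described above can be sketched as a simple normalization against a human baseline. The exact ARC-AGI-3 scoring formula is not given in this summary; the capped ratio below is an assumption for illustration only.

```python
def action_efficiency(agent_actions: int, human_actions: int) -> float:
    """Illustrative score: 1.0 = human-level efficiency in a novel environment;
    lower values mean the agent wasted more actions exploring.
    (Hypothetical metric, not the official ARC-AGI-3 formula.)"""
    if agent_actions <= 0:
        raise ValueError("agent must take at least one action")
    # Cap at 1.0 so beating the human baseline doesn't inflate the score
    return min(1.0, human_actions / agent_actions)

# A human masters the game in ~800 actions; the agent needs 4000
print(action_efficiency(agent_actions=4000, human_actions=800))  # 0.2
```

Measuring efficiency rather than raw success is what makes the benchmark resistant to brute-force harnesses: an agent that eventually wins after millions of actions still scores poorly.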
The future of AGI and the quest for fundamental principles
Chollet predicts AGI could arrive as early as 2030, potentially coinciding with ARC-AGI 6 or 7. He posits that true AGI might not require billions of parameters but could be a relatively small codebase (under 10,000 lines) operating on a large knowledge base, akin to embodying the scientific method. This vision contrasts with the current trend of massive model scaling. He believes intelligence acquisition is key, and while human intelligence is complex and messy, it offers inspiration. Ndea aims to identify fundamental principles of intelligence and build a system that optimally implements them, rather than replicating biological processes. This approach prioritizes recursive self-improvement and efficiency over sheer scale, aiming to remove humans from the continuous improvement loop.
Investing in alternative paths and foundational research
Chollet encourages the AI community to explore approaches beyond the dominant LLM paradigm. He suggests that immense resources poured into current methods could yield comparable breakthroughs if invested in other areas like genetic algorithms or older, less explored research from the 70s and 80s. He views the current unification around gradient descent and LLMs as potentially limiting. For aspiring researchers, he recommends delving into foundational, often overlooked, research and looking for approaches that inherently scale without requiring constant human engineering intervention, focusing on systems that can improve themselves without bottlenecks. The goal is to build intelligence from first principles, not just extend existing models.
Leveraging AI progress: empowerment through expertise
Addressing concerns about job displacement and AI taking over, Chollet offers an optimistic perspective. He argues that increased AI capabilities do not necessarily make humans obsolete; instead, they can be empowering. The more expertise an individual has in a domain, the better they can leverage AI tools. He advises people to learn as much as possible, not only about AI itself but also about the specific domains they wish to apply it to. The key is to treat AI progress as an opportunity and a tool for personal and professional advancement, rather than as an unstoppable force to be passively endured. Riding the wave of AI progress by integrating it with domain expertise is the path he recommends.
Common Questions
What is Ndea?
Ndea is a new AGI research lab focused on developing a branch of machine learning that moves beyond deep learning toward greater optimality. It uses program synthesis and a new optimization method called symbolic descent to create concise, efficient, and generalizable symbolic models from data.
Mentioned in this video
An open-source project that has gained significant traction, reaching 40,000 stars.
An open-source deep learning library developed by François Chollet, noted for its simple API, usability, and community building.
An influential machine learning library for Python, which inspired Keras with its ease of use and accessibility.
The current dominant architecture in AI, which the speaker believes is a temporary stage and not the path to true optimality.
AI systems that have shown surprising success recently, primarily due to their ability to operate in domains with verifiable reward signals like code.
Mentioned as an example of deep learning guiding search, similar to the principles being applied at Ndea for program synthesis.
A research area focused on building new branches of machine learning by creating symbolic models instead of parametric ones, aiming for greater optimality and efficiency.
An alternative AI architecture that builds on the current stack, representing a slightly different approach to AI modeling.
The current dominant approach in machine learning, which the speaker contrasts with their new paradigm, arguing it's nearing its limits for achieving true optimality.
An alternative to gradient descent, designed for finding the simplest possible symbolic models of data, aiming for greater conciseness and generalization.
A principle suggesting that the shortest model of the data is the most likely to generalize, which underpins the Ndea approach.
A domain that is well-suited for current AI technology due to its inherent verifiable reward signals, as are other formally verifiable domains.
An alternative AI approach that the speaker believes has significant potential and could be scaled up to achieve exciting results, potentially even enabling new scientific discoveries.
Mentioned as an example of a game where OpenAI's models (OpenAI Five) were trained for extensive periods, highlighting differences with ARC's approach.
DeepMind's early work in 2013 on solving these games using deep reinforcement learning is cited as a pioneering effort in using AI for game-playing.