Is AI About to “Eat Everything”? (It’s Not.)
Key Moments
AI progress on programming tasks is accelerating thanks to better models and sophisticated 'coding harnesses,' but this reflects gains in one specific application, not general intelligence growth, and does not signal an impending AI takeover.
Key Insights
The METR chart tracks the longest duration software tasks AI models, when combined with coding harnesses, can complete with at least 50% success, not general AI capability.
AI model improvement shifted from pre-training scaling in 2024 to post-training and tuning for specific tasks like programming, leading to recent performance gains.
The recent exponential-like increase on the METR chart is significantly driven by the development of complex, hand-coded 'coding harnesses' and expert systems, not just LLM advancements.
The METR chart's task durations are abstract measures of difficulty for 'low context' programmers, not precise indicators of what high-context professionals can achieve.
Progress in AI applications is better modeled as exploring navigable 'tributaries' (specific applications) rather than a general 'water level' rise, meaning progress in one area doesn't predict progress in others.
Transhumanist and existential risk communities, driven by extrapolating exponentials, have unduly influenced the discourse around AI, leading to exaggerated fears of an AI 'eating everything' scenario.
Understanding the METR time horizon chart
Recent online discourse, including a roundup of concerned responses by Gary Marcus, has seized upon the METR (Model Evaluation & Threat Research) time horizon chart, interpreting its upward trend as evidence of an imminent "intelligence explosion" and of AI's tendency to "eat everything." The chart, whose data points rise sharply from 2025 onwards, has fueled sensationalist tweets claiming that AI power is doubling rapidly and that human input will soon become a liability. These interpretations often compare the METR chart to graphs predicting the rise of artificial superintelligence (ASI), creating a sense of urgency and unease. This summary critically examines what the METR chart actually measures and what its trends signify, debunking the more extreme claims.
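The "doubling rapidly" claims can be made concrete. The sketch below shows the naive exponential extrapolation such takes rely on; the starting horizon and doubling period are illustrative assumptions, not METR's published figures:

```python
def extrapolated_horizon(h0_hours, doubling_months, months_ahead):
    """Project a time horizon forward under a constant doubling period.

    This is the naive compound-growth extrapolation behind the
    sensationalist readings of the chart, not a forecast.
    """
    return h0_hours * 2 ** (months_ahead / doubling_months)

# Illustrative numbers only: a 12-hour horizon doubling every 7 months
# grows 64-fold over 42 months (six doublings).
print(extrapolated_horizon(12, 7, 42))  # → 768.0
```

The whole dispute is over whether that doubling period stays constant once the low-hanging fruit of post-training and harness engineering is picked; the formula itself just assumes it does.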
What the METR chart actually measures
Cal Newport clarifies that the METR chart does not measure the general capability of AI models. Instead, it focuses on a specific suite of well-defined software tasks that can be solved by writing or analyzing computer code. For each task, human programmers were timed, and the geometric mean of their completion times was recorded to label the task's 'human duration.' Subsequently, large language models (LLMs) combined with 'coding harnesses' (programs that help the LLM solve challenges, similar to Claude Code or Cursor) are evaluated. The chart plots each model against the *longest duration task* it could complete successfully at least 50% of the time, correlating this with the model's release date. For instance, a model plotting at '12 hours' means it can complete a specific coding task that took humans, on average, 12 hours to finish, at least half the time. This is a specific benchmark for programming tasks, not a universal measure of AI's potential.
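The measurement described above can be sketched in a few lines. The data and helper names below are hypothetical, and this simplification treats the horizon as the longest task with at least 50% per-task success; METR's actual methodology fits a curve to the success data, which this sketch does not attempt:

```python
import math

def human_duration(times_minutes):
    """Label a task with the geometric mean of human solve times,
    as the summary describes METR doing."""
    logs = [math.log(t) for t in times_minutes]
    return math.exp(sum(logs) / len(logs))

def time_horizon(task_results):
    """task_results: (human_minutes, successes, attempts) per task.

    Return the longest human-duration task the model completes at
    least 50% of the time, or None if no task qualifies.
    """
    qualifying = [m for m, s, a in task_results if a > 0 and s / a >= 0.5]
    return max(qualifying, default=None)

# Hypothetical evaluation data for one model.
tasks = [
    (5, 10, 10),    # five-minute task, always solved
    (60, 7, 10),    # one-hour task, 70% success
    (240, 5, 10),   # four-hour task, exactly 50%
    (720, 2, 10),   # twelve-hour task, 20% success
]
print(time_horizon(tasks))       # → 240
print(human_duration([10, 40]))  # ≈ 20.0
```

Note that the geometric mean damps the effect of one human taking wildly longer than the rest, which matters given how variable programmer solve times are.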
The limitations of the measured durations
Crucially, the specific numerical durations on the chart are not precise indicators of AI capability relative to human professionals. METR itself acknowledges the difficulty of assigning precise meaning to these times. The 'human time duration' can include significant overhead for understanding the task, learning new techniques, or researching unfamiliar concepts. METR specifies that its time horizon is closer to what a 'low context' person (like a new hire or remote contractor) can accomplish than to what a high-context professional can do in their daily job. These durations are therefore best viewed as abstract measures of programming-task difficulty, indicating that a model can tackle a task of a certain complexity, rather than signifying that it can perform X hours of a human's work.
The shift from pre-training to post-training and harnesses
The dramatic upturn on the METR chart, particularly from late 2024 onwards, is explained by a fundamental shift in AI development strategy. For years the focus was on pre-training LLMs: long, expensive runs over massive datasets that imbue models with general knowledge. This approach, while improving general capability (e.g., GPT-2 to GPT-4), hit a wall around the summer of 2024, when simply scaling up pre-training yielded diminishing returns in obvious new capabilities. This led to a pivot toward post-training: taking pre-trained models and fine-tuning them on narrow, high-quality datasets using techniques like reinforcement learning. Computer programming emerged as a prime target for post-training because of its structured, verifiable nature, and this fine-tuning improved the LLMs' ability to generate longer, more coherent, and correct code. Concurrently, significant effort went into developing sophisticated 'coding harnesses': programs that integrate LLMs with tools for planning, execution, and verification, mirroring professional developer workflows. These harnesses often incorporate substantial hand-coded logic and 'expert systems,' drawing on decades of programming expertise.
The role of coding harnesses in recent gains
The exponential-like leap on the METR chart is not solely due to LLM improvements; it is heavily driven by the advancement of these coding harnesses, especially from late 2025 into early 2026. The harnesses act as sophisticated scaffolding, enabling LLMs to tackle multi-step programming tasks that require planning, debugging, and interaction with development environments. The leak of Anthropic's Claude Code source code revealed the extensive human effort and traditional AI techniques embedded within such harnesses. The combination of fine-tuned LLMs, better at planning and code generation, with these robust hand-coded harnesses has created a powerful synergistic effect. This breakthrough is a significant commercial success, demonstrating that specific, economically viable applications like professional-grade programming tools can be built on AI technology.
The 'tributary' model versus 'rising water'
To counter the 'AI eats everything' narrative, Newport proposes a better mental model for AI progress: that of a river with navigable 'tributaries.' Instead of a general 'water level' rising to solve all problems (the 'rising water' model), AI progress is about identifying and exploring specific application areas (tributaries). Progress in one tributary, like software development where significant effort has been invested in custom tools and harnesses, does not automatically imply similar navigable pathways exist in unrelated areas (e.g., email management, which may prove to be much shallower or filled with rapids). This 'tributary' model highlights that the development of useful AI applications is a hard exploration process, requiring custom tools and significant effort, and success in one area is specific rather than generalizable.
The influence of transhumanism and existential risk communities
The exaggerated fears surrounding AI are also attributed to the influence of the transhumanist and existential risk (x-risk) communities. These groups, often intersecting with rationalists, tend to see the world through the lens of exponentials and their potential for radical societal transformation – either utopian or dystopian. They are drawn to the perceived exponential growth in AI capabilities, extrapolating current trends to predict inevitable AGI or ASI and significant societal upheaval. This worldview, rooted in eschatological thinking, shapes their interpretation of data like the METR chart as evidence of impending doom or salvation. This influential, albeit extreme, perspective has seeped into the discourse surrounding AI, contributing to widespread anxiety and the sensationalist narrative of AI 'eating everything'.
A call for a more grounded approach to AI
Newport argues that AI companies need to distance themselves from these cult-like communities and their extreme rhetoric. Instead of framing AI progress in terms of existential threats or utopian transcendence, companies should focus on clearly communicating the practical benefits and limitations of their tools. Just as the advent of electric cars was met with clear-eyed assessment of their utility, AI tools, including advanced programming assistants, should be discussed pragmatically. The METR chart, while impressive in its demonstration of progress in software development tools, says 'nothing about the fate of humanity or AI more generally.' The call is to treat AI as a technology, celebrating its useful applications without falling into the trap of wild extrapolation or succumbing to the anxieties fueled by fringe ideologies.
Common Questions
What does the METR chart measure?
The METR chart measures the duration of software tasks that large language models (LLMs) combined with coding harnesses can complete successfully at least 50% of the time, using human task completion time as the benchmark.
Mentioned in this video
An organization that released an AI Safety and Evaluation update, including a famous AI time horizon chart.
A large language model that represents advancements in AI capabilities, used as a reference point in the Epoch Capabilities Index.
An early large language model in the progression of models, existing before the significant jumps in capability seen in later models.
An AI model that marked the first point where tasks could be completed on the METR chart, indicating a move beyond pure pre-training.
An influential figure in tech and AI, mentioned in the context of individuals who need to distance AI company narratives from cult-like communities.
An individual who rounded up concerned responses to METR's latest AI time horizon chart in his newsletter.
Mentioned as an analogy for exploring new territories, like navigating river tributaries, when discussing AI application development.
A key figure in the AI industry who, along with others like Sam Altman and Elon Musk, is urged to distance themselves from transhumanist and existential risk communities.
A prominent figure in the AI industry, called upon to separate AI company messaging from extreme communities like transhumanists and existential risk proponents.
His work on exponentials influenced the transhumanist movement, leading to ideas about exponential increases in computing power and uploading consciousness.