Key Moments
AI Dev 26 x SF | Paul Everitt: The Shift to Agentic Engineering
Agentic engineering promises AI-driven productivity boosts, but rising costs and trust issues threaten to turn it into a "Challenger disaster" unless it is managed with a renewed focus on core engineering principles.
Key Insights
A 2023 study found that 95% of companies saw little to no durable organizational value from their AI investments. Separately, studies suggest that faster code generation yields only about a 10% overall improvement.
84% of Europeans expressed caution about AI's impact, highlighting potential governance and sovereignty issues distinct from other regions.
Only 3% of developers expressed high confidence in AI-generated code accuracy in 2023, indicating a significant trust gap.
Layoffs have been linked to stock price increases, but phrases like "AI washing" and the emergence of roles like "AI HR managers" suggest a complex job market shift rather than outright elimination.
Agentic engineering aims to build "the thing that builds the thing," shifting the focus from direct coding to systems design, scaffolding, and augmenting human capabilities.
The concept of "red-green testing" for agents, where a failing test is written first, helps define success and guides agents to mimic the desired testing style and outcomes.
The productivity paradox and current industry challenges
Despite a surge of AI advancements, the software engineering industry faces significant challenges in translating individual AI tool productivity gains into durable organizational value. Early insights from a mid-2023 study indicated that 95% of companies failed to extract substantial benefits from their AI investments. While AI can accelerate coding, it contributes to only a fraction of the overall software engineering process. Studies suggest that speed-ups in coding yield improvements closer to 10%, not the often-touted 10x. Code itself was rarely the primary bottleneck; rather, the broader spectrum of software engineering activities presents far greater complexity. This disconnect between perceived AI potential and tangible organizational wins points to a need for a more strategic approach beyond simply generating more code. The industry is grappling with issues beyond raw output, including quality concerns, rising costs, and a fundamental lack of trust in AI-generated results. This creates a "missed opportunity" to leverage AI for genuine innovation rather than just profit margin squeezing.
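The gap between "10x" and "closer to 10%" is what an Amdahl's-law style calculation would predict when coding is only part of the delivery process. The sketch below is illustrative only: the function name `overall_speedup` and the input figures (coding as 20% of total effort, a 2x coding speed-up) are assumptions made for the example, not numbers from the talk.

```python
def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Amdahl's-law estimate of end-to-end speed-up.

    coding_share   -- fraction of total delivery time spent writing code (0..1)
    coding_speedup -- factor by which AI accelerates that coding time (> 0)
    """
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Time left after the speed-up: the untouched work plus the shortened coding.
    remaining = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / remaining


# Hypothetical inputs, chosen only for illustration.
speedup = overall_speedup(coding_share=0.2, coding_speedup=2.0)
print(f"overall speed-up: {speedup:.2f}x")
```

Under these assumed inputs the result is about 1.11x, roughly an 11% improvement. Even an unlimited coding speed-up would cap out at 1.25x when coding is 20% of the work, which is why the rest of the engineering process matters more than raw code output.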
Quality, cost, and trust emerge as critical roadblocks
Quality concerns are paramount: the talk cites a perceived "50% bad" defect rate for AI-generated code and warns of a "Challenger disaster" if humans are removed from the loop, with catastrophic errors such as accidental table drops. The cost of AI tools, specifically token consumption, is becoming unsustainable, with pricing already beginning to shift upwards. Furthermore, trust remains a significant hurdle; a 2023 survey revealed that only 3% of developers had high confidence in the accuracy of AI-generated code. This lack of trust, coupled with the tendency for employees to "game the system" when provided with new tools, contributes to organizational friction. The "dark factory" pattern, where human oversight is minimized, raises alarms, especially in regulated industries and among European workers, 84% of whom express caution. This environment fuels a "fear of being obsolete" among the workforce.
Agentic engineering: a new framework for the AI era
In response to these challenges, the concept of "agentic engineering" is emerging as a proposed solution. Coined and popularized by figures like Andrej Karpathy and Simon Willison, it represents a shift from "vibe coding"—haphazardly relying on AI—to a more disciplined engineering practice. The core idea is to build "the thing that builds the thing," reminiscent of early software engineering principles. This approach reframes the developer's role from direct code creation to systems design, scaffolding, and maximizing human leverage. OpenAI's "harness engineering" model exemplifies this, focusing on building the infrastructure that enables AI to build solutions. The emphasis is on augmenting, not replacing, humans, recognizing that the human element remains critical, albeit potentially a bottleneck to be managed strategically.
Key practices within agentic engineering
Agentic engineering in practice involves several key disciplines. "Spec-driven development" ensures that human input steers AI work and aligns stakeholders. Rigorous "evaluations" determine whether generated code is good, efficient, and within budget, moving beyond simple "button go clicky" checks. That in turn calls for data scientists to analyze model performance, since different LLMs have different characteristics. "Harness engineering," which focuses on owning the AI's memory and tools, is vital. "Tooling" lets agents execute code securely in sandboxes, a capability being developed by companies like Cloudflare. The "red-green testing" pattern is adapted for agents: writing a failing test first sets clear expectations and gives the agent a testing style to mimic, effectively defining success for the AI.
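The red-green pattern can be sketched in a few lines of Python. Everything here is hypothetical and chosen for illustration: the `slugify` function, the test names, and the `run_suite` helper are not from the talk. A human writes the tests first against an unimplemented stub (red), then the agent is asked to supply an implementation that makes the same tests pass (green).

```python
import io
import re
import unittest


def slugify(title: str) -> str:
    """Stub: deliberately unimplemented so the first test run is red."""
    raise NotImplementedError


class SlugifyTest(unittest.TestCase):
    """Written by a human first; defines success and the house testing style."""

    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Agentic Engineering"), "agentic-engineering")

    def test_drops_punctuation(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")


def run_suite() -> bool:
    """Run SlugifyTest quietly and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite).wasSuccessful()


red = run_suite()  # False: the stub raises NotImplementedError


def slugify(title: str) -> str:  # the step an agent would be asked to fill in
    """Lowercase the title and join its alphanumeric words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


green = run_suite()  # True: the implementation now satisfies the tests
print(f"red passed: {red}, green passed: {green}")
```

The failing run matters as much as the passing one: it confirms the tests can actually fail, and the existing test class gives the agent a concrete style to imitate when it adds further cases.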
Rethinking codebases and building for AI collaboration
Working with large legacy codebases calls for a re-evaluation of modularity. Principles from agentic engineering might guide the reorganization of code to support parallel and specialized sub-agents. The development of "QA agents" aims to augment human testers by prepping work or identifying instrumentation needs, leaving the testers free to collect data as they see fit. This integration requires developers to write code that acts as glue, connecting AI capabilities with the outside world. "Observability," encompassing both general system metrics and AI-specific metrics, becomes critical for monitoring these complex interactions. Furthermore, "orchestration" principles are essential for designing coherent, architecturally sound systems, drawing on the system design fundamentals that ensure resilience and maintainability.
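As a rough illustration of what combining system and AI-specific observability might look like, the sketch below records one entry per agent call and aggregates per-agent statistics. The `AgentCall` fields, the `summarize` helper, the agent names, and the sample figures are all assumptions invented for this example; a real setup would more likely emit these values to an existing metrics or tracing system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCall:
    """One recorded agent invocation (all field names are hypothetical)."""
    agent: str
    latency_s: float      # general system metric
    input_tokens: int     # AI-specific metric
    output_tokens: int    # AI-specific metric
    succeeded: bool


def summarize(calls: list[AgentCall]) -> dict[str, dict[str, float]]:
    """Aggregate per-agent call count, total tokens, mean latency, error rate."""
    grouped: dict[str, list[AgentCall]] = defaultdict(list)
    for call in calls:
        grouped[call.agent].append(call)

    summary: dict[str, dict[str, float]] = {}
    for agent, items in grouped.items():
        n = len(items)
        summary[agent] = {
            "calls": n,
            "total_tokens": sum(c.input_tokens + c.output_tokens for c in items),
            "mean_latency_s": sum(c.latency_s for c in items) / n,
            "error_rate": sum(not c.succeeded for c in items) / n,
        }
    return summary


# Made-up sample data for illustration only.
calls = [
    AgentCall("qa-agent", 1.0, 800, 200, True),
    AgentCall("qa-agent", 3.0, 1200, 300, False),
    AgentCall("refactor-agent", 2.0, 500, 500, True),
]
report = summarize(calls)
for agent, stats in report.items():
    print(agent, stats)
```

Keeping token counts next to latency and error rate in the same record makes it possible to relate cost to reliability per sub-agent, which ties back to the budget and trust concerns raised earlier.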
Leadership, culture, and the call to arms
For leaders, managing the shift to agentic engineering requires addressing the "fear of being obsolete" within teams. The message to stakeholders should pivot from simply automating current processes and replacing workers to fostering innovation and augmenting human capabilities. This shift requires a return to the discipline of engineering, emphasizing its scientific and systematic nature, harkening back to foundational principles. Grady Booch advocates for the development of "agentic patterns," urging the community to define this new discipline. The ultimate goal is to move beyond the hype cycle of "vibe coding" and establish agentic engineering as a robust, principled practice, enabling the creation of novel solutions and finding purpose and joy in the work.
AI Impact on Development and Team Sentiment
Data extracted from this episode
| Metric | Finding | Source/Context |
|---|---|---|
| Organizational Value from AI | Limited compared to individual value. | Speaker observation |
| AI Productivity Improvement | 10% (not 10x) | DX study follow-on |
| Defect Rate in AI-Generated Code | 50% bad rate perceived. | Talk on defects |
| Confidence in AI Accuracy (Last Year) | 3% | Developer survey |
| Employee Rebellion Sentiment | Engineers outright rebelling. | Fortune Magazine article |
| Management vs. Engineer Sentiment Gap (Mental Health) | 67 points. | Fortune Magazine article |
| European Trust in AI | 84% skeptical ('Eh'). | European sentiment data |
| Work Enabled by AI (Anthropic) | 27% of work using 'god box' capabilities. | Anthropic example |
| Impact of 'Mega Layoffs' on Stock Price | Stock price goes up. | Observation on industry trend |
| Percentage Involved in Agent-assisted Coding | Most hands raised. | Audience poll |
| Percentage Shipping Agent Code to Production | About two-thirds. | Audience poll |
| Proportion of Large Codebases | 20% (estimated). | Audience poll |
| Number of AI Orchestration Startups | Approx. 500 in Series X. | Speaker estimate |
Common Questions
What is agentic engineering?
Agentic engineering is a paradigm shift in software development in which engineers, instead of just building software, build the systems or "agents" that build the software. It focuses on creating more leverage and augmenting human capabilities rather than just automating current tasks.
Mentioned in this video
A newsletter writer who has been warning about the inability to keep subsidizing AI tokens and the rising costs.
Introduced the concept of 'AI washing' and is seen as a counter-example to the idea that jobs are solely being eliminated, suggesting job creation might be occurring.
Referred to as the 'developer AI developer whisperer' and co-founder of Django, who is concerned about quality issues with AI in production and is writing a book on agentic engineering.
Associated with modern software engineering and definitions that emphasize engineering as more than just coding.
A Nobel Prize winner mentioned for his work on the productivity problem in economics and software engineering, highlighting the missed opportunities with AI.
Creator of UML, who discussed the third golden age of software engineering and emphasized that coding was never the whole picture, also pushing back on agentic engineering being just software engineering.
Author of a blog post on Agentic Engineering, who also discussed 'vibe coding'.
Mentioned among companies that conducted 'mega layoffs' in response to the evolving industry landscape, with stock prices reportedly increasing afterward.
A company where a staff engineer discussed agentic engineering in a taxonomy, contributing to the understanding of the field.
Mentioned alongside Claude in the context of agents needing to run tools and execute code within secure sandboxes.
Mentioned as an example where AI enabled 27% of work that would not have been possible otherwise, contrasting with pure profit-driven AI use.
Listed as a company that underwent 'mega layoffs', a move often associated with a rise in stock price.
Mentioned in the context of 'harness engineering', where they are building the 'thing that builds the thing' and pushing the evolution towards agentic engineering.
The company Paul Everitt works for as a developer advocate, known for its developer tools and being privately held, profitable, and European.
Mentioned as an example where annual performance goals were missed in three months due to AI-related issues, highlighting trust and gaming the system.
Identified as a company that experienced 'mega layoffs', with the speaker noting an increase in stock price as a potential reason.
Mentioned as one of the companies that conducted 'mega layoffs', with the subsequent stock price increase being a debated outcome.
Used as an analogy to describe the unpredictable and exciting journey into the future of software engineering.
A podcast where Simon Willison was featured, discussing topics relevant to agentic engineering and testing strategies.
Cited for reporting on engineers' rebellion and the significant gap between management and engineer sentiment regarding AI and mental health.
Published an article with a taxonomy related to agentic engineering, providing a structured view of the evolving field.
An AI model mentioned as discussing 'code mode' and the capability for agents to run tools securely in sandboxes.
A Python company mentioned in relation to harness engineering and owning memory for agents.
A product from JetBrains, mentioned as an example of a large codebase where modularity concepts for agentic engineering are being explored.
An organization mentioned in relation to a course on spec-driven development and as the potential host for creating the discipline of agentic engineering.
A Rust-based subset of Python developed by Pydantic for running tool code in sandboxes with low latency.
A company developing a Rust-based subset of Python called Monty for running tool code in sandboxes, and also discussed in relation to system and AI observability.