AI Dev 26 x SF | Amrita Venkatraman: 3rd Era of Software Development
Key Moments
AI agents can now work autonomously and in parallel. Software development is moving beyond simple autocomplete toward delegating complex tasks to agents that verify their own work, which could free developers to focus on imagination and ideas.
Key Insights
AI coding tools have evolved through three eras: next-token prediction (tab completion), agentic coding (developers steering AI), and now agent systems where parallel, autonomous agents collaborate on tasks.
Over 60% of code lines in enterprise products are now changed by AI, raising concerns about code quality, bug introduction, and managing AI-generated code reviews.
Cursor offers a range of models (Anthropic, Gemini, OpenAI, and its own) and allows developers to delegate specific tasks to specialized agents trained for particular functions, like prototyping or high-level reasoning.
Cloud agents run on remote VMs, allowing for autonomous work even when a developer's laptop is closed, and can perform self-verification using their own 'computer and mouse' to test their work, generating video artifacts of the process.
The 'automations' feature builds on cloud agents, enabling scheduled or event-triggered workflows, such as drafting post-mortems from incident data or cleaning up feature flags, integrated with tools like PagerDuty, DataDog, and GitHub.
A system of agents, running uninterrupted for one week, generated approximately three million lines of code and successfully built a functional browser from scratch, showcasing the potential for tackling large, ambitious projects.
From keystrokes to conversational teammates: The evolution of AI in coding
Software development began as manual work in editors like Sublime Text, where developers decomposed a change into logic and translated it into code by hand. Amrita Venkatraman describes how AI has since reshaped that work across three distinct 'eras.' The first, 'tab' completion, relied on next-token prediction to offer simple code suggestions. The second, 'agentic coding,' put the AI in the driver's seat while developers steered and corrected its actions. The third era, now beginning, is defined by agent systems: teams of cloud-based agents that collaborate, work in parallel on dedicated virtual machines, and take on increasingly autonomous tasks, acting as persistent teammates rather than one-off assistants. Developers can communicate with these systems as they would with colleagues instead of personally managing millions of lines of code across thousands of files. The promise is that developers will be constrained only by their imagination and ideas, not by time or capacity.
The rise of agentic development and enterprise adoption
The transition from simple code completion to more capable AI assistance has been rapid. Around the first quarter of last year, models became significantly more usable for coding tasks. 'Tab' completion dominated at first, but by late last year agent requests had surpassed tab accepts, a sign that developers were shifting to an agentic workflow. 'Tab' still has its niche, particularly for ML engineers working in Python or Jupyter environments, but agent-driven development has largely taken over. Enterprise adoption mirrors this trend: over 60% of code lines in enterprise products are now influenced by AI. That level of integration raises hard questions about how to measure code quality, how to prevent bugs, and whether engineers can realistically review the volume of code that AI generates. These questions grow more pressing as AI becomes more integral to the software development lifecycle (SDLC).
Orchestrating AI teams: Delegating tasks with specialized models
Cursor lets developers work with teams of agents organized much like human teams: a planner agent delegates tasks to worker agents or sub-planners. A key part of this system is access to a wide range of AI models, including those from frontier labs such as Anthropic, Google (Gemini), and OpenAI, alongside Cursor's in-house models. Models are not interchangeable; they are trained differently and excel at different tasks. Cursor's own models are trained primarily on code and are well suited to rapid prototyping, iteration, debugging, and code exploration. More general models from other providers are better suited to high-level reasoning and planning. Developers can select a model and delegate according to what each task requires. Amrita notes her personal preference for using GPT-55 for strategizing and then delegating to Cursor's Composer 2 models for code generation. The design team at Cursor, for example, favors Gemini models. Matching tasks to models in this way plays to each model's strengths and makes development more efficient.
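The planner-and-worker pattern with per-task model selection can be sketched in a few lines of Python. This is an illustrative sketch only, not Cursor's implementation or API: the task kinds, the model labels in ROUTES, and the functions route, plan, and delegate are all hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Hypothetical task kinds and model labels. These are illustrative
# placeholders, not identifiers from Cursor or any model provider.
ROUTES = {
    "planning": "general-reasoning-model",
    "coding": "code-specialized-model",
}


@dataclass
class Task:
    kind: str
    description: str


def route(task: Task, default: str = "general-reasoning-model") -> str:
    """Pick a model label for a task based on its kind."""
    return ROUTES.get(task.kind, default)


def plan(goal: str) -> list[Task]:
    """Stand-in planner: split a goal into one planning step and two coding steps."""
    return [
        Task("planning", f"Outline an approach for: {goal}"),
        Task("coding", f"Implement: {goal}"),
        Task("coding", f"Write tests for: {goal}"),
    ]


def delegate(goal: str) -> list[tuple[str, str]]:
    """Pair every planned task with the model label chosen for it."""
    return [(route(task), task.description) for task in plan(goal)]


assignments = delegate("add a search bar")
for model, description in assignments:
    print(f"{model}: {description}")
```

In a real system the planner would itself be a model call and each assignment would start a worker agent; here both are reduced to plain functions so the routing logic stays visible.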
Cloud agents: Persistent, autonomous workers in a distributed world
The 'third era' of software development is marked by a shift from local agent execution to cloud-based agents. Cursor's cloud agents run on remote virtual machines (VMs), either on Cursor's infrastructure or self-hosted. Agents can therefore keep working after a developer closes their laptop, which suits end-of-day tasks and lets distributed teams in different time zones pick up where others left off. Cloud agents also operate 'computers of their own,' exercising the functionality they built by hand without explicit human direction. The resulting 'cloud agent artifacts' capture the agent's actions, often as video, and provide a self-verified record of its work, much like a colleague showing their work. This matters for automated testing and verification, since it helps confirm that AI-generated changes are correct before integration. Cursor has seen exponential growth in the use of cloud agents, with projects such as implementing network policy controls for sandbox processes involving up to 10,000 lines of code written by agents and then reviewed by humans. Cloud agents also automate tasks such as organizing commits for pull requests and streamlining code reviews.
Building complex systems: The browser from scratch example
Agent systems can now take on ambitious projects that were previously out of reach for automated tooling. In one example, a system of agents ran uninterrupted for a week and generated approximately three million lines of code. The result was a functional browser built from scratch, a task known to be extremely complex. The example shows how agent systems, by delegating across multiple workflows, can tackle long and intricate projects that were once the exclusive domain of highly skilled human teams.
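To put the figures in perspective, the average throughput they imply can be worked out directly. The calculation below assumes exactly three million lines and a continuous seven-day run; both are rounded figures from the talk, and the result is an aggregate across all agents in the system, not a per-agent rate.

```python
# Rounded figures from the talk, treated as exact for this estimate.
total_lines = 3_000_000
hours = 7 * 24  # one week of uninterrupted running

lines_per_hour = total_lines / hours
lines_per_second = lines_per_hour / 3600

print(round(lines_per_hour), round(lines_per_second, 1))  # prints: 17857 5.0
```

That is roughly five lines of code per second, sustained for the whole week.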
Seamless integration: Slack bots, planning, and self-verification artifacts
Cursor integrates AI agents into everyday developer workflows in several ways. A Slack integration lets users request tasks in a direct message. In the demonstration, an agent identifies the correct repository on its own, updates the public documentation, and notifies the user when it is done. The whole process runs on a remote VM, so the developer does not need to stay involved. Another demonstration pairs local feature planning with cloud execution. Developers create a plan, using sub-agents such as a 'code explorer' and a 'devil's advocate' for detailed analysis and risk assessment, and then delegate the implementation to a cloud agent. These cloud agents can also produce 'artifacts': videos or screenshots of their work. In this self-verification step, the agent tests its own code through simulated user interactions, which gives tangible proof that the feature works and allows quick iteration on user feedback. For example, when asked to add a search bar to a changelog website, an agent implemented it, tested it, and returned video evidence of the search working correctly.
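The self-verification idea can be illustrated with a small sketch: run a set of checks against the finished work and record each step so that a reviewer can see what was tested. Everything here is hypothetical. The Artifact class, self_verify, and the toy search function stand in for the agent's real work, and a text log stands in for the video or screenshot artifacts described above.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Artifact:
    """A reviewable record of what was checked and whether it passed."""
    steps: list[str] = field(default_factory=list)
    passed: bool = False


def self_verify(checks: dict[str, Callable[[], bool]]) -> Artifact:
    """Run every named check, log its outcome, and summarize the result."""
    artifact = Artifact()
    results = []
    for name, check in checks.items():
        ok = check()
        artifact.steps.append(f"{name}: {'ok' if ok else 'FAILED'}")
        results.append(ok)
    # An empty set of checks proves nothing, so it does not count as passing.
    artifact.passed = bool(results) and all(results)
    return artifact


# Toy stand-in for the feature under test: searching changelog entries.
CHANGELOG = ["Added dark mode", "Fixed login bug", "Improved search speed"]


def search(entries: list[str], query: str) -> list[str]:
    """Return entries containing the query, ignoring case."""
    q = query.lower()
    return [entry for entry in entries if q in entry.lower()]


artifact = self_verify({
    "finds matching entry": lambda: search(CHANGELOG, "login") == ["Fixed login bug"],
    "is case-insensitive": lambda: len(search(CHANGELOG, "SEARCH")) == 1,
    "returns nothing for no match": lambda: search(CHANGELOG, "zzz") == [],
})
print(artifact.passed)
for step in artifact.steps:
    print(step)
```

Recording every step, not only the final verdict, is what makes the artifact useful for review: a failed check shows where to look.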
Automating the mundane: Scheduled and event-driven workflows
Building on cloud agents, Cursor introduces 'automations,' which are essentially scheduled or event-triggered cloud agents. They are designed to automate tedious or time-consuming parts of a developer's job. One example drafts post-mortems for incidents: it is triggered by tools like PagerDuty and integrates with services like Atlassian and DataDog to pull in relevant logs and charts. Another cleans up feature flags, a task that often becomes complex and error-prone. Automations can be configured from templates or with custom natural language prompts, and they can be triggered by events such as Git operations (PR opened or closed), Slack messages, or alerts from systems like Sentry or DataDog. The output can be a PR ready for review, which reduces manual overhead and lets developers focus on creative and strategic work.
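An event-triggered automation can be modeled as a mapping from event names to natural language prompts. The sketch below is an assumption-laden illustration, not Cursor's configuration format: the Automation and Registry classes, the event names such as "incident.resolved", and the payload field incident_id are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Automation:
    name: str
    trigger: str  # hypothetical event name, e.g. "incident.resolved"
    prompt: str   # natural language instruction, may contain {placeholders}


class Registry:
    """Maps trigger events to the automations that should run for them."""

    def __init__(self) -> None:
        self._by_trigger: dict[str, list[Automation]] = {}

    def register(self, automation: Automation) -> None:
        self._by_trigger.setdefault(automation.trigger, []).append(automation)

    def dispatch(self, event: str, payload: dict[str, str]) -> list[str]:
        """Return the filled-in prompts that would be handed to a cloud agent."""
        matching = self._by_trigger.get(event, [])
        return [a.prompt.format(**payload) for a in matching]


registry = Registry()
registry.register(Automation(
    name="draft-postmortem",
    trigger="incident.resolved",
    prompt="Draft a post-mortem for incident {incident_id} using its logs and charts.",
))
registry.register(Automation(
    name="flag-cleanup",
    trigger="schedule.weekly",
    prompt="Find feature flags that are fully rolled out and open a PR removing them.",
))

jobs = registry.dispatch("incident.resolved", {"incident_id": "INC-123"})
print(jobs)
```

Scheduled and event-driven automations share one mechanism here: a schedule is treated as another event source, so a weekly timer would call dispatch("schedule.weekly", {}).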
Common Questions
The third era of software development, as discussed by Amrita Venkatraman, is characterized by agent systems where teams of AI agents work autonomously and collaboratively, akin to human teams. This allows developers to focus more on ideas rather than being constrained by time or capacity.
Mentioned in this video
A leading AI research company whose models are available within Cursor for various tasks.
Sentry: An error tracking and performance monitoring tool that can serve as a trigger for Cursor automations.
GitHub: A platform for code hosting and version control, integrated with Cursor for PR management and agent workflows.
PagerDuty: An incident management platform that Cursor's automations can integrate with to trigger post-mortems.
DataDog: A monitoring and analytics platform whose logs and alerts can be integrated with Cursor automations.
Atlassian: A software company whose products, like Confluence, integrate with Cursor automations for creating post-mortem documents.
A DevOps platform that can be used as a trigger for Cursor automations, similar to GitHub.
Slack: A communication platform that integrates with Cursor, allowing users to interact with agents and trigger workflows.
Sublime Text: An early code editor that the speaker used before transitioning to more modern development tools.
Used for writing plans due to its strength in strategizing and thinking. It's noted as being better for planning than Composer 2.
A collaboration tool where Cursor plans can be published and shared for team feedback.
Gemini: A family of AI models from Google integrated with Cursor, favored by the design team.
Composer 2: Cursor's in-house model, noted for being faster and better at writing code, used for delegation.
A company whose integration with Cursor is mentioned as important for the future of PR reviews.
Cursor: A company developing an AI-powered code editor designed to assist software engineers.
Refers to early Generative Pre-trained Transformer models that made AI coding usable.
An OpenAI model mentioned in the context of strategizing and thinking, likely referring to GPT-4.
A model that can be used for sub-agents, offering high reasoning abilities at a potentially lower cost.
A productivity and project management tool where Cursor plans can be shared for collaboration.