AI Dev 26 x SF | David Park: Building Production Grade Agentic Systems with ADE
Key Moments
Agentic systems can extract structured data from complex documents, but visual grounding at a pixel level is crucial for auditability in regulated industries, costing $1M per year to scale.
Key Insights
Agentic Document Extraction (ADE) provides a foundational layer for agentic systems, powering context engineering and multimodal pipelines across industries.
The ADE platform offers three core APIs: parse for understanding document layout, extract for pulling specific fields based on a schema, and split for classifying and separating different document types within a packet.
Visual grounding, which highlights the exact source of extracted information down to the pixel level, is presented not just as a feature but as a contractual requirement for auditability and traceability in production systems, especially in financial services.
A five-agent pipeline for document processing includes agents for parsing, splitting, field extraction, decision-making, and managerial review, orchestrated using tools like Google ADK.
Best practices for building agentic systems involve a four-layer approach: deterministic shells around stochastic cores at the agent level, enforced contracts for inter-agent communication, controlled execution in the orchestration layer, and grounded, validated data in the context layer.
Case studies show that implementing ADE in banking led to up to a 60% reduction in manual review time, saving millions annually, while a pharmaceutical company achieved a 2x productivity gain and 50% faster dossier generation.
Documents as the foundational context for agentic systems
Documents are an often-overlooked yet critical starting point for many real-world AI systems, serving as the source of truth for crucial decisions. David Park from Landing AI emphasizes that Agentic Document Extraction (ADE) acts as a foundational layer within larger agentic architectures. The technology powers context engineering, enables multimodal pipelines, and supports the orchestration of complex processes across industries. The core idea is that documents of many kinds, including PDFs, Word documents, and even Excel files, can be transformed into structured data through ADE. This structured data then becomes the input from which agents make decisions that are defensible and auditable, a necessity in heavily regulated sectors like financial services, healthcare, and life sciences. The pattern of extracting structured data from unstructured documents and applying business rules for explainable outcomes is versatile and applies to use cases such as KYC, contract review, equity research, and insurance claims processing.
Landing AI's agentic document extraction platform
The Agentic Document Extraction (ADE) platform from Landing AI is built on a purely visual, document-pretrained transformer, which lets it understand document flow, structure, and semantics. It offers three core APIs: Parse, which identifies and processes everything in a PDF; Extract, which lets users define a schema (e.g., 10-15 fields for legal or lending) to pull specific data points, thereby structuring unstructured input; and Split, which classifies and separates the different document types within a single packet (such as a loan packet containing an income statement and a bank statement) based on defined rules and content. Beyond these core functions, ADE is integrating agentic tools such as PII identification, data redaction, and summarization agents.
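To make the idea of a schema-driven Extract step concrete, here is a minimal sketch using only the Python standard library. It is not the ADE API: `FieldSpec`, `LOAN_SCHEMA`, `validate_extraction`, and all field names are hypothetical, chosen only to illustrate how a small lending schema could be declared and how an extracted record could be checked against it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One field in a hypothetical extraction schema."""
    name: str
    type_: type
    required: bool = True


# Hypothetical lending schema; the field names are illustrative only.
LOAN_SCHEMA = [
    FieldSpec("borrower_name", str),
    FieldSpec("annual_income", float),
    FieldSpec("loan_amount", float),
    FieldSpec("co_borrower_name", str, required=False),
]


def validate_extraction(schema: list[FieldSpec], extracted: dict) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for spec in schema:
        value = extracted.get(spec.name)
        if value is None:
            if spec.required:
                problems.append(f"missing required field: {spec.name}")
            continue
        if not isinstance(value, spec.type_):
            problems.append(
                f"{spec.name}: expected {spec.type_.__name__}, "
                f"got {type(value).__name__}"
            )
    # Fields outside the schema are reported rather than silently accepted.
    known = {spec.name for spec in schema}
    for name in sorted(set(extracted) - known):
        problems.append(f"unexpected field: {name}")
    return problems
```

A record that satisfies the schema yields an empty problem list, so downstream agents can treat "no problems" as the condition for accepting the handoff.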
The critical role of visual grounding for auditability
A key differentiator and requirement for production systems, particularly in financial services, is visual grounding. This ADE feature highlights the exact location of every piece of extracted information, down to the pixel level, whether in tables or in running text. This ensures full traceability and auditability, which is non-negotiable for large banks and other regulated entities. In contrast to relying solely on LLMs, which can introduce unpredictability, ADE uses a proprietary method for generating confidence scores and ensures that every value can be traced back to its source document. This matters because in regulated industries, decisions must be defensible six months or even years later. Auditability features like visual grounding should be built in from the outset, as they are very difficult to retrofit into a system later.
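One way to picture a visually grounded value is as a record that carries its own provenance. The sketch below is an assumption about what such a record might contain (document, page, pixel bounding box, confidence); the class names and fields are hypothetical and do not reflect ADE's actual response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Pixel coordinates on the rendered page, origin at the top-left."""
    left: int
    top: int
    right: int
    bottom: int

    def __post_init__(self):
        if not (0 <= self.left < self.right and 0 <= self.top < self.bottom):
            raise ValueError("bounding box must be non-negative with positive area")


@dataclass(frozen=True)
class GroundedValue:
    """An extracted value together with where it came from (hypothetical shape)."""
    field_name: str
    value: str
    document_id: str
    page: int
    box: BoundingBox
    confidence: float

    def __post_init__(self):
        if self.page < 1:
            raise ValueError("page numbers start at 1")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def audit_line(self) -> str:
        """Render a one-line audit trail entry for this value."""
        b = self.box
        return (
            f"{self.field_name}={self.value!r} from {self.document_id} "
            f"p.{self.page} box=({b.left},{b.top},{b.right},{b.bottom}) "
            f"conf={self.confidence:.2f}"
        )
```

Because the record is immutable and validated on construction, an audit log built from `audit_line()` entries cannot silently contain a value without a source location.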
Structuring context for predictable agent decisions
Instead of simply feeding raw documents to agents and expecting them to figure everything out, ADE provides structured, grounded context. This 'context engineering' allows agents to operate deterministically and predictably. The system parses, splits, and extracts relevant fields from documents, creating a structured dataset. This structured context is then fed to downstream agents, ensuring they are not operating in a vacuum. For many use cases, visual grounding is not merely a feature but a contractual obligation to ensure auditability. By grounding the data and providing structured context, the risk of errors related to inaccurate information extraction or LLM hallucinations is significantly reduced, paving the way for more reliable agentic workflows.
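As an illustration of handing agents structured context rather than raw documents, the following sketch separates extracted fields into a trusted context payload and a list of fields that need review. The function name `build_context`, the dictionary keys, and the 0.9 threshold are all assumptions made for this example.

```python
import json


def build_context(fields: list[dict], min_confidence: float = 0.9) -> tuple[str, list[str]]:
    """Split extracted fields into trusted context (as JSON) and names needing review.

    Each input dict is assumed to have the keys: name, value, page, confidence.
    """
    trusted = {}
    needs_review = []
    for f in fields:
        if f["confidence"] >= min_confidence:
            # Keep the page reference so the value stays traceable downstream.
            trusted[f["name"]] = {"value": f["value"], "page": f["page"]}
        else:
            needs_review.append(f["name"])
    return json.dumps(trusted, sort_keys=True), needs_review
```

The downstream agent then only ever sees values that cleared the threshold, and the review list makes the excluded fields explicit instead of dropping them silently.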
An agentic pipeline for automated document review and decision-making
A typical agentic system built around ADE might involve a five-agent pipeline. This pipeline begins with parsing and splitting documents simultaneously. Field extraction is then performed using a defined schema. A decision agent makes an initial determination based on the extracted data and business logic. This is followed by a manager agent that reviews the decision and supporting evidence, allowing for human oversight in high-stakes cases. A chat layer, often integrated with Retrieval-Augmented Generation (RAG), allows users to interact with the results naturally, interrogating decisions and eliciting reasoning. For orchestration, tools like Google ADK are used, providing structured agent execution and context management. Claude is employed for the reasoning and interaction layer, explaining decisions and enabling natural language dialogues. The overall architecture separates the deterministic parts of the system (data processing, rule application) from the stochastic "brains" (LLMs), ensuring reliability and controlled execution.
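The flow described above can be sketched with plain functions standing in for the five agents. This is a toy under stated assumptions, not the talk's implementation and not the Google ADK API: every function body is a stub, the form-feed page separator and `key: value` lines are invented input conventions, and the "loan at most 4x income" rule is a hypothetical business rule. Parsing and splitting run concurrently, as in the description.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_agent(packet: str) -> dict:
    # Stand-in for layout parsing: count form-feed separated pages.
    return {"pages": packet.count("\f") + 1}


def split_agent(packet: str) -> list[str]:
    # Stand-in for classifying and splitting the packet into sections.
    return [section.strip() for section in packet.split("\f")]


def extract_agent(sections: list[str]) -> dict:
    # Stand-in for schema-driven extraction: read "key: value" lines.
    fields = {}
    for section in sections:
        for line in section.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
    return fields


def decision_agent(fields: dict) -> dict:
    # Hypothetical explicit business rule: loan may be at most 4x income.
    income = float(fields["annual_income"])
    loan = float(fields["loan_amount"])
    verdict = "approve" if loan <= 4 * income else "deny"
    return {"decision": verdict, "reason": f"loan/income ratio {loan / income:.2f}"}


def manager_agent(decision: dict, parsed: dict) -> dict:
    # The manager owns the final call; here it escalates anything not approved.
    final = "approve" if decision["decision"] == "approve" else "escalate"
    return {**decision, "final": final, "pages_reviewed": parsed["pages"]}


def run_pipeline(packet: str) -> dict:
    # Parse and split concurrently, then run the remaining stages in order.
    with ThreadPoolExecutor(max_workers=2) as pool:
        parsed_future = pool.submit(parse_agent, packet)
        split_future = pool.submit(split_agent, packet)
        parsed = parsed_future.result()
        sections = split_future.result()
    fields = extract_agent(sections)
    decision = decision_agent(fields)
    return manager_agent(decision, parsed)
```

Note that only the stages a real system would back with a model are stubbed here; the business rule and the control flow are ordinary deterministic code, which is the separation the architecture aims for.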
Hierarchical agent structure for ownership and control
The system employs a hierarchical agent structure, featuring a manager agent at the top with specialized agents reporting to it. This design is crucial in production environments because an agent must ultimately 'own' the final decision. The manager agent is empowered to approve, deny, or escalate to human review, while subordinate agents focus on producing evidence to support these outcomes. Orchestration, handled by tools like Google ADK, manages how agents exchange context and execute tasks sequentially or in parallel. This approach ensures a clear control flow, preventing the unpredictable emergent behavior sometimes seen in LLMs. By separating upstream agents (handling parsing and extraction) from downstream agents (applying business logic to structured data), and confining model calls to specific interfaces with validation and retries, the system achieves greater predictability, reliability, and cost-efficiency by keeping most operations deterministic.
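A minimal sketch of a manager agent that owns the final outcome might look like the following. The `Outcome` and `Evidence` types, the `manager_decide` function, and the 0.85 confidence threshold are hypothetical; the point is only that subordinate agents supply evidence while a single deterministic function decides between approve, deny, and escalate.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    APPROVE = "approve"
    DENY = "deny"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Evidence:
    """A finding produced by a subordinate agent (hypothetical shape)."""
    agent: str
    passed: bool
    confidence: float
    note: str


def manager_decide(evidence: list[Evidence], min_confidence: float = 0.85) -> Outcome:
    """Decide the final outcome from subordinate evidence.

    Missing or low-confidence evidence goes to human review; otherwise the
    case is approved only if every check passed.
    """
    if not evidence or any(e.confidence < min_confidence for e in evidence):
        return Outcome.ESCALATE
    if all(e.passed for e in evidence):
        return Outcome.APPROVE
    return Outcome.DENY
```

Keeping this function free of model calls means the same evidence always produces the same outcome, which is what makes the decision reviewable later.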
Best practices: Schemas before prompts and layered hardening
A key principle for building robust agentic systems is 'schemas before prompts.' Instead of starting with prompts and debugging free-form strings, developers should define input and output schemas for each agent. The prompt then becomes the means by which the agent fulfills its schema contract. This approach, facilitated by ADE's field extraction API, makes systems easier to test, reason about, and debug. Beyond this, production-grade agent systems call for four layers of hardening:
1) Agent level: wrap each model in a harness that guards against prompt injection and validates output fields ('deterministic shells around stochastic cores').
2) Agent-to-agent communication: enforce typed handoffs and validation between agents so that contracts are met.
3) Orchestration layer: control execution with retries, backoff strategies, and explicit failure states rather than silent failures.
4) Context and data layer: ground all information with validated ranges, confidence scores, and visual grounding, and use caching to manage context windows and to separate data ingestion from reasoning.
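The 'deterministic shell around a stochastic core' idea, combined with retries, backoff, and an explicit failure state, can be sketched as below. This is an illustrative assumption rather than the speaker's code: `call_with_contract` and `AgentFailure` are hypothetical names, the model is any callable that maps a prompt string to a string, and the output contract is simply a mapping from required field names to Python types.

```python
import json
import time
from typing import Callable


class AgentFailure(Exception):
    """Explicit failure state raised when the output contract is never met."""


def call_with_contract(
    model: Callable[[str], str],
    prompt: str,
    required: dict[str, type],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call the model, validate its JSON output, and retry with exponential backoff."""
    last_error = "no attempts made"
    for attempt in range(max_attempts):
        raw = model(prompt)
        try:
            data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
            if not isinstance(data, dict):
                raise ValueError("output is not a JSON object")
            for key, expected in required.items():
                if not isinstance(data.get(key), expected):
                    raise ValueError(f"field {key!r} missing or not {expected.__name__}")
            return data
        except ValueError as exc:
            last_error = str(exc)
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    # Fail loudly instead of passing unvalidated output downstream.
    raise AgentFailure(f"contract not met after {max_attempts} attempts: {last_error}")
```

Passing `sleep` as a parameter is a design choice for testability: a test can supply a recording function and check the backoff schedule without actually waiting.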
Case studies: Proven value in banking and pharmaceuticals
The practical application of this agentic architecture has yielded significant results across industries. At a tier-one global bank, the system automated complex Know Your Customer (KYC) processes for wealth management clients, handling multilingual and regulatory documents. This resulted in up to a 60% reduction in manual review time, saving hundreds of analyst hours weekly and translating to millions of dollars in annual savings. The system improved both speed and cost-effectiveness while maintaining an audit trail for every decision. Similarly, a Fortune 50 pharmaceutical company used ADE to automate market access submissions, achieving a 2x productivity gain and 50% faster generation of global reimbursement dossiers. This allowed highly skilled research scientists to redirect their efforts to higher-value strategic work, and it also produced millions of dollars in annual impact for a single use case. These examples demonstrate the horizontal applicability of the agentic architecture across different industries and document types.
Common Questions
An agentic system uses AI agents to process and make decisions based on unstructured documents. These systems aim for scalability and production readiness, incorporating elements like state management, retries, and intentional constraints on large language models.
Mentioned in this video
Company developing an agentic document extraction platform for enterprises, particularly in regulated industries like financial services and healthcare.
Platform where the demo was open-sourced, providing users with access to the code.
Company that introduced the concept of separating the brain and the hand in AI systems.
A web framework used for the backend of the agentic system.
A file format that can be processed by the agentic document extraction platform.
An alternative cloud platform that could be used for agent orchestration, alongside Google ADK.
A framework that can be used for agent orchestration, mentioned as an alternative to Google ADK.
An agent orchestration framework mentioned as an alternative to Google ADK.
A data validation library used with FastAPI for the backend.
A common document format that can be processed by the agentic document extraction platform.
A document format that can be processed by the agentic document extraction platform.
LLM used for reasoning, interaction layer, explaining decisions, and enabling natural language interaction.
Programming language used to implement the explicit business rules within the agentic system.