How does Landing AI's platform handle unstructured documents?

Landing AI uses a purely visual, document-pretrained transformer to understand document flow and structure. It offers APIs for parsing documents, extracting specific fields based on a schema, and splitting document packets. Agentic tools such as PII identification and summarization are also being added.

What is 'visual grounding' and why is it important for enterprise applications?

Visual grounding highlights the exact source of information within a document, ensuring traceability and auditability down to the pixel level. This is critical for regulated industries like financial services, where decisions must be defensible.

How does the proposed agentic system architecture differ from typical LLM applications?

The system emphasizes context engineering over prompt engineering, providing structured, grounded context to agents. It separates the LLM (the 'brain') from deterministic logic and execution (the 'hand'), ensuring predictable and auditable outcomes rather than relying solely on the LLM's reasoning.

What role does a manager agent play in the decision-making process?

The manager agent sits at the control layer and has the authority to approve, deny, or escalate decisions. It applies explicit business rules and logic, overriding the LLM's output if necessary (e.g., flagging fraud or missing documents), ensuring human oversight and compliance.

What are the key best practices for building robust agentic systems?

Key practices include: deterministic shells around stochastic cores (wrapping models with safeguards), enforcing contracts between agents, controlling execution flow with retries and clear failure states, and grounding all data with traceability. Schemas should precede prompts in development.

Can agentic systems provide significant ROI in traditional industries?

Yes, case studies show substantial value. A global bank achieved up to 60% reduction in manual review time for KYC processes, saving millions annually. A pharmaceutical company saw a 2x productivity gain and 50% faster dossier generation.

AI Dev 26 x SF | David Park: Building Production Grade Agentic Systems with ADE

DeepLearning.AI

Education | 7 min read | 30 min video

May 21, 2026|412 views|14

TL;DR

Agentic systems can extract structured data from complex documents, but visual grounding at a pixel level is crucial for auditability in regulated industries, costing $1M per year to scale.

Key Insights

Agentic Document Extraction (ADE) provides a foundational layer for agentic systems, powering context engineering and multimodal pipelines across industries.

The ADE platform offers three core APIs: parse for understanding document layout, extract for pulling specific fields based on a schema, and split for classifying and separating different document types within a packet.

Visual grounding, which highlights the exact source of extracted information down to the pixel level, is presented not just as a feature but as a contractual requirement for auditability and traceability in production systems, especially in financial services.

A five-agent pipeline for document processing includes agents for parsing, splitting, field extraction, decision-making, and managerial review, orchestrated using tools like Google ADK.

Best practices for building agentic systems involve a four-layer approach: deterministic shells around stochastic cores at the agent level, enforced contracts for inter-agent communication, controlled execution in the orchestration layer, and grounded, validated data in the context layer.

Case studies show that implementing ADE in banking led to up to a 60% reduction in manual review time, saving millions annually, while a pharmaceutical company achieved a 2x productivity gain and 50% faster dossier generation.

Documents as the foundational context for agentic systems

Documents are the often-overlooked yet critical starting point for many real-world AI systems, serving as the source of truth for crucial decisions. David Park from Landing AI emphasizes that Agentic Document Extraction (ADE) acts as a foundational layer within larger agentic architectures. This technology is vital for powering context engineering, enabling multimodal pipelines, and orchestrating complex processes across various industries. The core idea is that documents, whether structured or unstructured (PDFs, Word docs, or even Excel files), can be transformed into structured data through ADE. This structured data then becomes the input for agents to make decisions that are defensible and auditable, a necessity in heavily regulated sectors like financial services, healthcare, and life sciences. The pattern of extracting structured data from unstructured documents and applying business rules for explainable outcomes is highly versatile and applicable to use cases such as KYC, contract review, equity research, and insurance claims processing.

Landing AI's agentic document extraction platform

The Agentic Document Extraction (ADE) platform from Landing AI is built upon a document-pretrained transformer that is purely visual, enabling it to understand document flow, structure, and semantics. It offers three core APIs: Parse, which identifies and processes everything in a PDF; Extract, which allows users to define a schema (e.g., 10-15 fields for legal or lending) to pull specific data points, thereby structuring unstructured input; and Split, which classifies and separates different document types within a single document packet (like a loan packet containing an income statement and bank statement) based on defined rules and content. Beyond these core functions, ADE is integrating agentic tools such as PII identification, data redaction, and summarization agents to enhance its capabilities.
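To make the Split idea concrete, here is a minimal sketch of rule-based packet splitting. It does not call ADE and is not its API: the rule table, the `Segment` type, and the `classify_page` and `split_packet` helpers are all hypothetical names chosen for illustration, and real classification would rely on the model rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical rule table: document type -> keywords that suggest it.
# Illustrative only; this is not ADE's Split configuration.
SPLIT_RULES = {
    "income_statement": ("net income", "total revenue"),
    "bank_statement": ("account number", "closing balance"),
}


@dataclass
class Segment:
    doc_type: str
    pages: list[int]


def classify_page(text: str) -> str:
    """Return the first document type whose keywords appear on the page."""
    lowered = text.lower()
    for doc_type, keywords in SPLIT_RULES.items():
        if any(keyword in lowered for keyword in keywords):
            return doc_type
    return "unknown"


def split_packet(pages: list[str]) -> list[Segment]:
    """Group consecutive pages of the same type into one segment."""
    segments: list[Segment] = []
    for page_number, text in enumerate(pages, start=1):
        doc_type = classify_page(text)
        if segments and segments[-1].doc_type == doc_type:
            segments[-1].pages.append(page_number)
        else:
            segments.append(Segment(doc_type, [page_number]))
    return segments


# A made-up three-page loan packet.
packet = [
    "Income Statement. Total revenue 1,200. Net income 300.",
    "Bank Statement. Account number 1234.",
    "Closing balance 5,000.",
]
segments = split_packet(packet)
```

With this made-up packet, page 1 becomes an income statement segment and pages 2 and 3 are grouped into one bank statement segment, which is the shape a downstream extraction step would consume.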

The critical role of visual grounding for auditability

A key differentiator and requirement for production systems, particularly in financial services, is visual grounding. This feature within ADE highlights the exact location of every piece of extracted information, even down to a pixel level on tables or within text. This ensures full traceability and auditability, which is a non-negotiable for large banks and other regulated entities. Unlike relying solely on LLMs, which can introduce unpredictability, ADE provides a proprietary method for generating confidence scores and ensures that every value can be traced back to its source document. This is paramount because in regulated industries, decisions must be defensible six months or even years later. Implementing auditability from the outset with features like visual grounding is essential, as it is incredibly difficult to retrofit later into a system.
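One way to picture a grounded value is as a record that carries its own provenance. The sketch below is an assumption about what such a record could look like, not ADE's response format: the `BoundingBox` and `GroundedValue` types, the pixel coordinate convention, and the sample file name are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    # Pixel coordinates on the rendered page image (assumed convention).
    left: int
    top: int
    right: int
    bottom: int


@dataclass(frozen=True)
class GroundedValue:
    field: str
    value: str
    source_file: str
    page: int
    box: BoundingBox
    confidence: float  # assumed to be in the range 0.0 to 1.0


def audit_line(g: GroundedValue) -> str:
    """Format one audit-log entry that ties a value back to its source."""
    b = g.box
    return (
        f"{g.field}={g.value!r} from {g.source_file} p.{g.page} "
        f"box=({b.left},{b.top},{b.right},{b.bottom}) conf={g.confidence:.2f}"
    )


# Made-up example record.
income = GroundedValue(
    field="annual_income",
    value="85000",
    source_file="loan_packet.pdf",
    page=3,
    box=BoundingBox(120, 440, 310, 468),
    confidence=0.97,
)
print(audit_line(income))
```

Storing the page and box alongside every value from the first version of the system is what makes a decision reviewable later; adding these fields after the fact would mean re-extracting every document.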

Structuring context for predictable agent decisions

Instead of simply feeding raw documents to agents and expecting them to figure everything out, ADE provides structured, grounded context. This 'context engineering' allows agents to operate deterministically and predictably. The system parses, splits, and extracts relevant fields from documents, creating a structured dataset. This structured context is then fed to downstream agents, ensuring they are not operating in a vacuum. For many use cases, visual grounding is not merely a feature but a contractual obligation to ensure auditability. By grounding the data and providing structured context, the risk of errors related to inaccurate information extraction or LLM hallucinations is significantly reduced, paving the way for more reliable agentic workflows.
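As a rough illustration of context engineering, the sketch below assembles a structured context from extracted fields instead of passing raw document text to an agent. The field dictionaries, the `build_context` helper, and the 0.8 confidence threshold are assumptions made for this example.

```python
import json


def build_context(fields: list[dict], min_confidence: float = 0.8) -> dict:
    """Keep confident fields as context; list the rest for review."""
    context: dict = {"fields": {}, "needs_review": []}
    for f in fields:
        if f["confidence"] >= min_confidence:
            context["fields"][f["name"]] = {"value": f["value"], "page": f["page"]}
        else:
            context["needs_review"].append(f["name"])
    return context


# Made-up extraction results.
extracted = [
    {"name": "applicant_name", "value": "Jane Doe", "page": 1, "confidence": 0.99},
    {"name": "annual_income", "value": "85000", "page": 3, "confidence": 0.62},
]
context = build_context(extracted)

# The serialized context, not the raw document, is what a downstream agent sees.
prompt_context = json.dumps(context, indent=2)
```

Here the low-confidence income value is withheld from the agent and surfaced for review, so the agent reasons only over values that passed the threshold.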

An agentic pipeline for automated document review and decision-making

A typical agentic system built around ADE might involve a five-agent pipeline. This pipeline begins with parsing and splitting documents simultaneously. Field extraction is then performed using a defined schema. A decision agent makes an initial determination based on the extracted data and business logic. This is followed by a manager agent that reviews the decision and supporting evidence, allowing for human oversight in high-stakes cases. A chat layer, often integrated with Retrieval-Augmented Generation (RAG), allows users to interact with the results naturally, interrogating decisions and eliciting reasoning. For orchestration, tools like Google ADK are used, providing structured agent execution and context management. Claude is employed for the reasoning and interaction layer, explaining decisions and enabling natural language dialogues. The overall architecture separates the deterministic parts of the system (data processing, rule application) from the stochastic "brains" (LLMs), ensuring reliability and controlled execution.
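The shape of such a pipeline can be sketched with plain functions standing in for the five agents. This is not Google ADK code and makes no model calls: every function, the `---` packet separator, and the income threshold are hypothetical stand-ins, and the only point is the control flow (parse and split run concurrently, then extraction, decision, and manager review run in order).

```python
from concurrent.futures import ThreadPoolExecutor


def parse_agent(packet: str) -> list[str]:
    """Stand-in parser: return the non-empty content lines of the packet."""
    return [
        line.strip()
        for line in packet.splitlines()
        if line.strip() and line.strip() != "---"
    ]


def split_agent(packet: str) -> int:
    """Stand-in splitter: count sub-documents separated by '---'."""
    return packet.count("---") + 1


def extract_agent(lines: list[str], doc_count: int) -> dict:
    """Stand-in extractor: read 'key: value' lines into a field dict."""
    fields: dict = {"doc_count": doc_count}
    for line in lines:
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def decision_agent(fields: dict) -> str:
    """Initial determination from a hypothetical income rule."""
    return "approve" if int(fields.get("income", "0")) >= 50000 else "deny"


def manager_agent(decision: str, fields: dict) -> str:
    """Final owner of the outcome: escalate when evidence is missing."""
    if "id_number" not in fields:
        return "escalate"
    return decision


def run_pipeline(packet: str) -> str:
    # Parse and split do not depend on each other, so run them concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        parsed_future = pool.submit(parse_agent, packet)
        split_future = pool.submit(split_agent, packet)
        lines, doc_count = parsed_future.result(), split_future.result()
    fields = extract_agent(lines, doc_count)
    return manager_agent(decision_agent(fields), fields)


outcome = run_pipeline("id_number: A123\n---\nincome: 85000")
```

In a real system the decision and chat layers would involve an LLM, while the surrounding flow stays as deterministic as it is here.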

Hierarchical agent structure for ownership and control

The system employs a hierarchical agent structure, featuring a manager agent at the top with specialized agents reporting to it. This design is crucial in production environments because an agent must ultimately 'own' the final decision. The manager agent is empowered to approve, deny, or escalate to human review, while subordinate agents focus on producing evidence to support these outcomes. Orchestration, handled by tools like Google ADK, manages how agents exchange context and execute tasks sequentially or in parallel. This approach ensures a clear control flow, preventing the unpredictable emergent behavior sometimes seen in LLMs. By separating upstream agents (handling parsing and extraction) from downstream agents (applying business logic to structured data), and confining model calls to specific interfaces with validation and retries, the system achieves greater predictability, reliability, and cost-efficiency by keeping most operations deterministic.
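A manager agent's control logic can be expressed as ordinary, ordered rules. The sketch below is one possible arrangement under stated assumptions: the `Evidence` fields, the rule order, the choice to deny on a fraud flag, and the 0.9 confidence floor are all hypothetical and would be set by the business in practice.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    APPROVE = "approve"
    DENY = "deny"
    ESCALATE = "escalate"


@dataclass
class Evidence:
    proposed: Outcome  # what the upstream decision agent suggested
    fraud_flag: bool = False
    missing_documents: list[str] = field(default_factory=list)
    min_confidence: float = 1.0  # lowest confidence among extracted fields


def manager_review(e: Evidence, confidence_floor: float = 0.9) -> tuple[Outcome, str]:
    """Apply explicit rules in order; the first match decides the outcome."""
    if e.fraud_flag:
        return Outcome.DENY, "fraud flag set"
    if e.missing_documents:
        return Outcome.ESCALATE, "missing: " + ", ".join(e.missing_documents)
    if e.min_confidence < confidence_floor:
        return Outcome.ESCALATE, "low extraction confidence"
    return e.proposed, "rules passed"


# The proposal is to approve, but a missing document forces human review.
outcome, reason = manager_review(
    Evidence(proposed=Outcome.APPROVE, missing_documents=["bank_statement"])
)
```

Returning a reason string with every outcome keeps the override explainable, which matters when the manager disagrees with the upstream proposal.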

Best practices: Schemas before prompts and layered hardening

A key principle for building robust agentic systems is 'schemas before prompts.' Instead of starting with prompts and dealing with string debugging, developers should define input and output schemas for each agent. The prompt then becomes the method by which the agent fulfills its schema contract. This approach, facilitated by ADE's field extraction API, makes systems easier to test, reason about, and debug. Furthermore, best practices for production-grade agent systems involve four layers of hardening: 1) Agent Level: Wrapping models in a harness with guards against prompt injection and field validation ('deterministic shells around stochastic cores'). 2) Agent-to-Agent Communication: Enforcing typed handoffs and validation between agents to ensure contracts are met. 3) Orchestration Layer: Controlling execution with retries, backoff strategies, and explicit failure states, rather than silent failures. 4) Context and Data Layer: Grounding all information with validated ranges, confidence scores, and visual grounding, using caching to manage context windows and separate data ingestion from reasoning.
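The first three layers can be sketched together: an output schema defined before any prompt, validation of whatever the model returns, and retries with backoff that end in an explicit failure. The talk mentions Pydantic for validation; this sketch uses only the standard library to stay self-contained, and `IncomeOutput`, `AgentFailed`, `validate`, and `run_with_contract` are hypothetical names. The `model_call` argument stands in for any LLM call.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class IncomeOutput:
    """Output contract for a hypothetical income-extraction agent."""
    annual_income: int
    currency: str


class AgentFailed(Exception):
    """Explicit failure state raised instead of failing silently."""


def validate(raw: dict) -> IncomeOutput:
    """Reject any model output that does not satisfy the contract."""
    income = raw.get("annual_income")
    currency = raw.get("currency")
    if type(income) is not int or income < 0:
        raise ValueError("annual_income must be a non-negative integer")
    if currency not in {"USD", "EUR"}:
        raise ValueError("unsupported currency")
    return IncomeOutput(income, currency)


def run_with_contract(
    model_call: Callable[[], dict],
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> IncomeOutput:
    """Deterministic shell: validate, retry with exponential backoff, then fail loudly."""
    last_error: Exception | None = None
    for attempt in range(max_attempts):
        try:
            return validate(model_call())
        except ValueError as err:
            last_error = err
            if attempt < max_attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    raise AgentFailed(f"no valid output after {max_attempts} attempts: {last_error}")


# A fake model that returns a malformed answer first, then a valid one.
responses = iter([
    {"annual_income": "85k", "currency": "USD"},
    {"annual_income": 85000, "currency": "USD"},
])
result = run_with_contract(lambda: next(responses), base_delay=0)
```

Because the contract exists independently of the prompt, it can be unit-tested with fake model outputs like the ones above, without calling a model at all.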

Case studies: Proven value in banking and pharmaceuticals

The practical application of this agentic architecture has yielded significant results across industries. In a tier-one global bank, the system automated complex 'know your customer' processes for wealth management clients, handling multilingual and regulatory documents. This resulted in up to a 60% reduction in manual review time, saving hundreds of analyst hours weekly and translating to millions of dollars in annual savings. The system improved both speed and cost-effectiveness while maintaining an audit trail for every decision. Similarly, a Fortune 50 pharmaceutical company used ADE to automate market access submissions, achieving a 2x productivity gain and enabling a 50% faster generation of global reimbursement dossiers. This allowed highly skilled research scientists to redirect their efforts to higher-value strategic work, also resulting in millions of dollars in annual impact for a single use case. These examples demonstrate the horizontal applicability of the agentic architecture across different industries and document types.

Building Production-Grade Agentic Systems: Best Practices

Practical takeaways from this episode

Do This

Wrap every model in a harness with prompt injection guards, context limits, and field validation.

Enforce contracts between agents with typed handoffs and validation on inputs/outputs.

Control execution with retries, backoff, and explicit failure states.

Ground all data using visual grounding, validated ranges, and propagate confidence scores.

Separate data ingestion and document understanding from agentic reasoning through caching.

Use schemas before prompts to define agent contracts and simplify debugging.

Implement a hierarchical agent structure where a manager agent owns the final decision.

Ground every piece of information, tying values back to their source.

Avoid This

Do not simply trust the model at face value; constrain it when necessary.

Do not let agents fail silently in production.

Do not build systems that rely solely on the LLM to figure everything out.

Do not design auditability as an afterthought; integrate it from the start.

Do not start with prompts; start with schemas for easier testing and reasoning.

Common Questions

An agentic system uses AI agents to process and make decisions based on unstructured documents. These systems aim for scalability and production readiness, incorporating elements like state management, retries, and intentional constraints on large language models.

Topics

AI & Machine Learning Technology & Innovation Business & Entrepreneurship Enterprise AI Large Language Models AI Architecture Agentic Systems Data Processing Document Extraction Production Systems

Mentioned in this video

Companies

Landing AI

Company developing an agentic document extraction platform for enterprises, particularly in regulated industries like financial services and healthcare.

GitHub

Platform where the demo was open-sourced, providing users with access to the code.

Anthropic

Company that introduced the concept of separating the brain and the hand in AI systems.

Locations

New York City

Location where David Park previously spoke at an AI dev day.

Software & Apps

FastAPI

A web framework used for the backend of the agentic system.

Excel

A file format that can be processed by the agentic document extraction platform.

AWS

An alternative cloud platform that could be used for agent orchestration, alongside Google ADK.

LangChain

A framework that can be used for agent orchestration, mentioned as an alternative to Google ADK.

CrewAI

An agent orchestration framework mentioned as an alternative to Google ADK.

Pydantic

A data validation library used with FastAPI for the backend.

PDF

A common document format that can be processed by the agentic document extraction platform.

PowerPoint

A document format that can be processed by the agentic document extraction platform.

Claude

LLM used for reasoning, interaction layer, explaining decisions, and enabling natural language interaction.

Python

Programming language used to implement the explicit business rules within the agentic system.

People

David Park

Speaker who heads up applied AI engineering at Landing AI, discussing building production-grade agentic systems.
