AI Dev 25 x NYC | David Park: Impact of Agentic AI in Financial Services on Document Extraction
Key Moments
Agentic AI applies multimodal models to financial document extraction, improving accuracy and automating work that was previously reviewed by hand.
Key Insights
Agentic AI integrates language, vision, and reasoning for advanced document analysis.
Landing AI's platform (ADE) offers enterprise-grade multimodal document extraction.
ADE's foundation models are document-pretrained transformers optimized for complex layouts.
The platform parses documents into machine-readable formats (JSON, Markdown) with layout awareness.
Agentic reasoning allows for intelligent problem-solving in document processing.
ADE significantly reduces manual review in financial operations like KYC, improving efficiency and reducing risk.
INTRODUCTION TO LANDING AI AND AGENTIC DOCUMENT EXTRACTION (ADE)
David Park introduces Landing AI, a company founded by Dr. Andrew Ng that has processed over a billion images and documents. The focus is on its product, Agentic Document Extraction (ADE), an enterprise-grade platform for multimodal document processing. ADE uses specialized models and decision-making agents to extract insights from complex documents, moving beyond traditional OCR and rule-based methods. The platform is trusted by major organizations, particularly in financial services.
THE FOUNDATION AND PARSING LAYERS OF ADE
ADE is built on a foundation layer of document-pretrained transformers. These models are trained specifically on complex document layouts and handle mixed content such as tables, figures, and charts, as well as file types such as Excel and PowerPoint. This is intended to secure high accuracy before any higher-level reasoning is applied. The subsequent parsing layer converts documents into machine-readable formats such as JSON and Markdown, preserving structure and relationships. Unlike simple text recognition, this layer is layout-aware.
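To make "machine-readable formats with structure preserved" concrete, here is a minimal sketch of one parsed table rendered both as Markdown and as JSON. The `Table` class, the helper functions, and the sample rows are all hypothetical illustrations; they are not ADE's output format or API.

```python
import json
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical parsed table: a header row plus data rows."""
    headers: list
    rows: list


def table_to_markdown(table: Table) -> str:
    """Render the table as a Markdown pipe table, one line per row."""
    lines = ["| " + " | ".join(table.headers) + " |"]
    lines.append("| " + " | ".join("---" for _ in table.headers) + " |")
    for row in table.rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


def table_to_json(table: Table) -> str:
    """Render the same table as JSON records keyed by column header."""
    records = [dict(zip(table.headers, row)) for row in table.rows]
    return json.dumps(records, indent=2)


# Made-up sample data standing in for a parsed bank statement table.
statement = Table(
    headers=["Date", "Description", "Amount"],
    rows=[
        ["2024-01-05", "Payroll deposit", "2500.00"],
        ["2024-01-09", "Card payment", "-42.10"],
    ],
)
markdown = table_to_markdown(statement)
records_json = table_to_json(statement)
print(markdown)
print(records_json)
```

Both renderings keep the row and column relationships that plain OCR text would lose: each amount stays attached to its date and description.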
AGENTIC REASONING AND CUSTOMER-FACING CAPABILITIES
A key differentiator for ADE is its pioneering agentic reasoning technology in visual AI. This allows the system to apply logic, context, and problem-solving to parse content, handling visually rich and complex data beyond plain text. The agents and apps layer focuses on customer-facing capabilities, combining parsing and reasoning engines with specialized agents for tasks like field extraction and document classification. This enables automated workflows for various industry-specific needs, integrated into core products and processes.
CORE TECHNICAL DIFFERENTIATORS AND DATA STRATEGY
ADE's core strengths lie in its deep computer vision expertise, understanding spatial layouts and element relationships, unlike generic vision systems. Its document-native models provide high-precision localization, crucial for accurately identifying fields and tables even in noisy data. The platform orchestrates multimodal intelligence, blending vision, text, and structured data. Semantic visual reasoning goes beyond extraction to understand meaning and context, vital for high-stakes financial applications. High-fidelity, domain-specific datasets and continuous feedback loops ensure ongoing model improvement.
ENTERPRISE READINESS AND DEPLOYMENT FLEXIBILITY
Landing AI emphasizes an enterprise-first approach with ADE. The platform is built to enterprise IT standards and offers flexible deployment options, including multi-tenant SaaS, virtual private link, and on-premise solutions. ADE is described as HIPAA and GDPR compliant, with zero data retention and a stateless architecture that keeps processing ephemeral and data private. This infrastructure is aimed at demanding financial services requirements.
CASE STUDY: KYC OPERATIONS AUTOMATION
A significant real-world application of ADE is in Know Your Customer (KYC) operations for a major financial institution. This traditionally manual, document-heavy process involves thousands of analysts reviewing documents such as bank statements and corporate governance papers. Integrating ADE automated the extraction and validation of key fields, cutting manual document review by more than 70%. Beyond the time saved, this reduced operational risk and exposure to regulatory fines.
APPLICATIONS IN LOAN PROCESSING WORKFLOWS
The platform is demonstrated through a loan processing scenario where loan officers receive combined packets including pay stubs, W2s, bank statements, and IDs. ADE automates this by first classifying and splitting documents based on content. Then, schema-driven extraction pulls relevant fields for specific documents (e.g., income from pay stubs, deposits from bank statements). This process ensures accuracy, layout grounding, full traceability, and auditability, essential for compliance and enabling true agentic loan approval automation.
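The classify-then-split step can be sketched as follows. The keyword rules are a deliberately crude stand-in for a learned classifier (ADE is described as classifying by content with models, not keywords), and the function names and page texts are hypothetical.

```python
from itertools import groupby

# Hypothetical keyword rules standing in for a learned page classifier.
KEYWORDS = {
    "pay_stub": ("gross pay", "net pay"),
    "w2": ("wage and tax statement",),
    "bank_statement": ("beginning balance", "ending balance"),
    "id": ("driver license",),
}


def classify_page(text: str) -> str:
    """Return the first document type whose keywords appear in the page text."""
    lowered = text.lower()
    for doc_type, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return doc_type
    return "unknown"


def split_packet(pages: list) -> list:
    """Group consecutive pages of the same type into one logical document."""
    labeled = [(classify_page(text), number)
               for number, text in enumerate(pages, start=1)]
    documents = []
    for doc_type, group in groupby(labeled, key=lambda item: item[0]):
        documents.append({"type": doc_type,
                          "pages": [number for _, number in group]})
    return documents


# Made-up page texts for a combined loan packet.
packet = [
    "ACME Corp  Gross Pay 3,200.00  Net Pay 2,450.00",
    "Form W-2 Wage and Tax Statement 2023",
    "Checking account  Beginning balance 1,000.00",
    "Deposits 850.00  Ending balance 1,850.00",
    "Driver License  Class C",
]
documents = split_packet(packet)
for doc in documents:
    print(doc["type"], doc["pages"])
```

One limitation of this sketch: two separate documents of the same type on adjacent pages would be merged, so a real splitter also needs a boundary signal beyond the page label.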
DEMONSTRATION OF EXTRACTION AND PARSING CAPABILITIES
The demo showcases ADE's ability to handle complex documents like investment account statements and pay stubs. It highlights accurate table rendering, understanding indentation for parent-child relationships, and parsing documents even without clear boundaries. The output includes chunk ID, type, page number, and crucially, visual grounding information, providing cell-level references to verify extraction accuracy and prevent LLM hallucinations. The API capabilities for parsing all document content and schema-driven field extraction are also detailed.
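The output fields named above (chunk ID, type, page number, visual grounding) suggest a record like the one below. This is a sketch under assumptions: the class names, the coordinate convention, and the substring-based `ground_value` check are hypothetical, chosen only to show how grounding lets a reviewer trace an extracted value back to a region of the page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Bounding box in page-relative coordinates (0.0 to 1.0), an assumed convention."""
    left: float
    top: float
    right: float
    bottom: float


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    chunk_type: str  # e.g. "text", "table", "figure"
    page: int
    text: str
    box: Box


def ground_value(value: str, chunks: list) -> Optional[Chunk]:
    """Return the first chunk whose text contains the value, or None."""
    for chunk in chunks:
        if value in chunk.text:
            return chunk
    return None


# Made-up chunks for a one-page pay stub.
chunks = [
    Chunk("c1", "text", 1, "Employee: Jane Doe", Box(0.10, 0.08, 0.60, 0.12)),
    Chunk("c2", "table", 1, "Gross Pay 3,200.00 | Net Pay 2,450.00",
          Box(0.10, 0.40, 0.90, 0.55)),
]

grounded = ground_value("2,450.00", chunks)
ungrounded = ground_value("9,999.99", chunks)
print(grounded.chunk_id, grounded.page, grounded.box)
print(ungrounded)
```

A value that cannot be grounded in any chunk, like the second lookup here, is a candidate hallucination and can be routed to a human reviewer instead of being accepted.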
SCHEMA GENERATION AND CHATBOT INTERFACE
ADE offers flexible schema creation, including smart suggestions based on document content and schema generation from natural language prompts. A schema acts like prompt engineering for each field: descriptions can be short or long and can include industry-specific acronyms. The platform also includes a playground with a chatbot interface where users can ask questions about a document and receive answers with granular localization and cell-level references, which suits financial services documents.
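As an illustration of descriptions acting as per-field prompts, here is a minimal sketch of a pay stub schema written in JSON Schema style, plus a small checker for extracted records. That ADE accepts this exact shape is an assumption; the field names, the `check_record` helper, and the sample records are hypothetical.

```python
# Hypothetical pay stub schema in JSON Schema style. Each description
# works like a small prompt telling the extractor what to look for.
PAY_STUB_SCHEMA = {
    "type": "object",
    "properties": {
        "employee_name": {
            "type": "string",
            "description": "Full name of the employee.",
        },
        "net_pay": {
            "type": "number",
            "description": "Take-home pay for this period, after deductions.",
        },
        "ytd_gross": {
            "type": "number",
            "description": "YTD (year-to-date) gross earnings.",
        },
    },
    "required": ["employee_name", "net_pay"],
}

# Mapping from schema type names to Python types (only the two used here).
PYTHON_TYPES = {"string": str, "number": (int, float)}


def check_record(record: dict, schema: dict) -> list:
    """Return a list of problems found when comparing a record to the schema."""
    problems = []
    for name in schema["required"]:
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, value in record.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        elif not isinstance(value, PYTHON_TYPES[spec["type"]]):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems


good = {"employee_name": "Jane Doe", "net_pay": 2450.00}
bad = {"employee_name": "Jane Doe", "net_pay": "2,450.00", "ssn": "redacted"}
print(check_record(good, PAY_STUB_SCHEMA))
print(check_record(bad, PAY_STUB_SCHEMA))
```

Spelling out an acronym in the description, as with `ytd_gross`, is one way to give the extractor the domain context that a bare field name would not carry.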
NOTEBOOK DEMO: END-TO-END LOAN PROCESSING WORKFLOW
A detailed notebook demo illustrates the end-to-end loan processing workflow. It covers parsing the messy loan packet, classifying each page, and automatically splitting documents. Specific schemas are defined for different document types (pay stub, bank statement, investment statement), and the extracted information is mapped and visualized. The demo emphasizes how the system understands document structure, extracts granular data, and provides visualizations of the extracted fields and corresponding document chunks for analyst review.
LANDING AI BUILDER PROGRAM AND SUPPORT
Emily discusses the Landing AI Builder Program, designed to support developers and partners using ADE. The program offers early access, real support, onboarding assistance, and communication channels. It aims to help builders move fast and get their products launched successfully. This community focus is crucial for enabling developers to automate complex processes using ADE, often in a matter of hours, and for Landing AI to gather feedback and spotlight successful implementations.
HANDLING NEW DOCUMENT TYPES AND ACCURACY VALIDATION
Regarding new document types, ADE's document-pretrained models are zero-shot capable: they can extract information accurately from previously unseen documents without templates or fine-tuning. For critical accuracy needs, Landing AI's product team can assist. Benchmark tests, such as on the DocVQA dataset, show ADE achieving over 99% accuracy, which the talk presents as significantly outperforming other state-of-the-art models. Validation relies on highly curated datasets and continuous testing of different architectures.
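The summary does not say which metric underlies the 99% figure. DocVQA results are commonly reported as ANLS (Average Normalized Levenshtein Similarity), so the sketch below shows how such a score is computed. It is an illustration with made-up answers, not a reproduction of Landing AI's evaluation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(truth: str, prediction: str, threshold: float = 0.5) -> float:
    """1 minus normalized edit distance, zeroed once the distance reaches the threshold."""
    truth, prediction = truth.strip().lower(), prediction.strip().lower()
    longest = max(len(truth), len(prediction))
    if longest == 0:
        return 1.0
    distance = levenshtein(truth, prediction) / longest
    return 1.0 - distance if distance < threshold else 0.0


def anls(truths: list, predictions: list) -> float:
    """Average over questions of the best similarity against any accepted answer."""
    scores = [max(similarity(t, p) for t in accepted)
              for accepted, p in zip(truths, predictions)]
    return sum(scores) / len(scores)


# Made-up answers: two correct predictions and one clearly wrong one.
truths = [["2,450.00"], ["Jane Doe", "J. Doe"], ["ACME Corp"]]
predictions = ["2,450.00", "jane doe", "Globex"]
score = anls(truths, predictions)
print(round(score, 3))  # prints 0.667
```

Unlike exact-match accuracy, this metric gives partial credit for near misses such as a single misread character, while the threshold still scores clearly wrong answers as zero.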
INTEGRATION AND TRIGGERING FOLLOW-UP ACTIONS
While ADE focuses on the hard part, document extraction, it supports downstream automation through integrations. For follow-up actions such as issuing loans or filing documents, Landing AI relies on customers to build the workflows, drawing on partnerships with RPA providers, cloud and data platforms (AWS, Azure, Snowflake), and other enterprise applications. Integrations with Lambda functions and event-driven architectures facilitate these automated processes.
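A customer-built follow-up step in an event-driven setup might look like the sketch below. It uses the standard AWS Lambda Python handler signature, `handler(event, context)`. The event shape, the `monthly_income` field, the income threshold, and the action names are hypothetical, and nothing here calls ADE or AWS.

```python
import json


def decide_action(extraction: dict, min_monthly_income: float = 3000.0) -> str:
    """Pick a follow-up step from extracted fields (hypothetical business rule)."""
    income = extraction.get("monthly_income")
    if income is None:
        return "manual_review"  # a required field is missing
    if income >= min_monthly_income:
        return "continue_underwriting"
    return "request_more_documents"


def handler(event: dict, context: object) -> dict:
    """Lambda-style entry point: read the extraction payload, return an action."""
    extraction = json.loads(event["body"])
    action = decide_action(extraction)
    return {"statusCode": 200, "body": json.dumps({"action": action})}


# Local invocation with a made-up event; no AWS account is needed to run this.
event = {"body": json.dumps({"applicant": "Jane Doe", "monthly_income": 4200.0})}
response = handler(event, None)
print(response["body"])
```

Keeping the decision rule in its own function separates the business logic from the transport, so the same rule could be reused from an RPA bot or a queue consumer.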
Common Questions
Agentic Document Extraction (ADE) is a developer-first, enterprise-grade platform from Landing AI for multimodal document extraction. It uses AI models to understand complex layouts and mixed content and to provide high-precision parsing and reasoning.
Mentioned in this video
Agentic Document Extraction (ADE): A developer-first, enterprise-grade platform by Landing AI for multimodal document extraction.
W2: A type of tax form document that can be part of a loan packet processed by ADE.
DocVQA: An open-source dataset for document visual question answering used for benchmarking ADE's accuracy.
David Park: Leads the applied AI engineering team at Landing AI and presents the talk on agentic AI in financial services.
RPA: Robotic Process Automation, mentioned as a system that ADE can integrate with for workflow automation.
Landing AI: The company that developed Agentic Document Extraction (ADE), founded by Andrew Ng.