Key Moments
AI Dev 25 x NYC | Benjamin Han: Snowflake: A SQL Engine to Talk to Your Data
Snowflake's Cortex AI SQL integrates LLMs into databases, enabling natural language queries and advanced data analysis.
Key Insights
Cortex AI SQL transforms traditional SQL into an AI query language by embedding LLM capabilities directly into database operations.
It offers managed functions and low-level primitives, allowing users to process text, perform sentiment analysis, summarization, and multimodal analysis using natural language instructions.
The platform enhances efficiency and ease of use by allowing composition of AI functions with standard SQL syntax like joins, WHERE clauses, and GROUP BY.
Snowflake leverages its SQL query planner for significant performance optimizations, reducing LLM call volumes and execution times.
AI SQL provides inherent governance benefits, including role-based access control, budget constraints, and data encryption, inherited from the Snowflake platform.
The technology powers higher-level agentic features like Snowflake Intelligence, a chat-like interface for data interaction, which can be white-labeled and customized.
Performance optimizations include dynamic thresholding with proxy models and query rewrites to handle large-scale AI-driven data analysis tasks efficiently.
INTRODUCTION TO SNOWFLAKE'S AI OFFERINGS
Benjamin Han introduces Snowflake's Cortex AI SQL, a project launched about a year before the talk whose team has since grown from five to sixty engineers. The core goal of Cortex AI SQL is to deliver generative AI features with the efficiency, ease of use, and trust that the Snowflake SQL platform already provides. The suite includes managed models for text processing, sentiment analysis, and summarization, and it can be reached through SQL, Python, REST, or a no-code interface, Snowflake Studio.
THE AISQL ARCHITECTURE AND PRIMITIVES
At the foundation of Snowflake's AI strategy lie scalable primitives, including document processing for extraction, parsing, and embedding, culminating in Cortex AI SQL. This layer integrates Large Language Models (LLMs) directly into the database by introducing new primitive operators into the query language. This integration allows for seamless composition with existing SQL syntax, such as Common Table Expressions (CTEs), GROUP BY, joins, and WHERE clauses, while also benefiting from performance optimizations.
FUNCTIONALITY AND USE CASES OF AISQL
Cortex AI SQL transforms SQL into an AI query language. At the lowest level, `AI_COMPLETE` exposes a direct LLM call: the user selects a model, provides a prompt, and points it at data. Above that primitive, managed functions handle common tasks from natural language instructions. Examples include `AI_FILTER` for sentiment analysis on customer transcripts, multimodal analysis of images for product identification, and AI aggregation with `GROUP BY` for summarizing large text sets or identifying top complaints. All of these compose with standard SQL.
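To make the composition idea concrete, here is a minimal sketch that runs locally with Python's built-in `sqlite3` module rather than on Snowflake. The function `fake_ai_filter` is a hypothetical keyword stub standing in for an LLM-backed predicate such as `AI_FILTER`; it is not Snowflake's API, and the table and data are invented for illustration. The point is only that a natural-language predicate can sit in a `WHERE` clause and feed a normal `GROUP BY`.

```python
import sqlite3

# Hypothetical stand-in for an LLM-backed predicate. A real engine would
# send the instruction and the text to a model; this stub matches keywords.
NEGATIVE_WORDS = {"broken", "late", "refund", "terrible"}


def fake_ai_filter(instruction: str, text: str) -> int:
    """Return 1 if the text looks unhappy (keyword stub, not an LLM)."""
    words = set(text.lower().split())
    return int(bool(words & NEGATIVE_WORDS))


conn = sqlite3.connect(":memory:")
# Register the stub so it can be called like any other SQL function.
conn.create_function("fake_ai_filter", 2, fake_ai_filter)
conn.executescript(
    """
    CREATE TABLE transcripts (id INTEGER, region TEXT, body TEXT);
    INSERT INTO transcripts VALUES
      (1, 'east', 'package arrived late and broken'),
      (2, 'east', 'great service thank you'),
      (3, 'west', 'i want a refund'),
      (4, 'west', 'terrible support experience'),
      (5, 'west', 'all good');
    """
)

# The "AI" predicate composes with WHERE, GROUP BY, and ORDER BY as usual.
rows = conn.execute(
    """
    SELECT region, COUNT(*) AS unhappy
    FROM transcripts
    WHERE fake_ai_filter('Is the customer unhappy?', body)
    GROUP BY region
    ORDER BY region
    """
).fetchall()
print(rows)
```

In this toy setup the query counts unhappy transcripts per region. On Snowflake the predicate would be evaluated by a model inside the engine, which is what lets the query planner reason about it alongside the rest of the query.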
DEMONSTRATION: TOPIC DISCOVERY IN CUSTOMER REVIEWS
A demonstration showcased a workflow for topic discovery on customer reviews. Initially, an AI function identifies broad categories. Subsequently, this output is refined by classifying reviews into these discovered topics, revealing themes like 'value for money' and 'material quality.' When outliers emerge tagged as 'other,' further AI analysis is performed to identify underlying reasons, such as 'lack of specificity' in reviews. This iterative process, involving a 'human in the loop,' allows for refining categories and filtering out uninformative feedback for better analysis.
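The classify-then-inspect loop described above can be sketched in a few lines. This is an illustration under stated assumptions: `TOPIC_KEYWORDS` and `classify` are hypothetical stand-ins for the LLM-driven topic discovery and classification steps, and the reviews are invented. The talk's actual queries are not reproduced in this summary.

```python
from collections import Counter

# Hypothetical topics, standing in for categories an LLM would discover.
TOPIC_KEYWORDS = {
    "value for money": ("price", "expensive", "worth"),
    "material quality": ("fabric", "material", "stitching"),
}


def classify(review: str, topics: dict) -> str:
    """Assign a review to the first matching topic, else 'other' (keyword stub)."""
    text = review.lower()
    for topic, keywords in topics.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return "other"


reviews = [
    "Worth the price",
    "The fabric feels thin",
    "Too expensive for what you get",
    "ok",
    "meh",
]

# Step 1: classify every review into the discovered topics.
labels = [classify(review, TOPIC_KEYWORDS) for review in reviews]
counts = Counter(labels)

# Step 2: a human inspects the 'other' bucket to see why items landed there.
others = [review for review, label in zip(reviews, labels) if label == "other"]

# Step 3: drop the uninformative reviews before further analysis.
informative = [review for review, label in zip(reviews, labels) if label != "other"]

print(counts)
print(others)
```

Here the 'other' bucket holds only very short reviews, which mirrors the 'lack of specificity' finding: once a person sees that pattern, those rows can be filtered out or the topic list can be revised and the classification rerun.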
PERFORMANCE OPTIMIZATIONS FOR AI QUERIES
Performance optimizations are what make AI SQL practical. A traditional SQL query planner focuses on minimizing join costs, which can leave an LLM-backed operator running on far more rows than necessary. Cortex AI SQL instead plans to minimize the number of LLM calls. One example is executing cheap CPU-bound filters first so that the expensive LLM filter sees a smaller dataset. Other techniques include dynamic thresholding with proxy models, and query rewrites that turn a join into a classification task over a smaller candidate set. Together these drastically reduce execution time and improve accuracy.
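Two of these ideas can be shown with a small call-counting sketch. Everything here is an assumption for illustration: `expensive_llm_filter` and `cheap_proxy_score` are hypothetical stubs, the rows are invented, and the thresholds are fixed constants, whereas the talk describes them as dynamic (how they are chosen is not covered in this summary).

```python
from dataclasses import dataclass


@dataclass
class CallCounter:
    calls: int = 0


def expensive_llm_filter(text: str, counter: CallCounter) -> bool:
    """Stub for an LLM predicate; counts how often it is invoked."""
    counter.calls += 1
    return "refund" in text.lower()


def cheap_proxy_score(text: str) -> float:
    """Stub for a small proxy model returning a confidence in [0, 1]."""
    lowered = text.lower()
    if "refund" in lowered:
        return 0.95
    if "money" in lowered:
        return 0.5
    return 0.05


rows = [
    {"year": 2023, "text": "Please refund my order"},
    {"year": 2024, "text": "I want my money back"},
    {"year": 2024, "text": "Refund requested"},
    {"year": 2024, "text": "Love it"},
    {"year": 2022, "text": "Great product"},
]

# Plan A: the LLM predicate runs on every row, then the cheap filter applies.
plan_a_counter = CallCounter()
plan_a = [r for r in rows
          if expensive_llm_filter(r["text"], plan_a_counter) and r["year"] == 2024]

# Plan B: the cheap filter runs first, so the LLM only sees surviving rows.
plan_b_counter = CallCounter()
plan_b = [r for r in rows
          if r["year"] == 2024 and expensive_llm_filter(r["text"], plan_b_counter)]

# Proxy cascade: trust the proxy when it is confident, escalate otherwise.
LOW, HIGH = 0.2, 0.8  # fixed here; assumed to be tuned dynamically in practice
cascade_counter = CallCounter()
accepted = []
for r in rows:
    score = cheap_proxy_score(r["text"])
    if score >= HIGH:
        accepted.append(r)  # confident yes, no LLM call
    elif score > LOW and expensive_llm_filter(r["text"], cascade_counter):
        accepted.append(r)  # uncertain, LLM decided yes

print(plan_a_counter.calls, plan_b_counter.calls, cascade_counter.calls)
```

Both plans return the same rows, but the reordered plan makes three stub calls instead of five, and the cascade escalates only the one row the proxy is unsure about. The join rewrite follows the same logic: classifying each row against a small candidate set needs far fewer calls than evaluating an LLM predicate on every pair of rows.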
GOVERNANCE AND HIGHER-LEVEL AGENTS
Leveraging the Snowflake platform, Cortex AI SQL provides built-in governance features such as role-based access control, budget constraints, and data encryption, ensuring secure and controlled AI-driven data operations. These low-level primitives serve as building blocks for higher-level agentic systems. Snowflake Intelligence, an example of such a system, is a chat-like interface allowing users to interact with their data using natural language, with AI SQL powering its ability to understand queries and retrieve relevant information.
FUTURE DIRECTIONS AND WHITE-LABELING
The architecture supports future agentic features and integrations. Notably, Snowflake Intelligence is designed to be adaptable and can be white-labeled with customizable branding. The underlying REST API allows it to be integrated into a front-end application of the user's choice, extending its reach beyond users of Snowflake's own interface. Evaluations of natural-language-to-SQL translation are reported to show high accuracy, with Snowflake's offerings, such as Cortex Analyst, outperforming the existing models they were compared against.
Common Questions
Cortex AI SQL is a Snowflake product that transforms SQL into an AI query language. Its main benefits include efficiency, ease of use, and trust by integrating generative AI features directly into the Snowflake SQL platform.
Mentioned in this video
A Snowflake model mentioned for tasks like custom summarization.
A low-level primitive within Cortex AI SQL that provides LLM calls into SQL queries.
AI functionality within Snowflake SQL for summarizing text data, demonstrated with academic papers.
A web interface for no-code development within Snowflake.
A SQL aggregate function that can be used with AI primitives for tasks like summarization and complaint analysis.
The team Benjamin Han works on at Snowflake, focused on integrating AI with SQL.
A managed function within Cortex AI SQL that allows natural language instructions for filtering data, including image analysis.
A REST API offered by Snowflake for developing custom AI agents.
Snowflake's first-party agent for interacting with company data, powered by Cortex AI SQL.
A production SQL engine that integrates generative AI features into the Snowflake SQL platform, offering efficiency, ease of use, and trust.
A Snowflake product for querying structured data using traditional SQL.
A Snowflake tool for unstructured file search from documents.