The Rise and Fall of the Vector DB category: Jo Kristian Bergum (ex-Chief Scientist, Vespa)

Key Moments
VectorDBs as a distinct category are declining, converging with traditional search.
Key Insights
The vector database category rapidly emerged due to the need for efficient vector search but is now declining as features converge.
Traditional search engines and existing databases are integrating vector search capabilities, reducing the need for specialized vector DBs.
Embeddings remain crucial for representing data, but their use extends beyond simple similarity search and is not exclusive to vector databases.
Search, not vector embeddings, is presented as the more natural abstraction for connecting AI with knowledge and for RAG applications.
Hybrid search combining keyword, semantic, and metadata filtering offers better results than pure embedding-based similarity.
The long-context vs. RAG debate is ongoing: context windows keep expanding, but large datasets still necessitate retrieval methods.
Building and maintaining knowledge graphs is the primary challenge for graph RAG, not merely the database technology used.
THE RISE AND FALL OF THE VECTOR DATABASE CATEGORY
The emergence of applications leveraging embeddings created a demand for specialized infrastructure to handle high-dimensional vector storage, indexing, and search. Companies like Pinecone pioneered this new category, positioning vector databases as essential for AI development, especially after ChatGPT's release. This led to rapid growth and significant investment, but the speaker argues the distinct category of 'vector databases' is now in decline, not the underlying technology or its applications.
CONVERGENCE AND THE INTEGRATION OF VECTOR CAPABILITIES
The core thesis is that specialized vector database infrastructure is becoming obsolete due to feature convergence. Existing traditional search engines like Elasticsearch and Vespa, as well as general-purpose databases like PostgreSQL (with extensions like pgvector), are increasingly incorporating robust vector search capabilities. This integration means developers can often leverage their existing data infrastructure instead of adopting a new, dedicated vector database solution, accelerating the decline of the standalone category.
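To make the convergence point concrete, here is a minimal sketch of the query pattern pgvector enables (the SQL table and column names are hypothetical), together with a brute-force plain-Python equivalent of the cosine-distance ordering that pgvector's `<=>` operator computes:

```python
import math

# Illustrative only: with the pgvector extension installed, a query like
# this can replace a dedicated vector database (table and column names
# are hypothetical):
#
#   SELECT id FROM docs ORDER BY embedding <=> '[0.1, 0.9]' LIMIT 5;
#
# pgvector's <=> operator is cosine distance. A brute-force equivalent
# of that ordering in plain Python:

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity, matching pgvector's <=> semantics."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def nearest(query: list[float], embeddings: list[list[float]], k: int = 5) -> list[int]:
    """Indices of the k stored embeddings closest to the query."""
    dists = [cosine_distance(query, e) for e in embeddings]
    return sorted(range(len(embeddings)), key=dists.__getitem__)[:k]

docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
```

In production, pgvector would use an approximate index rather than this exhaustive scan, but the ranking semantics are the same.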
EMBEDDINGS: CRUCIAL BUT NOT EXCLUSIVE TO VECTOR DBS
Embeddings are acknowledged as a vital tool for representing diverse data types and enabling semantic understanding. However, the speaker emphasizes that their importance does not inherently necessitate a dedicated vector database. Embeddings have moved from big tech research to mainstream use, but their application in similarity search is just one facet. More advanced search systems require additional signals like freshness and authority, moving beyond simple cosine similarity.
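The "additional signals" point can be sketched as a blended ranking function. The signal names (freshness, authority) come from the discussion above; the decay curve and weights are illustrative assumptions, not anything specified in the episode:

```python
import math

# Hedged sketch: production ranking rarely stops at cosine similarity.
# The half-life and weights below are illustrative assumptions.

def freshness(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponentially decay a document's freshness signal with age."""
    return math.exp(-age_days * math.log(2) / half_life_days)

def rank_score(similarity: float, age_days: float, authority: float,
               w_sim: float = 0.6, w_fresh: float = 0.2, w_auth: float = 0.2) -> float:
    """Blend semantic similarity with freshness and authority signals."""
    return w_sim * similarity + w_fresh * freshness(age_days) + w_auth * authority

# A slightly less similar but fresh, authoritative document can outrank
# a stale one that has the best raw cosine score:
stale = rank_score(similarity=0.95, age_days=365, authority=0.1)
fresh = rank_score(similarity=0.85, age_days=1, authority=0.9)
```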
SEARCH AS THE NATURAL ABSTRACTION FOR AI
The speaker advocates for 'search' as a more natural and user-friendly abstraction for connecting AI models with knowledge, particularly for RAG applications. Instead of focusing on the technical implementation details like embeddings and vector spaces, the emphasis should be on the search interface. This allows AI agents to dynamically choose appropriate search methods (keyword, semantic, codebase, web) based on the task, treating vector search as just one component of a broader retrieval strategy.
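The abstraction described above can be sketched as a single `search()` entry point that hides the tool choice from the caller. The backends and routing heuristics here are hypothetical stand-ins for what a real agent or router would do:

```python
from typing import Callable

# Sketch of "search as the abstraction": the caller invokes one search()
# function; whether it resolves to keyword, semantic, or code retrieval
# is an internal detail. Backends and routing rules are hypothetical.

def keyword_search(q: str) -> str: return f"keyword results for {q!r}"
def semantic_search(q: str) -> str: return f"semantic results for {q!r}"
def code_search(q: str) -> str: return f"codebase results for {q!r}"

BACKENDS: dict[str, Callable[[str], str]] = {
    "keyword": keyword_search,
    "semantic": semantic_search,
    "code": code_search,
}

def search(query: str) -> str:
    """Route a query to a backend; vector search is just one option."""
    if query.startswith("def ") or "()" in query:
        kind = "code"        # looks like a code snippet
    elif '"' in query:
        kind = "keyword"     # quoted phrase: exact matching
    else:
        kind = "semantic"    # default: embedding-based retrieval
    return BACKENDS[kind](query)
```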
THE NUANCES OF RETRIEVAL-AUGMENTED GENERATION (RAG)
While the vector database category may be declining, the principles behind RAG—augmenting AI with retrieval—remain highly relevant and are here to stay. The speaker clarifies that RAG is not dead, but rather the idea that it *must* rely on a dedicated vector database is flawed. For smaller datasets that fit within expanded context windows, traditional retrieval might be sufficient, reducing the need for complex vector indexing. The necessity and optimal implementation of RAG depend heavily on data volume, query load, and specific use cases.
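The "does this dataset even need vector indexing?" decision above can be sketched as a simple budget check. The context-window size and characters-per-token ratio below are rough assumptions, not figures from the episode:

```python
# Sketch of the "does it need RAG?" check: if the whole corpus fits in
# the model's context window, retrieval infrastructure may be overkill.
# Window size and chars-per-token ratio are rough assumptions.

CONTEXT_WINDOW_TOKENS = 1_000_000   # assumed long-context model
CHARS_PER_TOKEN = 4                 # rough average for English text

def fits_in_context(documents: list[str], reserve_tokens: int = 8_000) -> bool:
    """True if all documents (plus a reserve for prompt and answer) fit."""
    corpus_tokens = sum(len(d) for d in documents) // CHARS_PER_TOKEN
    return corpus_tokens + reserve_tokens <= CONTEXT_WINDOW_TOKENS
```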
HYBRID SEARCH AND THE MULTI-STAGE APPROACH
Effective search and RAG systems often employ a hybrid approach, combining keyword matching (like BM25) with semantic search via embeddings. Metadata filtering and re-ranking layers significantly improve result quality, moving beyond pure similarity metrics. Large-scale systems, such as those used in recommender engines, typically involve a cascade of retrieval and re-ranking stages, highlighting that embedding-based retrieval is a component of, not the entirety of, modern search solutions.
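One common way to combine a BM25-style keyword ranking with an embedding ranking is reciprocal rank fusion; the episode does not name a specific fusion method, so RRF is an illustrative choice here, and the document IDs are invented:

```python
# Hedged sketch of hybrid retrieval: fuse a keyword (BM25-style) ranking
# with an embedding-based ranking via reciprocal rank fusion (RRF).

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each doc scores the sum of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["d3", "d1", "d7"]    # keyword matches
vector_ranking = ["d1", "d9", "d3"]  # embedding neighbors
fused = rrf([bm25_ranking, vector_ranking])
```

A re-ranking stage (e.g. a cross-encoder) would then rescore only the top fused candidates, matching the cascade structure described above.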
KNOWLEDGE GRAPHS AND FUTURE OPPORTUNITIES
The creation and maintenance of knowledge graphs present the primary challenge for 'graph RAG,' not the specific database technology. While graph databases excel at traversing relationships, building the graph itself is complex. With LLMs now capable of more easily generating entities and triplets, knowledge graphs may become more feasible. Opportunities also exist in developing more domain-specific embedding models (for legal, finance, health) and leveraging visual language models for richer data representations, though the business models for such specialized API services remain challenging.
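Once an LLM has extracted (subject, relation, object) triplets, storing and traversing them needs no specialized database, which is the point above: the hard part is producing the triplets. A minimal sketch with invented example triplets:

```python
from collections import defaultdict

# Sketch: LLM-extracted triplets stored as a plain adjacency map.
# The triplets themselves are invented examples.

triplets = [
    ("Pinecone", "is_a", "vector database"),
    ("pgvector", "extends", "PostgreSQL"),
    ("PostgreSQL", "supports", "vector search"),
]

graph: dict[str, list[tuple[str, str]]] = defaultdict(list)
for subj, rel, obj in triplets:
    graph[subj].append((rel, obj))

def neighbors(entity: str) -> list[tuple[str, str]]:
    """Return (relation, object) edges out of an entity."""
    return graph.get(entity, [])
```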
Common Questions

Why is the vector database category declining?
The vector database category is declining because its features are converging into existing database technologies and traditional search engines. Many databases now offer vector search capabilities, making a separate vector database infrastructure category unnecessary for many use cases.
Mentioned in this video
A search engine technology that Jo Kristian Bergum has experience with, mentioned as having vector search capabilities.
A database that Jo Kristian Bergum was an angel investor in and helped with examples. It's seen as playing a role in promoting retrieval for AI.
A traditional search engine mentioned as having vector search capabilities, indicating a convergence in features.
A database system that offers vector search capabilities through extensions like pgvector, suggesting a convergence of database and search functionality.
A feature within PostgreSQL for running machine learning models alongside the database, which Jo Kristian Bergum is not bullish about.
A model mentioned for its large context window, suggesting that for some use cases with a substantial number of articles, a vector database might not be necessary.
The launch of ChatGPT in November 2022 significantly influenced the AI landscape, leading to increased interest in RAG and embeddings.
An extension for PostgreSQL that adds vector search capabilities, highlighted as a strong contender and example of database convergence.
APIs provided by companies like OpenAI that made embeddings accessible to every developer, contributing to their mainstream adoption.
Mentioned as a database company also pushing to offer vector search capabilities.
A pioneer in framing vector databases as a new infrastructure category; noted for its rapid rise and fall, with a focus too narrow to sustain a standalone category.
An upcoming company in the vector database space, mentioned as having similar models to Pinecone but different pricing, focusing on developers.
Mentioned for its recommender systems which use embedding-based retrieval, and for recently publishing its RAG system details.
Acquired Voyage AI, indicating an interest in the embedding technology space.
A prominent European startup in the RAG space, doing great work, especially with European languages.
Released a cookbook after ChatGPT's launch, suggesting developers connect ChatGPT with their data using embeddings, sparking the RAG trend.
Mentioned as one of the big tech companies that has long worked on embeddings for various tasks.
Mentioned as one of the big tech companies that has long worked on embeddings for various tasks.
Mentioned as a company that has also tried to move logic into the database, similar to PostgreSQL ML.
Mentioned as one of the big tech companies that has long worked on embeddings for various tasks.
A classical keyword matching algorithm for search that has been around for 30 years, considered a strong baseline for many search use cases.
A data structure representing entities and their relationships. Its complexity in building is a bottleneck, but LLMs might ease this process.
Utilizing knowledge graphs within a RAG system. The discussion highlights the challenge of building the graph itself and the potential for hybrid approaches.
Mentioned in the context of their cron service, illustrating the tension between what lives in the database versus external systems.
Mentioned as a hypothetical future model that could reignite the long context vs. RAG debate, potentially resolving it.
A successful NoSQL database company used as a benchmark for fundraising success; its fundraising total over its lifespan was less than half of what vector databases raised in a short period.
Mentioned as a company where Jo Kristian Bergum worked on search systems for 20 years, and also as a big tech company that has long worked on embeddings.
A database mentioned as a viable option for transactional data and vector storage, especially with its pgvector extension.