The Rise and Fall of the Vector DB category: Jo Kristian Bergum (ex-Chief Scientist, Vespa)

Key Moments
VectorDBs as a distinct category are declining, converging with traditional search.
Key Insights
The vector database category rapidly emerged due to the need for efficient vector search but is now declining as features converge.
Traditional search engines and existing databases are integrating vector search capabilities, reducing the need for specialized vector DBs.
Embeddings remain crucial for representing data, but their use extends beyond simple similarity search and is not exclusive to vector databases.
Search, not vector embeddings, is presented as the more natural abstraction for connecting AI with knowledge and for RAG applications.
Hybrid search combining keyword, semantic, and metadata filtering offers better results than pure embedding-based similarity.
The long-context vs. RAG debate is ongoing: context windows keep expanding, but large datasets still necessitate retrieval methods.
Building and maintaining knowledge graphs is the primary challenge for graph RAG, not merely the database technology used.
THE RISE AND FALL OF THE VECTOR DATABASE CATEGORY
The emergence of applications leveraging embeddings created a demand for specialized infrastructure to handle high-dimensional vector storage, indexing, and search. Companies like Pinecone pioneered this new category, positioning vector databases as essential for AI development, especially after ChatGPT's release. This led to rapid growth and significant investment, but the speaker argues the distinct category of 'vector databases' is now in decline, not the underlying technology or its applications.
CONVERGENCE AND THE INTEGRATION OF VECTOR CAPABILITIES
The core thesis is that specialized vector database infrastructure is becoming obsolete due to feature convergence. Existing traditional search engines like Elasticsearch and Vespa, as well as general-purpose databases like PostgreSQL (with extensions like pgvector), are increasingly incorporating robust vector search capabilities. This integration means developers can often leverage their existing data infrastructure instead of adopting a new, dedicated vector database solution, accelerating the decline of the standalone category.
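To make the convergence point concrete, here is a minimal sketch of the query pattern pgvector enables (the SQL table and column names are hypothetical), together with a brute-force plain-Python equivalent of the cosine-distance ordering that pgvector's `<=>` operator computes:

```python
import math

# Illustrative only: with the pgvector extension installed, a query like
# this can replace a dedicated vector database (table and column names
# are hypothetical):
#
#   SELECT id FROM docs ORDER BY embedding <=> '[0.1, 0.9]' LIMIT 5;
#
# pgvector's <=> operator is cosine distance. A brute-force equivalent
# of that ordering in plain Python:

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity, matching pgvector's <=> semantics."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def nearest(query: list[float], embeddings: list[list[float]], k: int = 5) -> list[int]:
    """Indices of the k stored embeddings closest to the query."""
    dists = [cosine_distance(query, e) for e in embeddings]
    return sorted(range(len(embeddings)), key=dists.__getitem__)[:k]

docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
```

In production, pgvector would use an approximate index rather than this exhaustive scan, but the ranking semantics are the same.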
EMBEDDINGS: CRUCIAL BUT NOT EXCLUSIVE TO VECTOR DBS
Embeddings are acknowledged as a vital tool for representing diverse data types and enabling semantic understanding. However, the speaker emphasizes that their importance does not inherently necessitate a dedicated vector database. Embeddings have moved from big tech research to mainstream use, but their application in similarity search is just one facet. More advanced search systems require additional signals like freshness and authority, moving beyond simple cosine similarity.
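The "additional signals" point can be sketched as a blended ranking function. The signal names (freshness, authority) come from the discussion above; the decay curve and weights are illustrative assumptions, not anything specified in the episode:

```python
import math

# Hedged sketch: production ranking rarely stops at cosine similarity.
# The half-life and weights below are illustrative assumptions.

def freshness(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponentially decay a document's freshness signal with age."""
    return math.exp(-age_days * math.log(2) / half_life_days)

def rank_score(similarity: float, age_days: float, authority: float,
               w_sim: float = 0.6, w_fresh: float = 0.2, w_auth: float = 0.2) -> float:
    """Blend semantic similarity with freshness and authority signals."""
    return w_sim * similarity + w_fresh * freshness(age_days) + w_auth * authority

# A slightly less similar but fresh, authoritative document can outrank
# a stale one that has the best raw cosine score:
stale = rank_score(similarity=0.95, age_days=365, authority=0.1)
fresh = rank_score(similarity=0.85, age_days=1, authority=0.9)
```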
SEARCH AS THE NATURAL ABSTRACTION FOR AI
The speaker advocates for 'search' as a more natural and user-friendly abstraction for connecting AI models with knowledge, particularly for RAG applications. Instead of focusing on the technical implementation details like embeddings and vector spaces, the emphasis should be on the search interface. This allows AI agents to dynamically choose appropriate search methods (keyword, semantic, codebase, web) based on the task, treating vector search as just one component of a broader retrieval strategy.
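The abstraction described above can be sketched as a single `search()` entry point that hides the tool choice from the caller. The backends and routing heuristics here are hypothetical stand-ins for what a real agent or router would do:

```python
from typing import Callable

# Sketch of "search as the abstraction": the caller invokes one search()
# function; whether it resolves to keyword, semantic, or code retrieval
# is an internal detail. Backends and routing rules are hypothetical.

def keyword_search(q: str) -> str: return f"keyword results for {q!r}"
def semantic_search(q: str) -> str: return f"semantic results for {q!r}"
def code_search(q: str) -> str: return f"codebase results for {q!r}"

BACKENDS: dict[str, Callable[[str], str]] = {
    "keyword": keyword_search,
    "semantic": semantic_search,
    "code": code_search,
}

def search(query: str) -> str:
    """Route a query to a backend; vector search is just one option."""
    if query.startswith("def ") or "()" in query:
        kind = "code"        # looks like a code snippet
    elif '"' in query:
        kind = "keyword"     # quoted phrase: exact matching
    else:
        kind = "semantic"    # default: embedding-based retrieval
    return BACKENDS[kind](query)
```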
THE NUANCES OF RETRIEVAL-AUGMENTED GENERATION (RAG)
While the vector database category may be declining, the principles behind RAG—augmenting AI with retrieval—remain highly relevant and are here to stay. The speaker clarifies that RAG is not dead, but rather the idea that it *must* rely on a dedicated vector database is flawed. For smaller datasets that fit within expanded context windows, traditional retrieval might be sufficient, reducing the need for complex vector indexing. The necessity and optimal implementation of RAG depend heavily on data volume, query load, and specific use cases.
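The "does this dataset even need vector indexing?" decision above can be sketched as a simple budget check. The context-window size and characters-per-token ratio below are rough assumptions, not figures from the episode:

```python
# Sketch of the "does it need RAG?" check: if the whole corpus fits in
# the model's context window, retrieval infrastructure may be overkill.
# Window size and chars-per-token ratio are rough assumptions.

CONTEXT_WINDOW_TOKENS = 1_000_000   # assumed long-context model
CHARS_PER_TOKEN = 4                 # rough average for English text

def fits_in_context(documents: list[str], reserve_tokens: int = 8_000) -> bool:
    """True if all documents (plus a reserve for prompt and answer) fit."""
    corpus_tokens = sum(len(d) for d in documents) // CHARS_PER_TOKEN
    return corpus_tokens + reserve_tokens <= CONTEXT_WINDOW_TOKENS
```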
HYBRID SEARCH AND THE MULTI-STAGE APPROACH
Effective search and RAG systems often employ a hybrid approach, combining keyword matching (like BM25) with semantic search via embeddings. Metadata filtering and re-ranking layers significantly improve result quality, moving beyond pure similarity metrics. Large-scale systems, such as those used in recommender engines, typically involve a cascade of retrieval and re-ranking stages, highlighting that embedding-based retrieval is a component of, not the entirety of, modern search solutions.
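One common way to combine a BM25-style keyword ranking with an embedding ranking is reciprocal rank fusion; the episode does not name a specific fusion method, so RRF is an illustrative choice here, and the document IDs are invented:

```python
# Hedged sketch of hybrid retrieval: fuse a keyword (BM25-style) ranking
# with an embedding-based ranking via reciprocal rank fusion (RRF).

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each doc scores the sum of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["d3", "d1", "d7"]    # keyword matches
vector_ranking = ["d1", "d9", "d3"]  # embedding neighbors
fused = rrf([bm25_ranking, vector_ranking])
```

A re-ranking stage (e.g. a cross-encoder) would then rescore only the top fused candidates, matching the cascade structure described above.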
KNOWLEDGE GRAPHS AND FUTURE OPPORTUNITIES
The creation and maintenance of knowledge graphs present the primary challenge for 'graph RAG,' not the specific database technology. While graph databases excel at traversing relationships, building the graph itself is complex. With LLMs now capable of more easily generating entities and triplets, knowledge graphs may become more feasible. Opportunities also exist in developing more domain-specific embedding models (for legal, finance, health) and leveraging visual language models for richer data representations, though the business models for such specialized API services remain challenging.
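Once an LLM has extracted (subject, relation, object) triplets, storing and traversing them needs no specialized database, which is the point above: the hard part is producing the triplets. A minimal sketch with invented example triplets:

```python
from collections import defaultdict

# Sketch: LLM-extracted triplets stored as a plain adjacency map.
# The triplets themselves are invented examples.

triplets = [
    ("Pinecone", "is_a", "vector database"),
    ("pgvector", "extends", "PostgreSQL"),
    ("PostgreSQL", "supports", "vector search"),
]

graph: dict[str, list[tuple[str, str]]] = defaultdict(list)
for subj, rel, obj in triplets:
    graph[subj].append((rel, obj))

def neighbors(entity: str) -> list[tuple[str, str]]:
    """Return (relation, object) edges out of an entity."""
    return graph.get(entity, [])
```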
Common Questions

Why is the vector database category declining?
The vector database category is declining because its features are converging into existing database technologies and traditional search engines. Many databases now offer vector search capabilities, making a separate vector database infrastructure category unnecessary for many use cases.
Mentioned in this video
A search engine technology that Jo Kristian Bergum has experience with, mentioned as having vector search capabilities.
A database that Jo Kristian Bergum was an angel investor in and helped with examples. It's seen as playing a role in promoting retrieval for AI.
A traditional search engine mentioned as having vector search capabilities, indicating a convergence in features.
A database system that offers vector search capabilities through extensions like pgvector, suggesting a convergence of database and search functionality.
A feature within PostgreSQL for running machine learning models alongside the database, which Jo Kristian Bergum is not bullish about.
A model mentioned for its large context window, suggesting that for some use cases with a substantial number of articles, a vector database might not be necessary.
The launch of ChatGPT in November 2022 significantly influenced the AI landscape, leading to increased interest in RAG and embeddings.
An extension for PostgreSQL that adds vector search capabilities, highlighted as a strong contender and example of database convergence.
APIs provided by companies like OpenAI that made embeddings accessible to every developer, contributing to their mainstream adoption.
Mentioned as a database company also pushing to offer vector search capabilities.
A pioneer in framing vector databases as a new infrastructure category; noted for its rapid rise and fall, with a focus too narrow to sustain a standalone category.
An upcoming company in the vector database space, mentioned as having similar models to Pinecone but different pricing, focusing on developers.
Mentioned for its recommender systems which use embedding-based retrieval, and for recently publishing its RAG system details.
Acquired Voyage AI, indicating an interest in the embedding technology space.
A prominent European startup in the RAG space, doing great work, especially with European languages.
Released a cookbook after ChatGPT's launch, suggesting developers connect ChatGPT with their data using embeddings, sparking the RAG trend.
Mentioned as one of the big tech companies that has long worked on embeddings for various tasks.
Mentioned as one of the big tech companies that has long worked on embeddings for various tasks.
Mentioned as a company that has also tried to move logic into the database, similar to PostgreSQL ML.
Mentioned as one of the big tech companies that has long worked on embeddings for various tasks.
A classical keyword matching algorithm for search that has been around for 30 years, considered a strong baseline for many search use cases.
A data structure representing entities and their relationships. Its complexity in building is a bottleneck, but LLMs might ease this process.
Utilizing knowledge graphs within a RAG system. The discussion highlights the challenge of building the graph itself and the potential for hybrid approaches.
Mentioned in the context of their cron service, illustrating the tension between what lives in the database versus external systems.
Mentioned as a hypothetical future model that could reignite the long context vs. RAG debate, potentially resolving it.
A successful NoSQL database company used as a benchmark for fundraising success; its fundraising total over its lifespan was less than half of what vector databases raised in a short period.
Mentioned as a company where Jo Kristian Bergum worked on search systems for 20 years, and also as a big tech company that has long worked on embeddings.
A database mentioned as a viable option for transactional data and vector storage, especially with its pgvector extension.