How does Deep Research differ from standard Google Search?

While standard Google Search is ideal for specific, known queries, Deep Research excels when a question has multiple facets or requires synthesizing information from various sources. It's designed for journeys where users might otherwise open dozens of tabs and feel overwhelmed.

Can users influence the research plan in Deep Research?

Yes, Deep Research presents a research plan for user review, allowing them to edit or add steps. This interactive process helps steer the research and ensures it aligns with the user's specific goals, similar to how an intern would clarify a task.

How does Deep Research browse the web and process information?

Behind the scenes, Deep Research uses tools to perform searches and explore web pages. It can parallelize substeps and reason based on information found in previous turns, iteratively building the report until all steps are completed.

What kind of insights can be gained from a Deep Research report?

Beyond just factual information, Deep Research can identify deeper insights, such as differing philosophies between regulatory bodies (e.g., the US's reactive vs. the EU's precautionary approach to food safety).

What are the different types of follow-up questions users can ask after a Deep Research report?

Users can ask for missing factoids that were already captured during browsing, initiate new deep research on related topics (like comparing different regions), or ask to modify the existing report by condensing or adding sections.

Can Deep Research handle complex queries that might exceed its context window?

Deep Research utilizes advanced models with large context windows. For extremely long interactions, it has retrieval mechanisms (RAG setup) if the information cannot be natively held in context, though the preference is to use the long context when possible.

How is Deep Research evaluated for performance and quality?

Evaluation involves both automated metrics (e.g., time spent on planning, number of iterative steps) and extensive human evaluations. The human review focuses on comprehensiveness, completeness, and groundedness, guided by an 'ontology of use cases' to cover various research patterns.

Are there trade-offs between speed and thoroughness in Deep Research?

Yes, there's a trade-off. While faster responses are typical for Google products, Deep Research's value may increase with longer processing times that allow for more exploration and verification. Users sometimes perceive longer wait times as indicative of more thorough work.

What are the future directions for agents like Deep Research?

Future developments include enhanced personalization based on user knowledge level, multimodality (charts, images, etc., in outputs), better integration with private data sources, and improved agents that can discover new ideas rather than just summarizing the web.

Should users be able to adjust the amount of verification versus search time in Deep Research?

While explicit toggles are discussed, the ideal scenario may be for the model to intelligently infer the user's need for accuracy versus exploration based on the query and conversation history, though this is still an area of research.

What is the biggest challenge in developing advanced research agents?

A key challenge is enabling agents to not just retrieve and summarize information but to discover genuinely new ideas or hypotheses. This requires advanced reasoning capabilities and often needs environments where these hypotheses can be tested and verified within specific domains.

Key Moments

Why is everyone cloning Deep Research?

Latent Space Podcast

Science & Technology | 5 min read | 61 min video

Feb 18, 2025|4,324 views|93|7

TL;DR

Gemini's Deep Research agent is a personal research assistant that synthesizes web information into detailed reports, allowing for iterative refinement and follow-up questions.

Key Insights

Deep Research acts as a personal AI research assistant, capable of synthesizing web information into comprehensive reports within minutes.

The tool generates a research plan that users can review and edit, allowing for steerability and transparency in the research process.

It employs a multi-step process involving parallel web browsing, information synthesis, and iterative refinement, including self-critique.

While built on Gemini 1.5 Pro, Deep Research involves specialized post-training and orchestration to ensure consistent and reliable performance for research tasks.

The feature allows for conversational follow-up questions to fetch missing information, initiate new deep research, or modify the existing report.

Evaluation of Deep Research involves a mix of automated metrics and human evals, focusing on dimensions like comprehensiveness, completeness, and groundedness across diverse use case ontologies.

THE CORE FUNCTIONALITY OF DEEP RESEARCH

Gemini's Deep Research is designed to function as a personal AI research assistant. Its primary purpose is to help users gain a deep understanding of any topic quickly, taking them from zero to a significant level of knowledge in a short period. The agent browses the web for approximately five minutes to gather information, then outputs a detailed research report. Users can review this report and ask follow-up questions to explore the topic further. This approach tackles information overload by providing a structured, synthesized output.

THE RESEARCH PLAN AND USER STEERING

A key innovative aspect of Deep Research is its initial generation of a research plan. This plan outlines the agent's strategy for tackling the user's query, offering a blueprint of the intended research steps. Users are given the opportunity to review this plan and make edits, thereby steering the direction of the research. This feature addresses complex queries by breaking them down into manageable facets and allows users to refine the scope, ensuring the research aligns with their specific needs and provides a more personalized outcome.
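As a rough illustration of a reviewable plan, the sketch below models a plan as an ordered list of steps that the user can edit before research starts. The `ResearchPlan` class and its method names are hypothetical and invented for this example; the episode does not describe how Deep Research represents plans internally.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchPlan:
    """Hypothetical container for a user-reviewable research plan."""
    query: str
    steps: list[str] = field(default_factory=list)
    approved: bool = False

    def add_step(self, text: str) -> None:
        self.steps.append(text)
        self.approved = False  # any edit requires a fresh approval

    def edit_step(self, index: int, text: str) -> None:
        self.steps[index] = text
        self.approved = False

    def approve(self) -> None:
        self.approved = True


# The user reviews the proposed plan, adds a step, rewords another, then approves.
plan = ResearchPlan("milk and meat regulation: US vs EU", ["Find EU rules"])
plan.add_step("Find FDA rules")
plan.edit_step(0, "Find European Commission rules")
plan.approve()
```

Resetting `approved` on every edit is one possible design choice: it keeps the agent from starting work on a plan the user has not seen in its final form.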

THE WEB BROWSING AND INFORMATION SYNTHESIS PROCESS

Behind the scenes, Deep Research utilizes a sophisticated process to gather and synthesize information. The agent identifies parallelizable substeps within the research plan, employing tools for web searches and in-depth page analysis. Crucially, it reasons iteratively, using information from previous turns to decide on subsequent actions, such as cross-referencing information across different sources like the EU Commission and FDA. This iterative process continues until all research steps are completed, leading into an analysis mode where the model drafts and refines the report, including self-critiquing to ensure quality.
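The loop described above can be sketched in a few lines, assuming that independent substeps are grouped into batches and that a callback decides on follow-up steps after each batch. The names `research`, `tool`, and `follow_up` are hypothetical, and a thread pool stands in for whatever parallelism the real system uses; this is not Deep Research's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def research(
    batches: list[list[str]],
    tool: Callable[[str], str],
    follow_up: Callable[[dict[str, str]], list[str]],
    max_rounds: int = 10,
) -> dict[str, str]:
    """Run batches of independent substeps in parallel, one batch per round."""
    findings: dict[str, str] = {}
    queue = [list(batch) for batch in batches]
    rounds = 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        while queue and rounds < max_rounds:
            batch = queue.pop(0)
            # Substeps within a batch do not depend on each other.
            results = list(pool.map(tool, batch))
            findings.update(zip(batch, results))
            # Decide on further steps using everything found so far.
            extra = follow_up(findings)
            if extra:
                queue.append(extra)
            rounds += 1
    return findings
```

In this sketch, a `follow_up` callback could return a cross-referencing step once both the EU and FDA findings are available, and an empty list once nothing more is needed. The `max_rounds` guard is an added safety limit so the loop always terminates.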

TECHNICAL IMPLEMENTATION AND MODELING

Deep Research is built upon Gemini 1.5 Pro, but it involves custom post-training and orchestration to achieve its specialized capabilities. The developers have focused on creating a responsive system that can handle complex, multi-turn research tasks. Challenges related to context window management and retrieval-augmented generation (RAG) have been addressed, with a preference for keeping recent and relevant research tasks within the model's long context for faster, more complex comparisons. The system aims for a balance between leveraging the model's internal knowledge and grounding information in external sources.
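The preference for long context with retrieval as a fallback can be illustrated with a small sketch. It assumes a word-count approximation of token usage and a naive keyword-overlap ranking; both are simplifications chosen for this example, and `build_context` is a hypothetical helper, not part of any Gemini API.

```python
def build_context(docs: list[str], question: str, budget: int) -> list[str]:
    """Return all docs if they fit the budget, else the most relevant ones."""

    def size(text: str) -> int:
        # Crude stand-in for a tokenizer: count whitespace-separated words.
        return len(text.split())

    # Preferred path: everything fits, so keep it all in context.
    if sum(size(d) for d in docs) <= budget:
        return list(docs)

    # Fallback path: rank by keyword overlap with the question and
    # greedily keep documents until the budget is used up.
    query_words = set(question.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    chosen: list[str] = []
    used = 0
    for doc in ranked:
        if used + size(doc) <= budget:
            chosen.append(doc)
            used += size(doc)
    return chosen
```

A production retrieval setup would typically use embeddings rather than keyword overlap; the point here is only the ordering of the two paths.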

USER INTERACTION AND ITERATIVE REFINEMENT

The user experience is designed to facilitate ongoing interaction and refinement. After the initial report is generated, users can ask follow-up questions to pull in missing facts, initiate entirely new deep research projects, or modify the existing report. The side-by-side interface, displaying the document and chat, supports this iterative process. The system preserves the context of all browsed sites, allowing it to quickly fetch information that was found but not initially included, or to branch off into new research areas based on user prompts.

EVALUATION, USE CASES, AND FUTURE DIRECTIONS

Evaluating Deep Research involves both automated metrics and human judgment, focusing on comprehensiveness, completeness, and groundedness. The team has developed an ontology of use cases, categorizing research behaviors from broad and shallow to specific and deep, to ensure robust evaluation across diverse user needs. Future directions include enhanced personalization based on user knowledge and learning journey, multimodal outputs (charts, maps, images), and integration with private data sources like personal documents and subscriptions, aiming to broaden the agent's applicability beyond the open web.

THE BALANCING ACT: LATENCY VS. DEPTH

A significant challenge in developing tools like Deep Research is balancing latency with the perceived value of in-depth analysis. Initially, there was a concern that users might prefer faster, albeit less comprehensive, results. However, testing revealed that users often value the perceived effort and thoroughness, even if it means a longer processing time. The system operates within a five-minute research window, with a hard stop to prevent excessive delays. It also faces the inverse temptation: because users associate longer processing with higher quality, there is pressure to let the agent run longer, a counterintuitive dynamic compared to many other AI products.
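A hard stop on research time can be sketched as a deadline checked before each step. The function name `research_with_deadline` and the injectable `clock` parameter are assumptions made for this example; the source only states that a five-minute window with a hard stop exists.

```python
import time
from typing import Callable


def research_with_deadline(
    steps: list[str],
    run_step: Callable[[str], str],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> list[str]:
    """Run steps in order until they finish or the time budget is spent."""
    deadline = clock() + budget_seconds
    completed: list[str] = []
    for step in steps:
        # The deadline is checked between steps, so a step that has
        # already started is allowed to finish.
        if clock() >= deadline:
            break
        completed.append(run_step(step))
    return completed
```

Calling it with `budget_seconds=300` would correspond to the five-minute window. Passing the clock as a parameter makes the cutoff easy to test without waiting in real time.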

INTEGRATION WITH GOOGLE ECOSYSTEM AND EXTERNAL TOOLS

The Deep Research feature is being integrated into the broader Google ecosystem, with capabilities like exporting reports to formats compatible with tools like Google Docs. Deep Research is distinct from Gemini Extensions, which allow Gemini to fetch content from other Google services (like Gmail or Calendar); it focuses purely on synthesizing web-based information. The developers aim to make Deep Research a seamless part of a user's workflow, enhancing productivity by providing deeply researched insights directly within their existing digital environment.

THE TECHNICAL ARCHITECTURE FOR AUTONOMOUS AGENTS

The underlying technical infrastructure for Deep Research is an asynchronous platform designed to handle multi-minute jobs and potential failures. This orchestration system maintains state, manages retries, and ensures the research journey continues even if interrupted. Unlike synchronous chat interactions, this asynchronous approach allows users to leave and return to their research sessions. The platform is built for flexibility, capable of modeling complex agent behaviors and supporting numerous LLM calls, providing a robust backbone for autonomous research tasks that span longer durations.
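The state-and-retry behavior described above can be sketched in plain Python, assuming the job's progress lives in a dictionary that could be persisted between runs. The `run_job` function, the state keys, and the choice of `RuntimeError` as the retryable failure are all hypothetical; this does not use or imitate the API of any orchestration platform named in the episode.

```python
from typing import Callable


def run_job(
    steps: list[str],
    run_step: Callable[[str], str],
    state: dict,
    max_retries: int = 2,
) -> dict:
    """Resume a multi-step job from state["next"], retrying failed steps."""
    while state["next"] < len(steps):
        step = steps[state["next"]]
        for attempt in range(max_retries + 1):
            try:
                state["results"].append(run_step(step))
                break
            except RuntimeError:
                # Give up on the whole job once retries are exhausted.
                if attempt == max_retries:
                    state["status"] = "failed"
                    return state
        # Advance only after the step succeeded, so a saved state
        # always points at the first unfinished step.
        state["next"] += 1
    state["status"] = "done"
    return state
```

A fresh job would start from `{"next": 0, "results": [], "status": "running"}`. Because all progress is in `state`, an interrupted session can be resumed by loading the saved dictionary and calling `run_job` again.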

BENCHMARKING AND THE QUEST FOR NOVEL DISCOVERY

While benchmarks are valuable for industrial progress and motivating researchers, Deep Research emphasizes solving real user problems over optimizing for specific benchmark scores, as many benchmarks may not directly translate to a superior product experience. A key evolving area is the agent's ability to not just summarize but to discover genuinely new ideas. The development of 'thinking' models with enhanced reasoning and self-critique capabilities is crucial. However, verifying novel hypotheses remains a challenge, especially in domains lacking established verification environments or synthetic playgrounds.

Common Questions

Deep Research is a feature within Gemini that acts as a personal research assistant. It takes a user's query, browses the web for approximately five minutes, and then generates a comprehensive research report for the user to review and ask follow-up questions.

Topics

AI Agents, AI & Machine Learning, Technology & Innovation, Science & Mathematics, Future of AI, Generative AI, Large Language Models, AI Development, Information Retrieval, Research Tools

Mentioned in this video

Locations

United States

Mentioned in the context of comparing milk and meat regulations with Europe.

Europe

Mentioned in the context of comparing milk and meat regulations with the US, highlighting a precautionary approach to regulation.

Software & Apps

Deep Research

A Google feature where Gemini acts as a personal research assistant, browsing the web for about 5 minutes to output a research report and answer follow-up questions. It aims to help users quickly learn about new topics with multiple facets.

NotebookLM

A previous Google AI product discussed as an inspiration and collaborator, focusing on providing a perfect IDE for working with documents and asking questions.

Gemini API

The API for the Gemini model, questioned by a host regarding whether Deep Research could be replicated using it.

Gemini 1.5 Pro

The specific version of Gemini powering Deep Research, with a discussion about its capabilities and potential special editions.

Google Assistant

Mentioned as a previous product where users enjoyed certain functionalities now being explored in Gemini extensions.

Gemini

Google's AI model that powers Deep Research, acting as a personal research assistant. It takes user queries, browses the web, generates a research plan, and outputs a comprehensive report.

Gemma

Google's open models that can be fine-tuned, mentioned in the context of replicating Deep Research functionality.

Devin

An agent mentioned by a host as a preferred user experience model, where the plan is visible and can be updated interactively during execution.

Google Docs

A Google product mentioned as a place for direct editing and exporting, integrating with Gemini's side panel.

Apache Airflow

A workflow management platform mentioned in the context of orchestration tools.

AWS Step Functions

A serverless orchestration service from Amazon Web Services, compared to Deep Research's internal tools.

Organizations

European Union

Discussed in relation to its precautionary approach to food regulation, contrasting with the US's reactive approach.

FDA

Mentioned in the context of how the US regulates food additives, contrasted with the European Commission.

Companies

OpenAI

A competitor company whose model routing feature and marketing approach are discussed in comparison to Google's Deep Research.

Spotify

A third-party integration mentioned as a potential use case for Gemini extensions, though less relevant for Deep Research itself.

Anthropic

A company mentioned as having a possible preview of a model routing feature, similar to what OpenAI might offer.

Temporal

A workflow management platform mentioned in the context of orchestration tools.

People

Kobe Bryant

Mentioned as an example of a potentially irrelevant benchmark that doesn't reflect real-world user experience.

Jason Calacanis

A podcast host mentioned for his feedback on Deep Research, suggesting that users might prefer waiting for a slower, more thorough response.

Bret Taylor

Mentioned as someone from Sierra who stated they built most of their product in-house.

Media

Latent Space Podcast

The podcast where this discussion takes place, featuring Alessio and swyx as hosts and Aarush Selvan and Mukund Sridhar as guests.
