How does Co-Scientist differ from traditional LLMs like ChatGPT or Gemini?

Unlike traditional LLMs that perform 'System 1' thinking (fast, intuitive responses), Co-Scientist engages in 'System 2' thinking (slower, deliberate, rigorous). It uses a multi-agent architecture for structured scientific reasoning, going beyond simple pattern matching.

What is the core technology behind Co-Scientist?

Co-Scientist utilizes advanced Gemini models within an agentic framework. It employs a loop of agents performing tasks like hypothesis generation, review, critique, ranking, and refinement, inspired by self-play and reinforcement learning techniques.

How does Co-Scientist generate scientific hypotheses?

Co-Scientist uses a team of agents that engage in simulated scientific debates. These agents continuously generate, refine, review, and critique hypotheses, incorporating new knowledge and reward signals to improve over time.

Can Co-Scientist handle complex, multi-step scientific problems?

Yes, Co-Scientist is designed for complex tasks over long time horizons. It can break down problems, explore different directions, and even suggest experimental protocols or validate existing hypotheses, as demonstrated in several case studies.

How are the hypotheses generated by Co-Scientist validated?

Validation involves real-world laboratory experiments conducted by human scientists. Examples include testing drug repurposing hypotheses for cancer, validating epigenomic targets for liver fibrosis, and examining gene interactions in various diseases.

What is the role of the human scientist when using Co-Scientist?

The human scientist remains in the driver's seat, guiding the system by defining research goals, providing context, and interpreting the AI's output. The goal is a collaborative partnership where AI accelerates discovery and humans provide expertise and critical judgment.

How does Co-Scientist ensure the quality and novelty of its hypotheses?

The system uses a ranking agent that orchestrates debates between hypotheses and computes Elo scores, similar to chess tournaments. This helps prioritize the most promising and novel ideas, while also maintaining epistemic humility by indicating confidence levels and uncertainties.

Can Co-Scientist discover entirely new scientific concepts or breakthroughs?

The system has demonstrated the ability to make novel connections, such as identifying a new potato immunoprotein, suggesting unexpected drug repurposing candidates, and filling gaps in known disease mechanisms, potentially leading to breakthroughs.

What are the safety mechanisms in Co-Scientist?

Co-Scientist employs multi-layered safety checks early in the prompt definition and continuously monitors generated ideas to prevent nefarious or unsafe research directions, halting computation if safety thresholds are breached.
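As a rough illustration of the two layers described above (a check on the research goal up front, then continuous monitoring of generated ideas with a halt on breach), here is a minimal sketch. Everything in it is an assumption made for illustration: the `SafetyHalt` exception, the keyword-based `risk_score`, the `BLOCKED_TERMS` list, and the threshold are hypothetical stand-ins, and the talk does not describe how Co-Scientist actually scores risk.

```python
class SafetyHalt(Exception):
    """Raised to stop the run when a safety threshold is breached."""


# Illustrative placeholder list only; a real system would not rely on keywords.
BLOCKED_TERMS = ("nerve agent", "enhance transmissibility")


def risk_score(text: str) -> float:
    """Hypothetical scorer: fraction of blocked terms found in the text."""
    lowered = text.lower()
    return sum(term in lowered for term in BLOCKED_TERMS) / len(BLOCKED_TERMS)


def check_goal(goal: str, threshold: float = 0.0) -> None:
    """Layer 1: screen the research goal before any computation starts."""
    if risk_score(goal) > threshold:
        raise SafetyHalt(f"research goal rejected: {goal!r}")


def monitored(ideas, threshold: float = 0.0):
    """Layer 2: pass ideas through one by one, halting on the first breach."""
    for idea in ideas:
        if risk_score(idea) > threshold:
            raise SafetyHalt(f"generated idea rejected: {idea!r}")
        yield idea


check_goal("repurpose approved drugs for acute myeloid leukemia")
safe_ideas = list(monitored(["test drug X in cell lines", "screen kinase inhibitors"]))
```

The generator form means downstream agents never see an idea that failed the check, and the exception stops the whole run rather than silently skipping the idea.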

How does Co-Scientist handle knowledge cutoffs or predict future events?

While isolating exact knowledge cutoffs is challenging due to data leakage, Co-Scientist can be used to predict outcomes of ongoing clinical trials with reasonable success, leveraging available knowledge, though proprietary data can be a limitation.

What are the implications of AI for scientific publishing and peer review?

The flood of AI-generated papers and the use of AI in peer review present challenges. Developing better mechanisms for sharing discoveries and standardizing scientific data formats is crucial, alongside careful oversight to avoid bias in AI-filtered reviews.

Key Moments

Stanford CS25: Transformers United V6 | Advancing Science and Medicine with Collaborative AI Agents

Stanford Online

Education|5 min read|67 min video

May 27, 2026|3,271 views|87|7

Stanford, Stanford Online, Transformers, AI, Artificial Intelligence

TL;DR

AI agents can now generate novel scientific hypotheses, but current LLMs largely perform system-one thinking; scientific breakthroughs require AI capable of system-two thinking.

Key Insights

The co-scientist AI system, utilizing a multi-agent Gemini-based architecture, assists researchers by systematically generating and refining novel hypotheses for complex scientific challenges.

While current large language models excel at system-one thinking (fast, intuitive correlations), true scientific discovery requires system-two thinking (slow, deliberate, rigorous analysis).

AlphaFold, while revolutionary, is a specialized system, highlighting the need for generality in AI for true scientific superintelligence.

The co-scientist system employs a self-play, self-debate mechanism inspired by AlphaGo and AlphaZero, where agents refine hypotheses through continuous peer review and critique.

A validated hypothesis from the co-scientist system led to a peer-reviewed publication in Advanced Science, demonstrating its potential in real-world research.

The system has shown promise in identifying new drug repurposing candidates for cancers and novel epigenomic targets for liver fibrosis.

AI as a collaborative partner for scientific and medical experts

The core mission is to develop general-purpose AI systems that act as collaborative partners for scientists and doctors. This involves accelerating the pace of scientific discovery and democratizing medical expertise. The co-scientist project, a multi-agent Gemini-based system, aims to assist researchers by systematically generating and refining novel hypotheses for complex scientific challenges. This initiative builds upon earlier work, such as Med-PaLM, an early medically tuned large language model that achieved passing and expert-level scores on US medical licensing exams, underscoring the potential of AI in specialized domains.

From passing medical exams to hypothesis generation

The genesis of the co-scientist project stemmed from a realization that while LLMs like Med-PaLM were adept at tasks like question answering and summarization, their potential for hypothesis generation was largely untapped. Initially, the idea was met with skepticism due to the perceived limitations of LLMs, particularly their tendency towards hallucinations and surface-level correlations, characteristic of 'system-one' thinking. However, the project moved forward with the understanding that scientific discovery often requires a more deliberate and rigorous 'system-two' thinking process, which the team aimed to build into AI systems.

Defining and pursuing artificial general scientific intelligence

The pursuit of a truly superintelligent scientific AI requires generality – the ability to tackle a wide range of problems. Unlike specialized models like AlphaFold, which excels at protein structure prediction but cannot address other scientific questions, a general AI should understand and make progress on diverse challenges. This generality is akin to the human brain's remarkable ability to engage in various cognitive tasks, from language and art to science and philosophy. The key building block for this generality is argued to be natural language, as evidenced by the broad capabilities of current LLMs like Gemini, Claude, and GPT. While these models demonstrate generality, their application to complex scientific tasks like hypothesis generation is still in its early stages.

The self-play and self-debate mechanism for hypothesis refinement

The co-scientist system adopts an approach inspired by DeepMind's success with AlphaGo and AlphaZero, utilizing a self-play and reinforcement learning framework. Instead of games, however, the environment consists of scientific problems. The core mechanism involves a team of agents engaging in continuous 'scientific debates' and 'self-debates.' These agents generate, refine, review, critique, and rank hypotheses. This multi-agent setup, powered by advanced Gemini models, simulates a structured, rigorous thinking process that mirrors aspects of human scientific thought, aiming for a more deliberate and effective approach to hypothesis generation.

Architectural overview of the co-scientist system

Co-scientist functions as a general-purpose multi-agent system for scientific discovery. The human scientist remains in the driver's seat, guiding the system through natural language prompts specifying research goals, constraints, and preferences. This input forms the context for the AI's computation, which can dynamically proceed for minutes, hours, or even days. The output is a research report containing hypotheses or solutions. Internally, the system operates on a loop with four primary functions: generating ideas, reviewing them, ranking them, and evolving them. Each agent is configured with specific strategies, drawing from a 'library of strategies' that can be inspired by expert human thinking and refined over time. This architecture allows for a continuous feedback loop and self-improvement.
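The four-function loop described above (generate, review, rank, evolve) can be sketched in a few lines. This is a toy under stated assumptions, not the system's implementation: the `Hypothesis` class, `run_loop`, and the `toy_*` agents are hypothetical names, and the placeholder reviewer scores by text length where a real reviewer would be a model call.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    score: float = 0.0
    generation: int = 0


def rank(goal, pool, review):
    """Review every hypothesis, then sort best-first."""
    for h in pool:
        h.score = review(goal, h)
    return sorted(pool, key=lambda h: h.score, reverse=True)


def run_loop(goal, generate, review, evolve, rounds=2, keep=2):
    """Generate once, then repeat review -> rank -> evolve for `rounds`."""
    pool = generate(goal)
    for _ in range(rounds):
        survivors = rank(goal, pool, review)[:keep]
        pool = survivors + [evolve(h) for h in survivors]
    return rank(goal, pool, review)


# Toy agents (assumptions): stand-ins for model-backed agents.
def toy_generate(goal):
    return [Hypothesis(t) for t in ("drug A", "drug A plus drug B", "diet")]


def toy_review(goal, h):
    return float(len(h.text))  # placeholder critic, not a real quality measure


def toy_evolve(h):
    return Hypothesis(h.text + " (refined)", generation=h.generation + 1)


report = run_loop("treat liver fibrosis", toy_generate, toy_review, toy_evolve)
print(report[0].text)
```

Passing the agents in as plain functions mirrors the idea of configuring each agent with its own strategy: swapping in a different `review` or `evolve` changes the behavior without touching the loop.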

The role of the ranking agent and epistemic humility

A crucial component of the co-scientist system is the ranking agent, which uses a debate mechanism to prioritize hypotheses. This is vital because experts often have more ideas than resources, making it essential to surface only the most promising ones. The system computes Elo scores for hypotheses, similar to competitive games, to rank them based on defined criteria. Furthermore, the system emphasizes 'epistemic humility,' conveying its confidence levels and identifying key uncertainties. This ensures that the generated hypotheses are presented with appropriate context, allowing scientists to focus their attention effectively and guiding future research directions.
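To make the Elo idea concrete, the sketch below applies the standard Elo update from chess to pairwise "debates" between hypotheses. The K-factor of 32, the starting rating of 1200, the round-robin pairing, and the `judge` callback are all assumptions for illustration; the talk says only that Elo scores are computed from debates, not how matches are scheduled or judged.

```python
from itertools import combinations

K = 32  # update step size (assumed; a common chess default)


def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def update(r_a: float, r_b: float, a_won: bool, k: float = K) -> tuple[float, float]:
    """Return the new ratings for A and B after one pairwise debate."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


def run_tournament(hypotheses, judge, start: float = 1200.0):
    """Round-robin: every pair debates once; judge(a, b) is True if a wins."""
    ratings = {h: start for h in hypotheses}
    for a, b in combinations(hypotheses, 2):
        ratings[a], ratings[b] = update(ratings[a], ratings[b], judge(a, b))
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical judge: prefers the longer statement. A real judge would be
# a model comparing the two hypotheses against the stated criteria.
ranking = run_tournament(["a", "bbb", "cc"], judge=lambda a, b: len(a) > len(b))
print(ranking)
```

Because each update adds to one rating exactly what it subtracts from the other, the total rating in the pool stays constant, and a win against a higher-rated hypothesis moves the ratings more than a win against a lower-rated one.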

Validation through real-world scientific discovery examples

The efficacy of the co-scientist system has been demonstrated through several validation studies. In one instance, the system generated hypotheses that closely mirrored a significant, yet unpublished, discovery by researchers at Imperial College London concerning antimicrobial resistance, leading to a peer-reviewed publication. Other examples include identifying new drug repurposing candidates for acute myeloid leukemia, discovering novel epigenomic targets for liver fibrosis, and even designing de novo proteins with specific activities, sometimes integrating tools like AlphaFold. These case studies highlight the system's ability to contribute to genuine scientific breakthroughs across various domains.

Bridging human expertise with AI's broad exploration

The co-scientist system facilitates a powerful synergy between human expertise and AI-driven exploration. For instance, when identifying potential treatments for liver fibrosis, the AI suggested drugs from a cancer research context, a connection a human liver expert might not readily make. Similarly, in analyzing complex data like protein structures, the AI can identify novel patterns that lead to discovering previously unknown biological entities, such as a massive potato immune protein. This complementarity, where AI offers breadth and unexpected connections, and humans provide deep domain expertise for validation and judgment, represents a new paradigm for AI-human scientific collaboration.

Co-Scientist: Enhancing Scientific Discovery

Practical takeaways from this episode

Do This

Define clear research goals with sufficient detail for the AI.

Provide constraints, rubrics, and preferences to guide hypothesis generation.

Leverage multimodal data like PDFs and experimental results as context.

Utilize the AI's ability to explore diverse scientific domains and make novel connections.

Engage with the detailed reports, focusing on prioritized hypotheses.

Refine the problem or break it down if the AI gets stuck, especially in mathematical problems.

Implement layered safety checks for research prompts and generated ideas.

Collaborate with the AI to leverage its broad search capabilities and your deep expertise.

Avoid This

Do not expect the AI to solve problems beyond current scientific knowledge (e.g., time machines).

Do not rely solely on standard LLMs for complex, structured scientific thinking; agentic systems are more robust.

Do not simply generate many ideas; prioritize and validate them to respect expert time.

Do not ignore the AI's uncertainty estimations or confidence levels.

Do not underestimate the potential for unexpected connections between different scientific fields.

Common Questions

What is Co-Scientist?

Co-Scientist is an AI system designed to act as a collaborative partner for scientists. It uses a multi-agent approach to generate, critique, rank, and refine scientific hypotheses, aiming to accelerate discovery and provide novel insights.

Topics

AI Agents, AI & Machine Learning, Technology & Innovation, Science & Mathematics, Drug Discovery, Protein Design, Scientific Discovery, Multi-Agent Systems, Hypothesis Generation, Computational Biology, Biomedical Research

Mentioned in this video

Organizations

Google DeepMind

Research laboratory where Vivek works, focusing on AI in science and medicine.

Project AMIE

A research program co-led by Vivek aiming to build and democratize medical superintelligence.

Facebook AI Research

The research division where Vivek previously worked, focusing on multimodal assistant systems.

Harvard T.H. Chan School of Public Health

An institution where Vivek is on the faculty for executive education.

Houston Methodist Hospital

A hospital in Texas where physician scientists used Co-Scientist for cancer drug discovery.

Mass General Hospital

A hospital where researchers studied the link between ACE inhibitors and the risk of Alzheimer's.

Software & Apps

Med-PaLM

An early AI system developed by Vivek and his team that achieved passing scores on the US medical licensing exam.

Med-PaLM 2

A subsequent version of the Med-PaLM system that achieved expert-level scores on the US medical licensing exam.

Co-Scientist

An AI system designed to act as a collaborative partner for scientists, aiming to accelerate discovery.

PaLM

A precursor model to Gemini, used in early experiments for hypothesis generation.

Gemini

Google's advanced AI model, mentioned as the successor to PaLM and a current tool for various tasks.

ChatGPT

A popular AI chatbot, mentioned in comparison to Gemini.

AlphaFold

A highly specialized AI model for predicting protein structures, cited as an example of capability but not generality.

AlphaGo

An AI program that mastered the game of Go, used as an example of self-play and reinforcement learning.

Deep Blue

An AI chess-playing computer that competed against Garry Kasparov in 1996 and 1997.

AlphaZero

An advanced version of AlphaGo that demonstrated the power of self-play and reinforcement learning in complex environments.

AlphaStar

An AI system developed by DeepMind that achieved high performance in complex strategy games like StarCraft II.

Gemini models

Google's latest AI models, used to power the Co-Scientist agents, possessing long context, multimodal, and agentic tool-use capabilities.

B2R receptor

A receptor on brain cells triggered by bradykinin, leading to neurodegeneration.

DHX9

A gene identified by Co-Scientist as potentially linking neurodegenerative diseases with small cell lung cancer.

SRRM4

A gene identified by Co-Scientist as potentially linking neurodegenerative diseases with small cell lung cancer.

arXiv

A preprint server that introduced a new policy regarding hallucinated references in submitted papers.

Books

Nature

A scientific journal where the Med-PaLM paper was published in 2023.

Advanced Science

A journal where a peer-reviewed paper based on an AI-generated hypothesis was published.

People

Gary Peltz

A professor at Stanford who suggested using LLMs for hypothesis generation.

Garry Kasparov

World chess champion, famously competed against Deep Blue, highlighted for his general intelligence.

Demis Hassabis

CEO of DeepMind, who commented on Garry Kasparov's general intelligence compared to Deep Blue.

David Deutsch

Mentioned as a source of inspiration for developing strategies for scientific idea generation through podcasts.

Imperial researchers

Researchers at Imperial College who had a breakthrough in antimicrobial resistance and collaborated on testing Co-Scientist.

Shinya Yamanaka

Nobel laureate recognized for discovering Yamanaka factors, used for cell rejuvenation.

Filippo Beleggia

A professor in Germany who validated the Co-Scientist's hypothesis regarding DHX9 and SRRM4 in small cell lung cancer.

Media

The Thinking Game

A documentary chronicling the history of DeepMind, mentioned in the context of AI generality.

Companies

DeepMind

AI research lab that created AlphaGo and AlphaFold, whose history is chronicled in 'The Thinking Game'.

OpenAI

The company behind ChatGPT, mentioned in relation to its potential integration or collaboration with Google's systems.

Concepts

Yamanaka factors

A set of transcription factors discovered by Shinya Yamanaka that can rejuvenate cells and were used as a basis for designing improved proteins.

Drugs & Medications

ACE inhibitors

Commonly used blood pressure medications that were found to be potentially associated with a heightened risk of Alzheimer's disease.
