Key Moments
Stanford Robotics Seminar ENGR319 | Spring 2026 | Unlocking Autonomous Medical Robotics
Autonomous surgical robots are being developed to address surgeon shortages, but today's deployed systems have no inherent skill, and applying modern AI is hindered by surgery's complexity and safety demands.
Key Insights
The US faces a shortage of tens of thousands of surgeons and hundreds of thousands of nurses, creating a significant need for robotic assistance in healthcare.
Currently, all deployed surgical robots are teleoperated, meaning they possess zero inherent skill and rely entirely on the surgeon's expertise.
General robot autonomy models, like vision-language-action models, struggle in surgical settings due to scarce data, uncontrollable environments, and critical safety requirements.
A viable path to autonomous surgical robots involves building context awareness through advanced perception, incorporating multiple knowledge cues for lifelong learning, and finding common embodiments for large-scale deployment.
Developing autonomous surgical robots requires new hardware, such as multi-directional haptic gloves enabling precise force feedback, alongside sophisticated software for tasks like chronic wound care and autonomous bandage application.
Humanoid robots like 'Surgi' show potential for diverse healthcare roles beyond surgery, from remote avatars to autonomous nursing assistants, offering a more scalable solution than specialized robots.
The pressing need for surgical robots amid healthcare worker shortages
The healthcare industry faces a critical shortage of skilled labor: the US is short tens of thousands of surgeons and hundreds of thousands of nurses needed to meet patient demand. Robots offer a compelling solution due to their tireless nature, 24/7 availability, and potential for precision exceeding human capabilities. Unlike human training, which scales one practitioner at a time, robots can be updated fleet-wide, akin to autonomous vehicles, enabling programmable and uniform expertise across multiple platforms. This scalability is crucial for addressing the growing global need for medical care.
Limitations of current teleoperated surgical robots
Existing surgical robots, such as the well-established da Vinci system, are primarily teleoperated. In these systems, the surgeon controls the robot's instruments directly, typically through a joystick console. While these robots enhance surgeon precision and reduce fatigue, they do not address the fundamental issue of skilled labor shortages. In fact, robotic surgery often requires a larger support team than traditional surgery. Furthermore, these systems possess zero inherent skill, with all operational capability derived from the human surgeon.
Challenges in applying general AI to surgical autonomy
Recent advancements in general robot autonomy, particularly large foundation models and vision-language-action models, excel in controlled environments with abundant data and minimal risk. However, surgery presents a starkly different scenario. Key challenges include scarce and private data, uncontrollable and non-resettable environments, and paramount safety requirements. The limited number of expert surgeons available for demonstrations, the low incentive for data collection during patient care, and the absence of mature world models capable of accurately simulating surgical complexities make direct application of these general AI techniques difficult and risky. These models often perform in the mid-to-high 90% accuracy range in simpler tasks, but surgical environments demand near-perfect reliability.
A structured approach to surgical perception and context awareness
Achieving surgical autonomy hinges on robust perception and context awareness. This involves precise robot proprioception (knowing instrument positions), accurate object localization, detailed 3D scene reconstruction, and scene fusion to build a comprehensive environmental understanding. Surgical environments are exceptionally challenging for perception systems due to narrow fields of view, poor depth sensing (often relying on inferring depth from shadows or 2D images), specular reflections from fluids, smoke obscuring views, deformable anatomy, and the difficulty of identifying static anchor points. Despite these hurdles, sustained work over many years has produced techniques for precise tool tracking, needle tracking, and suture thread tracking, enabling robots to accurately grasp objects. Reconstructing deformable environments is crucial not only for tracking anatomy during procedures like tumor excision but also for informing physics-based actions.
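The scene-fusion idea can be illustrated with a minimal sketch: combining a tool-tip position from forward kinematics (proprioception) with a camera detection, using a scalar Kalman-style minimum-variance update. The numbers and variances here are invented for illustration, not taken from the talk.

```python
def fuse(kinematic_est, kin_var, camera_est, cam_var):
    """Minimum-variance fusion of two noisy estimates of the same quantity."""
    k = kin_var / (kin_var + cam_var)       # gain: weight toward the lower-variance source
    pos = kinematic_est + k * (camera_est - kinematic_est)
    var = (1.0 - k) * kin_var               # fused variance is below both inputs
    return pos, var

# Proprioception places the tool tip at 10.2 mm (variance 4.0); the camera
# sees 10.8 mm (variance 1.0). The fused estimate leans toward the camera.
pos, var = fuse(10.2, 4.0, 10.8, 1.0)
```

The same weighting generalizes to full 3D Kalman filters, which is the standard way to reconcile kinematic and visual estimates when each alone is unreliable.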
Modeling tissue physics for informed robotic actions
To bridge the gap between perception and action in surgery, understanding the physics and mechanics of deformable tissue is essential. This is often referred to as 'digital twinning.' Techniques like Position-Based Dynamics (PBD) are valuable because they can simulate deformable scenes faster than real-time and precisely satisfy position constraints. This speed allows for simulating multiple interaction possibilities, enabling model-based controllers. Critically, PBD can instantaneously match simulations to real-world observations from cameras by adjusting mechanical properties and camera pose, reducing prediction error from millimeters down to sub-millimeter accuracy. This approach is applicable to various structures, including fluid dynamics (estimating viscosity for hemorrhage control) and rope-like objects, informing manipulation of vessels and other tissues.
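The core PBD loop described above (predict positions, then iteratively project them back onto constraints) can be sketched in a few lines. This toy 2-D particle chain is an illustration of the general technique, not the talk's tissue simulator; the geometry, time step, and pinning of particle 0 are assumptions.

```python
import numpy as np

def pbd_step(x, v, constraints, rest, dt=0.01, iters=10, gravity=(0.0, -9.8)):
    """One Position-Based Dynamics step for a 2-D particle system.
    Particle 0 is pinned as a fixed anchor."""
    p = x + dt * v + dt * dt * np.asarray(gravity)   # predict positions
    p[0] = x[0]
    for _ in range(iters):                  # Gauss-Seidel constraint projection
        for (i, j), d0 in zip(constraints, rest):
            d = p[j] - p[i]
            dist = np.linalg.norm(d)
            if dist < 1e-9:
                continue
            corr = 0.5 * (dist - d0) * d / dist   # split correction between the pair
            if i != 0:
                p[i] += corr
            if j != 0:
                p[j] -= corr
    v = (p - x) / dt                        # recover velocities from position change
    return p, v

# A three-particle chain with unit rest lengths, swinging under gravity.
x = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
v = np.zeros_like(x)
cons, rest = [(0, 1), (1, 2)], [1.0, 1.0]
for _ in range(200):
    x, v = pbd_step(x, v, cons, rest)
```

Because the projection acts directly on positions, the constraints hold almost exactly after each step, which is what makes the method fast and stable enough to run ahead of real time for model-based control.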
Incorporating physics and learning for cutting and dissection
Once a digital twin of the tissue is established, robots can be instructed on how to cut and dissect. A key challenge is identifying tissue connections, which can be detected by observing sharp discontinuities in texture when pulling on tissue. Bayesian inference can further enhance this by quantifying uncertainty in connection locations, allowing for safety-aware control strategies. Robots can then autonomously explore areas to find connections while adhering to energy thresholds, preventing accidental tearing. For cutting tasks, model predictive control, informed by the physics simulator, can guide the robot. If a cut fails, the system can analyze the simulation to identify ineffective regions and precisely target them for re-attempt, demonstrating a more controlled recovery than emergent behaviors in general foundation models. This has been demonstrated with autonomous cutting on benchtop models such as chicken tissue.
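The Bayesian localization idea can be sketched in one dimension: maintain a posterior over candidate attachment sites and update it after each exploratory pull. The strip geometry, Gaussian sensing model, and probe sequence below are invented for illustration, not from the talk.

```python
import numpy as np

# Candidate attachment sites along a hypothetical 1-D strip of tissue.
sites = np.linspace(0.0, 1.0, 21)
posterior = np.full(len(sites), 1.0 / len(sites))   # uniform prior

def update(posterior, probe_pos, felt_tension, sigma=0.1):
    """Bayes update: tension is likely when the probe pulls near the attachment."""
    p_tension = np.exp(-0.5 * ((sites - probe_pos) / sigma) ** 2)
    like = p_tension if felt_tension else 1.0 - p_tension
    posterior = posterior * like
    return posterior / posterior.sum()

# Two probes feel nothing; two feel tension near x = 0.8.
for pos, felt in [(0.2, False), (0.5, False), (0.8, True), (0.75, True)]:
    posterior = update(posterior, pos, felt)
best = sites[np.argmax(posterior)]   # most probable attachment location
```

The spread of the posterior, not just its peak, is what enables safety-aware control: the robot can cap pulling energy wherever the attachment location remains uncertain.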
Advancing autonomy through knowledge-grounded reinforcement learning
While individual modules for perception, modeling, planning, and control can be engineered, scaling autonomous surgery requires a more integrated approach. Lifelong learning and combining learned or engineered behaviors are crucial. Knowledge-grounded reinforcement learning uses knowledge modules (representing specific actions like scanning with a camera or performing a cut) that are glued together by a sparse neural network. Through reinforcement learning, the system learns how to sequence these modules to accomplish longer tasks. This approach has been demonstrated on tasks from the Fundamentals of Laparoscopic Surgery curriculum, such as grasping needles and transferring objects, and can even learn multi-throw suturing. The ability to transfer learned skills from simulation to real-world robots (sim-to-real transfer) is a key component.
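The sequencing idea can be illustrated with a toy tabular Q-learning loop, standing in for the knowledge-grounded RL described above. The three-phase task, module names, and rewards here are invented; the point is only that RL can learn which knowledge module to invoke in which state.

```python
import random

# Toy task: sequence three knowledge modules to complete a needle transfer.
# States: 0 = start, 1 = needle located, 2 = needle grasped, 3 = done.
MODULES = ["scan_camera", "grasp_needle", "transfer_arm"]

def step(state, action):
    """Environment: each module only succeeds in the right phase."""
    if state == 0 and action == 0: return 1, 1.0
    if state == 1 and action == 1: return 2, 1.0
    if state == 2 and action == 2: return 3, 10.0
    return state, -1.0   # wrong module: small penalty, no progress

Q = [[0.0] * 3 for _ in range(4)]
random.seed(0)
for _ in range(500):                       # epsilon-greedy Q-learning episodes
    s = 0
    for _ in range(10):
        greedy = max(range(3), key=lambda i: Q[s][i])
        a = random.randrange(3) if random.random() < 0.2 else greedy
        s2, r = step(s, a)
        Q[s][a] += 0.5 * (r + 0.9 * max(Q[s2]) - Q[s][a])
        s = s2
        if s == 3:
            break

policy = [MODULES[max(range(3), key=lambda i: Q[s][i])] for s in range(3)]
```

In the real system each "action" is itself an engineered or learned behavior, so the RL problem is over short module sequences rather than raw motor commands, which is what makes long-horizon tasks like multi-throw suturing tractable.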
Humanoid robots and versatile human-robot teaming in healthcare
The diversity of robotic platforms, from specialized surgical arms to emerging humanoid robots, presents a scalability challenge. Humanoid robots, like the speaker's 'Surgi' robot, offer a compelling alternative. Initially focused on non-medical feats, these robots are now being explored for healthcare applications, including remote surgery avatars, autonomous surgical assistants, and nursing tasks. Unlike expensive, single-purpose surgical systems (e.g., da Vinci at over $3 million), humanoid robots could potentially perform a wider range of assistive roles, reducing the need for multiple specialized robots. While teleoperation of humanoid hands is difficult without precise tactile sensing, reinforcement learning enables them to learn grasping and manipulation of diverse instruments, including articulated tools like forceps and tongs.
The critical role of haptics and multi-directional force feedback
Effective human-robot interaction, especially in medical contexts, requires advanced haptic feedback. Manipulating delicate instruments or even everyday objects like buttons can be challenging with limited robot hand sensing. Traditional robot hands often have fewer degrees of freedom, size mismatches, and critically, limited tactile sensing. To address this, novel hardware has been developed, including haptic gloves that provide directional force feedback to each finger. This capability is crucial for tasks requiring precise force application, such as pushing buttons or sliding across surfaces, and is essential for enabling more intuitive teleoperation and for robots to learn complex manipulation tasks through reinforcement learning. This allows robots to hold and activate tools effectively, even those with triggers.
Addressing chronic wounds and improving patient dignity through robotics
Beyond surgical theaters, robots can address pressing needs in long-term care, such as managing chronic wounds. Approximately 2% of the North American population suffers from chronic wounds, requiring frequent bandage changes that place a significant burden on patients and caregivers. Robots, equipped with deformable modeling and optimization-based control, can be trained to perform these tasks autonomously. This includes peeling off bandages in a way that minimizes skin stretching and pain, as well as preparing and applying new dressings. While still early-stage, this work aims to restore patient dignity and independence by providing autonomous assistance for complex daily care routines. For example, 3D-printed fingernails were developed so the robot can remove adhesive tape.
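The peel-angle tradeoff behind pain-minimizing bandage removal can be sketched with Kendall's classic peel relation. This is an illustrative model, not the controller from the talk: real wound care must also account for skin deformation and adhesive rate effects.

```python
import math

def skin_shear(theta_deg, adhesion=1.0):
    """In-plane (stretching) load on the skin while peeling at a given angle.
    Kendall's relation F * (1 - cos(theta)) = w gives the peel force F needed
    to overcome adhesion energy w; F * cos(theta) is the component that
    stretches the skin."""
    t = math.radians(theta_deg)
    peel_force = adhesion / (1.0 - math.cos(t))
    return abs(peel_force * math.cos(t))

# Search candidate peel angles for the one that loads the skin least.
best = min(range(10, 180, 5), key=skin_shear)
```

Under this model the shear component vanishes when the pull is normal to the skin, which matches the intuition that peeling straight up, close to the surface, stretches the skin least per unit of adhesion overcome.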
Future directions: integrating AI, improving hardware, and clinical collaboration
The path forward for autonomous medical robotics involves combining advanced perception (including potentially ultrasound imaging), robust physics-based modeling, sophisticated planning and control, and novel hardware. While foundation models offer powerful tools for vision and segmentation, their application in surgery requires specialized integration. The development of more sensitive and robust tactile sensors remains a significant, albeit challenging, material science problem. Crucially, all advancements must be developed in close collaboration with clinical teams to ensure relevance, safety, and efficacy. Future research will likely focus on creating more generalizable and adaptable robotic systems that retain explainability and ensure human supervision remains paramount, even as AI capabilities advance.
Common Questions
Why are robots needed in healthcare?
Robots are needed to address the growing shortage of skilled healthcare workers, including surgeons and nurses, and can offer advantages like 24/7 availability, precision beyond human capability, and the ability to scale expertise through fleet-wide updates.
Mentioned in this video
da Vinci — A prevalent teleoperated surgical robot that has been in use for 25-30 years, serving as a benchmark for comparison with newer autonomous systems.
A specific humanoid robot model used in a study, noted for having limitations in its range of motion compared to the da Vinci system, affecting teleoperation performance.
An early robotic system used for orthopedic surgery, specifically for aligning implants, cited as an example of early surgical autonomy concepts.
Position-Based Dynamics (PBD) — A simulation method using particles and constraints, capable of running faster than real-time and satisfying position constraints exactly, used for simulating deformable scenes in robotics.
Bayesian inference — A framework used to attribute uncertainty to predicted tissue connections, enabling a safety-aware control approach for robots.
A technique that allows a simulation to match real-world observations by backpropagating the loss between simulation and camera images, enabling accurate estimation of tissue mechanics.