Key Moments
Stanford Robotics Seminar ENGR319 | Spring 2026 | Unlocking Autonomous Medical Robotics
Autonomous surgical robots are being developed to address surgeon shortages, but today's deployed systems have no inherent skill, and applying modern AI is hindered by surgery's complexity and safety demands.
Key Insights
The US faces a shortage of tens of thousands of surgeons and hundreds of thousands of nurses, creating a significant need for robotic assistance in healthcare.
Currently, all deployed surgical robots are teleoperated, meaning they possess zero inherent skill and rely entirely on the surgeon's expertise.
General robot autonomy models, like vision-language-action models, struggle in surgical settings due to scarce data, uncontrollable environments, and critical safety requirements.
A viable path to autonomous surgical robots involves building context awareness through advanced perception, incorporating multiple knowledge cues for lifelong learning, and finding common embodiments for large-scale deployment.
Developing autonomous surgical robots requires new hardware, such as multi-directional haptic gloves enabling precise force feedback, alongside sophisticated software for tasks like chronic wound care and autonomous bandage application.
Humanoid robots like 'Surgi' show potential for diverse healthcare roles beyond surgery, from remote avatars to autonomous nursing assistants, offering a more scalable solution than specialized robots.
The pressing need for surgical robots amid healthcare worker shortages
The healthcare industry faces a critical shortage of skilled labor: the US is short tens of thousands of surgeons and hundreds of thousands of nurses needed to meet patient demand. Robots offer a compelling solution due to their tireless nature, 24/7 availability, and potential for precision exceeding human capabilities. Unlike human training, which scales one practitioner at a time, robots can be updated fleet-wide, akin to autonomous vehicles, enabling programmable and uniform expertise across multiple platforms. This scalability is crucial for addressing the growing global need for medical care.
Limitations of current teleoperated surgical robots
Existing surgical robots, such as the well-established da Vinci system, are primarily teleoperated. In these systems, the surgeon controls the robot's instruments directly, typically through a joystick console. While these robots enhance surgeon precision and reduce fatigue, they do not address the fundamental issue of skilled labor shortages. In fact, robotic surgery often requires a larger support team than traditional surgery. Furthermore, these systems possess zero inherent skill, with all operational capability derived from the human surgeon.
Challenges in applying general AI to surgical autonomy
Recent advancements in general robot autonomy, particularly large foundation models and vision-language-action models, excel in controlled environments with abundant data and minimal risk. However, surgery presents a starkly different scenario. Key challenges include scarce and private data, uncontrollable and non-resettable environments, and paramount safety requirements. The limited number of expert surgeons available for demonstrations, the low incentive for data collection during patient care, and the absence of mature world models capable of accurately simulating surgical complexities make direct application of these general AI techniques difficult and risky. These models often perform in the mid-to-high 90% accuracy range in simpler tasks, but surgical environments demand near-perfect reliability.
A structured approach to surgical perception and context awareness
Achieving surgical autonomy hinges on robust perception and context awareness. This involves precise robot proprioception (knowing instrument positions), accurate object localization, detailed 3D scene reconstruction, and scene fusion to build a comprehensive environmental understanding. Surgical environments are exceptionally challenging for perception systems due to narrow fields of view, poor depth sensing (often relying on inferring depth from shadows or 2D images), specular reflections from fluids, smoke obscuring views, deformable anatomy, and the difficulty of identifying static anchor points. Despite these hurdles, sustained work over many years has produced techniques for precise tool tracking, needle tracking, and suture thread tracking, enabling robots to accurately grasp objects. Reconstructing deformable environments is crucial not only for tracking anatomy during procedures like tumor excision but also for informing physics-based actions.
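The scene-fusion idea can be illustrated with a minimal sketch: combining a tool-tip position from forward kinematics (proprioception) with a camera detection, using a scalar Kalman-style minimum-variance update. The numbers and variances here are invented for illustration, not taken from the talk.

```python
def fuse(kinematic_est, kin_var, camera_est, cam_var):
    """Minimum-variance fusion of two noisy estimates of the same quantity."""
    k = kin_var / (kin_var + cam_var)       # gain: weight toward the lower-variance source
    pos = kinematic_est + k * (camera_est - kinematic_est)
    var = (1.0 - k) * kin_var               # fused variance is below both inputs
    return pos, var

# Proprioception places the tool tip at 10.2 mm (variance 4.0); the camera
# sees 10.8 mm (variance 1.0). The fused estimate leans toward the camera.
pos, var = fuse(10.2, 4.0, 10.8, 1.0)
```

The same weighting generalizes to full 3D Kalman filters, which is the standard way to reconcile kinematic and visual estimates when each alone is unreliable.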
Modeling tissue physics for informed robotic actions
To bridge the gap between perception and action in surgery, understanding the physics and mechanics of deformable tissue is essential. This is often referred to as 'digital twinning.' Techniques like Position-Based Dynamics (PBD) are valuable because they can simulate deformable scenes faster than real-time and precisely satisfy position constraints. This speed allows for simulating multiple interaction possibilities, enabling model-based controllers. Critically, PBD can instantaneously match simulations to real-world observations from cameras by adjusting mechanical properties and camera pose, reducing prediction error from millimeters down to sub-millimeter accuracy. This approach is applicable to various structures, including fluid dynamics (estimating viscosity for hemorrhage control) and rope-like objects, informing manipulation of vessels and other tissues.
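The core PBD loop described above (predict positions, then iteratively project them back onto constraints) can be sketched in a few lines. This toy 2-D particle chain is an illustration of the general technique, not the talk's tissue simulator; the geometry, time step, and pinning of particle 0 are assumptions.

```python
import numpy as np

def pbd_step(x, v, constraints, rest, dt=0.01, iters=10, gravity=(0.0, -9.8)):
    """One Position-Based Dynamics step for a 2-D particle system.
    Particle 0 is pinned as a fixed anchor."""
    p = x + dt * v + dt * dt * np.asarray(gravity)   # predict positions
    p[0] = x[0]
    for _ in range(iters):                  # Gauss-Seidel constraint projection
        for (i, j), d0 in zip(constraints, rest):
            d = p[j] - p[i]
            dist = np.linalg.norm(d)
            if dist < 1e-9:
                continue
            corr = 0.5 * (dist - d0) * d / dist   # split correction between the pair
            if i != 0:
                p[i] += corr
            if j != 0:
                p[j] -= corr
    v = (p - x) / dt                        # recover velocities from position change
    return p, v

# A three-particle chain with unit rest lengths, swinging under gravity.
x = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
v = np.zeros_like(x)
cons, rest = [(0, 1), (1, 2)], [1.0, 1.0]
for _ in range(200):
    x, v = pbd_step(x, v, cons, rest)
```

Because the projection acts directly on positions, the constraints hold almost exactly after each step, which is what makes the method fast and stable enough to run ahead of real time for model-based control.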
Incorporating physics and learning for cutting and dissection
Once a digital twin of the tissue is established, robots can be instructed on how to cut and dissect. A key challenge is identifying tissue connections, which can be detected by observing sharp discontinuities in texture when pulling on tissue. Bayesian inference can further enhance this by quantifying uncertainty in connection locations, allowing for safety-aware control strategies. Robots can then autonomously explore areas to find connections while adhering to energy thresholds, preventing accidental tearing. For cutting tasks, model predictive control, informed by the physics simulator, can guide the robot. If a cut fails, the system can analyze the simulation to identify ineffective regions and precisely target them for re-attempt, demonstrating a more controlled recovery than emergent behaviors in general foundation models. This has been demonstrated with autonomous cutting on benchtop models such as chicken tissue.
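The Bayesian localization idea can be sketched in one dimension: maintain a posterior over candidate attachment sites and update it after each exploratory pull. The strip geometry, Gaussian sensing model, and probe sequence below are invented for illustration, not from the talk.

```python
import numpy as np

# Candidate attachment sites along a hypothetical 1-D strip of tissue.
sites = np.linspace(0.0, 1.0, 21)
posterior = np.full(len(sites), 1.0 / len(sites))   # uniform prior

def update(posterior, probe_pos, felt_tension, sigma=0.1):
    """Bayes update: tension is likely when the probe pulls near the attachment."""
    p_tension = np.exp(-0.5 * ((sites - probe_pos) / sigma) ** 2)
    like = p_tension if felt_tension else 1.0 - p_tension
    posterior = posterior * like
    return posterior / posterior.sum()

# Two probes feel nothing; two feel tension near x = 0.8.
for pos, felt in [(0.2, False), (0.5, False), (0.8, True), (0.75, True)]:
    posterior = update(posterior, pos, felt)
best = sites[np.argmax(posterior)]   # most probable attachment location
```

The spread of the posterior, not just its peak, is what enables safety-aware control: the robot can cap pulling energy wherever the attachment location remains uncertain.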
Advancing autonomy through knowledge-grounded reinforcement learning
While individual modules for perception, modeling, planning, and control can be engineered, scaling autonomous surgery requires a more integrated approach. Lifelong learning and combining learned or engineered behaviors are crucial. Knowledge-grounded reinforcement learning uses knowledge modules (representing specific actions like scanning with a camera or performing a cut) that are glued together by a sparse neural network. Through reinforcement learning, the system learns how to sequence these modules to accomplish longer tasks. This approach has been demonstrated on tasks from the Fundamentals of Laparoscopic Surgery curriculum, such as grasping needles and transferring objects, and can even learn multi-throw suturing. The ability to transfer learned skills from simulation to real-world robots (sim-to-real transfer) is a key component.
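The sequencing idea can be illustrated with a toy tabular Q-learning loop, standing in for the knowledge-grounded RL described above. The three-phase task, module names, and rewards here are invented; the point is only that RL can learn which knowledge module to invoke in which state.

```python
import random

# Toy task: sequence three knowledge modules to complete a needle transfer.
# States: 0 = start, 1 = needle located, 2 = needle grasped, 3 = done.
MODULES = ["scan_camera", "grasp_needle", "transfer_arm"]

def step(state, action):
    """Environment: each module only succeeds in the right phase."""
    if state == 0 and action == 0: return 1, 1.0
    if state == 1 and action == 1: return 2, 1.0
    if state == 2 and action == 2: return 3, 10.0
    return state, -1.0   # wrong module: small penalty, no progress

Q = [[0.0] * 3 for _ in range(4)]
random.seed(0)
for _ in range(500):                       # epsilon-greedy Q-learning episodes
    s = 0
    for _ in range(10):
        greedy = max(range(3), key=lambda i: Q[s][i])
        a = random.randrange(3) if random.random() < 0.2 else greedy
        s2, r = step(s, a)
        Q[s][a] += 0.5 * (r + 0.9 * max(Q[s2]) - Q[s][a])
        s = s2
        if s == 3:
            break

policy = [MODULES[max(range(3), key=lambda i: Q[s][i])] for s in range(3)]
```

In the real system each "action" is itself an engineered or learned behavior, so the RL problem is over short module sequences rather than raw motor commands, which is what makes long-horizon tasks like multi-throw suturing tractable.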
Humanoid robots and versatile human-robot teaming in healthcare
The diversity of robotic platforms, from specialized surgical arms to emerging humanoid robots, presents a scalability challenge. Humanoid robots, like the speaker's 'Surgi' robot, offer a compelling alternative. Initially focused on non-medical feats, these robots are now being explored for healthcare applications, including remote surgery avatars, autonomous surgical assistants, and nursing tasks. Unlike expensive, single-purpose surgical systems (e.g., da Vinci at over $3 million), humanoid robots could potentially perform a wider range of assistive roles, reducing the need for multiple specialized robots. While teleoperation of humanoid hands is difficult without precise tactile sensing, reinforcement learning enables them to learn grasping and manipulation of diverse instruments, including articulated tools like forceps and tongs.
The critical role of haptics and multi-directional force feedback
Effective human-robot interaction, especially in medical contexts, requires advanced haptic feedback. Manipulating delicate instruments or even everyday objects like buttons can be challenging with limited robot hand sensing. Traditional robot hands often have fewer degrees of freedom, size mismatches, and critically, limited tactile sensing. To address this, novel hardware has been developed, including haptic gloves that provide directional force feedback to each finger. This capability is crucial for tasks requiring precise force application, such as pushing buttons or sliding across surfaces, and is essential for enabling more intuitive teleoperation and for robots to learn complex manipulation tasks through reinforcement learning. This allows robots to hold and activate tools effectively, even those with triggers.
Addressing chronic wounds and improving patient dignity through robotics
Beyond surgical theaters, robots can address pressing needs in long-term care, such as managing chronic wounds. Approximately 2% of the North American population suffers from chronic wounds, requiring frequent bandage changes that place a significant burden on patients and caregivers. Robots, equipped with deformable modeling and optimization-based control, can be trained to perform these tasks autonomously. This includes peeling off bandages in a way that minimizes skin stretching and pain, as well as preparing and applying new dressings. While still early-stage, this work aims to restore patient dignity and independence by providing autonomous assistance for complex daily care routines. For example, 3D-printed fingernails were developed so the robot can remove adhesive tape.
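The peel-angle tradeoff behind pain-minimizing bandage removal can be sketched with Kendall's classic peel relation. This is an illustrative model, not the controller from the talk: real wound care must also account for skin deformation and adhesive rate effects.

```python
import math

def skin_shear(theta_deg, adhesion=1.0):
    """In-plane (stretching) load on the skin while peeling at a given angle.
    Kendall's relation F * (1 - cos(theta)) = w gives the peel force F needed
    to overcome adhesion energy w; F * cos(theta) is the component that
    stretches the skin."""
    t = math.radians(theta_deg)
    peel_force = adhesion / (1.0 - math.cos(t))
    return abs(peel_force * math.cos(t))

# Search candidate peel angles for the one that loads the skin least.
best = min(range(10, 180, 5), key=skin_shear)
```

Under this model the shear component vanishes when the pull is normal to the skin, which matches the intuition that peeling straight up, close to the surface, stretches the skin least per unit of adhesion overcome.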
Future directions: integrating AI, improving hardware, and clinical collaboration
The path forward for autonomous medical robotics involves combining advanced perception (including potentially ultrasound imaging), robust physics-based modeling, sophisticated planning and control, and novel hardware. While foundation models offer powerful tools for vision and segmentation, their application in surgery requires specialized integration. The development of more sensitive and robust tactile sensors remains a significant, albeit challenging, material science problem. Crucially, all advancements must be developed in close collaboration with clinical teams to ensure relevance, safety, and efficacy. Future research will likely focus on creating more generalizable and adaptable robotic systems that retain explainability and ensure human supervision remains paramount, even as AI capabilities advance.
Common Questions
Why are robots needed in healthcare?
Robots are needed to address the growing shortage of skilled healthcare workers, including surgeons and nurses, and can offer advantages like 24/7 availability, precision beyond human capability, and the ability to scale expertise through fleet-wide updates.
Mentioned in this video
da Vinci — A prevalent teleoperated surgical robot that has been in use for 25-30 years, serving as a benchmark for comparison with newer autonomous systems.
A specific humanoid robot model used in a study, noted for having limitations in its range of motion compared to the da Vinci system, affecting teleoperation performance.
An early robotic system used for orthopedic surgery, specifically for aligning implants, cited as an example of early surgical autonomy concepts.
Position-Based Dynamics (PBD) — A simulation method using particles and constraints, capable of running faster than real-time and satisfying position constraints exactly, used for simulating deformable scenes in robotics.
Bayesian inference — A framework used to attribute uncertainty to predicted tissue connections, enabling a safety-aware control approach for robots.
A technique that allows a simulation to match real-world observations by backpropagating the loss between simulation and camera images, enabling accurate estimation of tissue mechanics.