skip to main content

Title: Collaborative Suturing: A Reinforcement Learning Approach to Automate Hand-off Task in Suturing for Surgical Robots
Over the past decade, Robot-Assisted Surgeries (RAS), have become more prevalent in facilitating successful operations. Of the various types of RAS, the domain of collaborative surgery has gained traction in medical research. Prominent examples include providing haptic feedback to sense tissue consistency, and automating sub-tasks during surgery such as cutting or needle hand-off - pulling and reorienting the needle after insertion during suturing. By fragmenting suturing into automated and manual tasks the surgeon could essentially control the process with one hand and also circumvent workspace restrictions imposed by the control interface present at the surgeon's side during the operation. This paper presents an exploration of a discrete reinforcement learning-based approach to automate the needle hand-off task. Users were asked to perform a simple running suture using the da Vinci Research Kit. The user trajectory was learnt by generating a sparse reward function and deriving an optimal policy using Q-learning. Trajectories obtained from three learnt policies were compared to the user defined trajectory. The results showed a root-mean-square error of [0.0044mm, 0.0027mm, 0.0020mm] in ℝ 3 . Additional trajectories from varying initial positions were produced from a single policy to simulate repeated passes of the hand-off task.
Authors:
; ; ; ; ;
Award ID(s):
1637759 1927275
Publication Date:
NSF-PAR ID:
10207764
Journal Name:
2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
Page Range or eLocation-ID:
1380 to 1386
Sponsoring Org:
National Science Foundation
More Like this
  1. An important problem in designing human-robot systems is the integration of human intent and performance in the robotic control loop, especially in complex tasks. Bimanual coordination is a complex human behavior that is critical in many fine motor tasks, including robot-assisted surgery. To fully leverage the capabilities of the robot as an intelligent and assistive agent, online recognition of bimanual coordination could be important. Robotic assistance for a suturing task, for example, will be fundamentally different during phases when the suture is wrapped around the instrument (i.e., making a c- loop), than when the ends of the suture are pulled apart. In this study, we develop an online recognition method of bimanual coordination modes (i.e., the directions and symmetries of right and left hand movements) using geometric descriptors of hand motion. We (1) develop this framework based on ideal trajectories obtained during virtual 2D bimanual path following tasks performed by human subjects operating Geomagic Touch haptic devices, (2) test the offline recognition accuracy of bi- manual direction and symmetry from human subject movement trials, and (3) evalaute how the framework can be used to characterize 3D trajectories of the da Vinci Surgical System’s surgeon-side manipulators during bimanual surgical training tasks.more »In the human subject trials, our geometric bimanual movement classification accuracy was 92.3% for movement direction (i.e., hands moving together, parallel, or away) and 86.0% for symmetry (e.g., mirror or point symmetry). We also show that this approach can be used for online classification of different bimanual coordination modes during needle transfer, making a C loop, and suture pulling gestures on the da Vinci system, with results matching the expected modes. Finally, we discuss how these online estimates are sensitive to task environment factors and surgeon expertise, and thus inspire future work that could leverage adaptive control strategies to enhance user skill during robot-assisted surgery.« less
  2. Despite significant developments in the design of surgical robots and automated techniques for objective evalua- tion of surgical skills, there are still challenges in ensuring safety in robot-assisted minimally-invasive surgery (RMIS). This pa- per presents a runtime monitoring system for the detection of executional errors during surgical tasks through the analysis of kinematic data. The proposed system incorporates dual Siamese neural networks and knowledge of surgical context, including surgical tasks and gestures, their distributional similarities, and common error modes, to learn the differences between normal and erroneous surgical trajectories from small training datasets. We evaluate the performance of the error detection using Siamese networks compared to single CNN and LSTM networks trained with different levels of contextual knowledge and training data, using the dry-lab demonstrations of the Suturing and Needle Passing tasks from the JIGSAWS dataset. Our results show that gesture specific task nonspecific Siamese networks obtain micro F1 scores of 0.94 (Siamese-CNN) and 0.95 (Siamese-LSTM), and perform better than single CNN (0.86) and LSTM (0.87) networks. These Siamese networks also outperform gesture nonspecific task specific Siamese-CNN and Siamese-LSTM models for Suturing and Needle Passing.
  3. The recent development of Robot-Assisted Minimally Invasive Surgery (RAMIS) has brought much benefit to ease the performance of complex Minimally Invasive Surgery (MIS) tasks and lead to more clinical outcomes. Compared to direct master-slave manipulation, semi-autonomous control for the surgical robot can enhance the efficiency of the operation, particularly for repetitive tasks. However, operating in a highly dynamic in-vivo environment is complex. Supervisory control functions should be included to ensure flexibility and safety during the autonomous control phase. This paper presents a haptic rendering interface to enable supervised semi-autonomous control for a surgical robot. Bayesian optimization is used to tune user-specific parameters during the surgical training process. User studies were conducted on a customized simulator for validation. Detailed comparisons are made between with and without the supervised semi-autonomous control mode in terms of the number of clutching events, task completion time, master robot end-effector trajectory and average control speed of the slave robot. The effectiveness of the Bayesian optimization is also evaluated, demonstrating that the optimized parameters can significantly improve users' performance. Results indicate that the proposed control method can reduce the operator's workload and enhance operation efficiency.
  4. Playing the cup-and-ball game is an intriguing task for robotics research since it abstracts important problem characteristics including system nonlinearity, contact forces and precise positioning as terminal goal. In this paper, we present a learning model based control strategy for the cup-and-ball game, where a Universal Robots UR5e manipulator arm learns to catch a ball in one of the cups on a Kendama. Our control problem is divided into two sub-tasks, namely (i) swinging the ball up in a constrained motion, and (ii) catching the free-falling ball. The swing-up trajectory is computed offline, and applied in open-loop to the arm. Subsequently, a convex optimization problem is solved online during the ball’s free-fall to control the manipulator and catch the ball. The controller utilizes noisy position feedback of the ball from an Intel RealSense D435 depth camera. We propose a novel iterative framework, where data is used to learn the support of the camera noise distribution iteratively in order to update the control policy. The probability of a catch with a fixed policy is computed empirically with a user specified number of roll-outs. Our design guarantees that probability of the catch increases in the limit, as the learned support nears themore »true support of the camera noise distribution. High-fidelity Mujoco simulations and preliminary experimental results support our theoretical analysis« less
  5. One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy. A known problem with this “off-policy” approach is that the robot’s errors compound when drifting away from the supervisor’s demonstrations. On-policy, techniques alleviate this by iteratively collecting corrective actions for the current robot policy. However, these techniques can be tedious for human supervisors, add significant computation burden, and may visit dangerous states during training. We propose an off-policy approach that injects noise into the supervisor’s policy while demonstrating. This forces the supervisor to demonstrate how to recover from errors. We propose a new algorithm, DART (Disturbances for Augmenting Robot Trajectories), that collects demonstrations with injected noise, and optimizes the noise level to approximate the error of the robot’s trained policy during data collection. We compare DART with DAgger and Behavior Cloning in two domains: in simulation with an algorithmic supervisor on the MuJoCo tasks (Walker, Humanoid, Hopper, Half-Cheetah) and in physical experiments with human supervisors training a Toyota HSR robot to perform grasping in clutter. For high dimensional tasks like Humanoid, DART can be up to 3x faster in computation time and only decreases the supervisor’s cumulative reward by 5%more »during training, whereas DAgger executes policies that have 80% less cumulative reward than the supervisor. On the grasping in clutter task, DART obtains on average a 62% performance increase over Behavior Cloning.« less