skip to main content

Title: ASAP-CORPS: A Semi-Autonomous Platform for COntact-Rich Precision Surgery
ABSTRACT Introduction

Remote military operations require rapid response times for effective relief and critical care. Yet, the military theater is under austere conditions, so communication links are unreliable and subject to physical and virtual attacks and degradation at unpredictable times. Immediate medical care at these austere locations requires semi-autonomous teleoperated systems, which enable the completion of medical procedures even under interrupted networks while isolating the medics from the dangers of the battlefield. However, to achieve autonomy for complex surgical and critical care procedures, robots require extensive programming or massive libraries of surgical skill demonstrations to learn effective policies using machine learning algorithms. Although such datasets are achievable for simple tasks, providing a large number of demonstrations for surgical maneuvers is not practical. This article presents a method for learning from demonstration, combining knowledge from demonstrations to eliminate reward shaping in reinforcement learning (RL). In addition to reducing the data required for training, the self-supervised nature of RL, in conjunction with expert knowledge-driven rewards, produces more generalizable policies tolerant to dynamic environment changes. A multimodal representation for interaction enables learning complex contact-rich surgical maneuvers. The effectiveness of the approach is shown using the cricothyroidotomy task, as it is a standard procedure seen in critical care to open the airway. In addition, we also provide a method for segmenting the teleoperator’s demonstration into subtasks and classifying the subtasks using sequence modeling.

Materials and Methods

A database of demonstrations for the cricothyroidotomy task was collected, comprising six fundamental maneuvers referred to as surgemes. The dataset was collected by teleoperating a collaborative robotic platform—SuperBaxter, with modified surgical grippers. Then, two learning models are developed for processing the dataset—one for automatic segmentation of the task demonstrations into a sequence of surgemes and the second for classifying each segment into labeled surgemes. Finally, a multimodal off-policy RL with rewards learned from demonstrations was developed to learn the surgeme execution from these demonstrations.


The task segmentation model has an accuracy of 98.2%. The surgeme classification model using the proposed interaction features achieved a classification accuracy of 96.25% averaged across all surgemes compared to 87.08% without these features and 85.4% using a support vector machine classifier. Finally, the robot execution achieved a task success rate of 93.5% compared to baselines of behavioral cloning (78.3%) and a twin-delayed deep deterministic policy gradient with shaped rewards (82.6%).


Results indicate that the proposed interaction features for the segmentation and classification of surgical tasks improve classification accuracy. The proposed method for learning surgemes from demonstrations exceeds popular methods for skill learning. The effectiveness of the proposed approach demonstrates the potential for future remote telemedicine on battlefields.

more » « less
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Military Medicine
Medium: X Size: p. 412-419
["p. 412-419"]
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    ABSTRACT Introduction Short response time is critical for future military medical operations in austere settings or remote areas. Such effective patient care at the point of injury can greatly benefit from the integration of semi-autonomous robotic systems. To achieve autonomy, robots would require massive libraries of maneuvers collected with the goal of training machine learning algorithms. Although this is attainable in controlled settings, obtaining surgical data in austere settings can be difficult. Hence, in this article, we present the Dexterous Surgical Skill (DESK) database for knowledge transfer between robots. The peg transfer task was selected as it is one of the six main tasks of laparoscopic training. In addition, we provide a machine learning framework to evaluate novel transfer learning methodologies on this database. Methods A set of surgical gestures was collected for a peg transfer task, composed of seven atomic maneuvers referred to as surgemes. The collected Dexterous Surgical Skill dataset comprises a set of surgical robotic skills using the four robotic platforms: Taurus II, simulated Taurus II, YuMi, and the da Vinci Research Kit. Then, we explored two different learning scenarios: no-transfer and domain-transfer. In the no-transfer scenario, the training and testing data were obtained from the same domain; whereas in the domain-transfer scenario, the training data are a blend of simulated and real robot data, which are tested on a real robot. Results Using simulation data to train the learning algorithms enhances the performance on the real robot where limited or no real data are available. The transfer model showed an accuracy of 81% for the YuMi robot when the ratio of real-tosimulated data were 22% to 78%. For the Taurus II and the da Vinci, the model showed an accuracy of 97.5% and 93%, respectively, training only with simulation data. Conclusions The results indicate that simulation can be used to augment training data to enhance the performance of learned models in real scenarios. This shows potential for the future use of surgical data from the operating room in deployable surgical robots in remote areas. 
    more » « less
  2. Abstract Background Analyzing single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the identification of cell types. With the availability of a huge amount of single cell sequencing data and discovering more and more cell types, classifying cells into known cell types has become a priority nowadays. Several methods have been introduced to classify cells utilizing gene expression data. However, incorporating biological gene interaction networks has been proved valuable in cell classification procedures. Results In this study, we propose a multimodal end-to-end deep learning model, named sigGCN, for cell classification that combines a graph convolutional network (GCN) and a neural network to exploit gene interaction networks. We used standard classification metrics to evaluate the performance of the proposed method on the within-dataset classification and the cross-dataset classification. We compared the performance of the proposed method with those of the existing cell classification tools and traditional machine learning classification methods. Conclusions Results indicate that the proposed method outperforms other commonly used methods in terms of classification accuracy and F1 scores. This study shows that the integration of prior knowledge about gene interactions with gene expressions using GCN methodologies can extract effective features improving the performance of cell classification. 
    more » « less
  3. It has been shown that intraoperative stress can have a negative effect on surgeon surgical skills during laparoscopic procedures. For novice surgeons, stressful conditions can lead to significantly higher velocity, acceleration, and jerk of the surgical instrument tips, resulting in faster but less smooth movements. However, it is still not clear which of these kinematic features (velocity, acceleration, or jerk) is the best marker for identifying the normal and stressed conditions. Therefore, in order to find the most significant kinematic feature that is affected by intraoperative stress, we implemented a spatial attention-based Long Short-Term Memory (LSTM) classifier. In a prior IRB approved experiment, we collected data from medical students performing an extended peg transfer task who were randomized into a control group and a group performing the task under external psychological stresses. In our prior work, we obtained “representative” normal or stressed movements from this dataset using kinematic data as the input. In this study, a spatial attention mechanism is used to describe the contribution of each kinematic feature to the classification of normal/stressed movements. We tested our classifier under Leave-One-User-Out (LOUO) cross-validation, and the classifier reached an overall accuracy of 77.11% for classifying “representative” normal and stressed movements using kinematic features as the input. More importantly, we also studied the spatial attention extracted from the proposed classifier. Velocity and acceleration on both sides had significantly higher attention for classifying a normal movement ([Formula: see text]); Velocity ([Formula: see text]) and jerk ([Formula: see text]) on nondominant hand had significant higher attention for classifying a stressed movement, and it is worthy noting that the attention of jerk on nondominant hand side had the largest increment when moving from describing normal movements to stressed movements ([Formula: see text]). In general, we found that the jerk on nondominant hand side can be used for characterizing the stressed movements for novice surgeons more effectively. 
    more » « less
  4. Abstract Background

    Analysing kinematic and video data can help identify potentially erroneous motions that lead to sub‐optimal surgeon performance and safety‐critical events in robot‐assisted surgery.


    We develop a rubric for identifying task and gesture‐specific executional and procedural errors and evaluate dry‐lab demonstrations of suturing and needle passing tasks from the JIGSAWS dataset. We characterise erroneous parts of demonstrations by labelling video data, and use distribution similarity analysis and trajectory averaging on kinematic data to identify parameters that distinguish erroneous gestures.


    Executional error frequency varies by task and gesture, and correlates with skill level. Some predominant error modes in each gesture are distinguishable by analysing error‐specific kinematic parameters. Procedural errors could lead to lower performance scores and increased demonstration times but also depend on surgical style.


    This study provides insights into context‐dependent errors that can be used to design automated error detection mechanisms and improve training and skill assessment.

    more » « less

    Classification of perioperative risk is important for patient care, resource allocation, and guiding shared decision-making. Using discriminative features from the electronic health record (EHR), machine-learning algorithms can create digital phenotypes among heterogenous populations, representing distinct patient subpopulations grouped by shared characteristics, from which we can personalize care, anticipate clinical care trajectories, and explore therapies. We hypothesized that digital phenotypes in preoperative settings are associated with postoperative adverse events including in-hospital and 30-day mortality, 30-day surgical redo, intensive care unit (ICU) admission, and hospital length of stay (LOS).


    We identified all laminectomies, colectomies, and thoracic surgeries performed over a 9-year period from a large hospital system. Seventy-seven readily extractable preoperative features were first selected from clinical consensus, including demographics, medical history, and lab results. Three surgery-specific datasets were built and split into derivation and validation cohorts using chronological occurrence. Consensusk-means clustering was performed independently on each derivation cohort, from which phenotypes’ characteristics were explored. Cluster assignments were used to train a random forest model to assign patient phenotypes in validation cohorts. We reconducted descriptive analyses on validation cohorts to confirm the similarity of patient characteristics with derivation cohorts, and quantified the association of each phenotype with postoperative adverse events by using the area under receiver operating characteristic curve (AUROC). We compared our approach to American Society of Anesthesiologists (ASA) alone and investigated a combination of our phenotypes with the ASA score.


    A total of 7251 patients met inclusion criteria, of which 2770 were held out in a validation dataset based on chronological occurrence. Using segmentation metrics and clinical consensus, 3 distinct phenotypes were created for each surgery. The main features used for segmentation included urgency of the procedure, preoperative LOS, age, and comorbidities. The most relevant characteristics varied for each of the 3 surgeries. Low-risk phenotype alpha was the most common (2039 of 2770, 74%), while high-risk phenotype gamma was the rarest (302 of 2770, 11%). Adverse outcomes progressively increased from phenotypes alpha to gamma, including 30-day mortality (0.3%, 2.1%, and 6.0%, respectively), in-hospital mortality (0.2%, 2.3%, and 7.3%), and prolonged hospital LOS (3.4%, 22.1%, and 25.8%). When combined with the ASA score, digital phenotypes achieved higher AUROC than the ASA score alone (hospital mortality: 0.91 vs 0.84; prolonged hospitalization: 0.80 vs 0.71).


    For 3 frequently performed surgeries, we identified 3 digital phenotypes. The typical profiles of each phenotype were described and could be used to anticipate adverse postoperative events.

    more » « less