Title: Autonomous Neurosurgical Instrument Segmentation Using End-to-End Learning
Monitoring surgical instruments is an essential task in computer-assisted interventions and surgical robotics. It is also important for navigation, data analysis, skill assessment, and surgical workflow analysis in conventional surgery. However, there are no standard datasets and benchmarks for tool identification in neurosurgery. To this end, we are releasing a novel neurosurgical instrument segmentation dataset called NeuroID for advancing research in the field. Delineating surgical tools from the background requires accurate pixel-wise instrument segmentation. In this paper, we present a comparison between three encoder-decoder approaches to binary segmentation of neurosurgical instruments, where we classify each pixel in the image as either tool or background. A baseline performance was obtained by using heuristics to combine extracted features. We also extend the analysis to a publicly available robotic instrument segmentation dataset and include its results. The source code for our methods and the neurosurgical instrument dataset will be made publicly available to facilitate reproducibility.
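The pixel-wise formulation above maps naturally onto an encoder-decoder network trained with a per-pixel binary loss. Below is a minimal sketch of that setup in PyTorch; the toy architecture, sizes, and names are illustrative assumptions, not the three approaches compared in the paper.

```python
# Minimal encoder-decoder sketch for binary (tool vs. background)
# segmentation; illustrative only.
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                          # downsample to H/2, W/2
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),                      # one logit per pixel
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))          # (N, 1, H, W) logits

model = TinyEncoderDecoder()
criterion = nn.BCEWithLogitsLoss()                    # pixel-wise tool/background
x = torch.randn(2, 3, 64, 64)                         # dummy RGB frames
y = torch.randint(0, 2, (2, 1, 64, 64)).float()       # dummy binary masks
loss = criterion(model(x), y)
loss.backward()
```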
Award ID(s):
1637444
PAR ID:
10117603
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Accurate semantic image segmentation from medical imaging can enable intelligent vision-based assistance in robot-assisted minimally invasive surgery. The human body and surgical procedures are highly dynamic. While machine vision presents a promising approach, training image sets large enough for robust performance are either costly or unavailable. This work examines three novel generative adversarial network (GAN) methods for producing usable synthetic tool images from only surgical background images and a few real tool images. The best of the three generates realistic tool textures while preserving local background content by incorporating both a style-preservation and a content-loss component into the proposed multi-level loss function. The approach is quantitatively evaluated, and the results suggest that the synthetically generated training tool images enhance UNet tool segmentation performance. More specifically, on a random set of 100 cadaver and live endoscopic images from the University of Washington Sinus Dataset, the UNet trained with synthetically generated images using the presented method yielded improvements of 35.7% in mean Dice coefficient and 30.6% in Intersection over Union over training with purely real images. These results support using more widely available, routine screening endoscopy to preoperatively generate synthetic training tool images for intraoperative UNet tool segmentation.
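The multi-level loss described above pairs a style-preservation term with a content term over features at several levels. The sketch below is written in the spirit of Gram-matrix style losses; the layer count, weights, and names are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of a multi-level style-preservation + content loss; illustrative only.
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    # feat: (N, C, H, W) -> (N, C, C) normalized Gram matrix
    n, c, h, w = feat.shape
    f = feat.reshape(n, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def multi_level_loss(gen_feats, style_feats, content_feats,
                     w_style=1.0, w_content=1.0):
    # Style term matches Gram matrices; content term matches features directly.
    style = sum(F.mse_loss(gram_matrix(g), gram_matrix(s))
                for g, s in zip(gen_feats, style_feats))
    content = sum(F.mse_loss(g, c)
                  for g, c in zip(gen_feats, content_feats))
    return w_style * style + w_content * content

# Dummy feature maps at three levels; identical inputs give zero loss.
feats = [torch.randn(1, 8, 32, 32) for _ in range(3)]
print(multi_level_loss(feats, feats, feats))
```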
  2. Robot-assisted minimally invasive surgery combines the skills and techniques of highly trained surgeons with the robustness and precision of machines. Its advantages include precision beyond human dexterity alone, greater kinematic degrees of freedom at the surgical tool tip, and possibilities for remote surgical practice through teleoperation. Nevertheless, obtaining accurate force feedback during surgical operations remains a challenging hurdle. Though direct force sensing using tool-tip-mounted sensors is theoretically possible, it is not amenable to the required sterilization procedures. Vision-based force estimation from real-time analysis of tissue deformation serves as a promising alternative. In this application, as in much related research in robot-assisted minimally invasive surgery, segmentation of surgical instruments in endoscopic images is a prerequisite. Thus, a surgical tool segmentation algorithm robust to partial occlusion is proposed, using DFT shape matching of a robot-kinematics shape prior (u) fused with a log-likelihood mask (Q) in the opponent color space to generate the final mask (U). Implemented on the Raven II surgical robot system, it achieves real-time performance robust to tool tip orientation, at up to 6 fps without GPU acceleration.
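As a rough illustration of the color-likelihood half of this pipeline, the sketch below computes a per-pixel log-likelihood mask (Q) in the opponent color space under assumed Gaussian tool and background color models. The DFT shape matching against the kinematics prior (u) and the fusion into the final mask (U) are not reproduced here, and the model parameters are invented.

```python
# Per-pixel log-likelihood tool mask in opponent color space; illustrative only.
import numpy as np

def to_opponent(rgb):
    # rgb: (H, W, 3) floats in [0, 1] -> standard opponent channels O1, O2, O3
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    o1 = (r - g) / np.sqrt(2.0)
    o2 = (r + g - 2.0 * b) / np.sqrt(6.0)
    o3 = (r + g + b) / np.sqrt(3.0)
    return np.stack([o1, o2, o3], axis=-1)

def log_likelihood_mask(rgb, mu_tool, var_tool, mu_bg, var_bg):
    # Log-likelihood ratio of per-pixel Gaussian tool vs. background models;
    # positive values mark pixels where "tool" is the more likely class.
    opp = to_opponent(rgb)
    ll_tool = -0.5 * np.sum((opp - mu_tool) ** 2 / var_tool
                            + np.log(var_tool), axis=-1)
    ll_bg = -0.5 * np.sum((opp - mu_bg) ** 2 / var_bg
                          + np.log(var_bg), axis=-1)
    return ll_tool - ll_bg

Q = log_likelihood_mask(np.random.rand(48, 64, 3),        # dummy frame
                        mu_tool=np.array([0.0, 0.0, 1.2]),
                        var_tool=np.array([0.02, 0.02, 0.05]),
                        mu_bg=np.array([0.2, -0.1, 0.8]),
                        var_bg=np.array([0.05, 0.05, 0.10]))
```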
  3. Purpose: We propose a formal framework for the modeling and segmentation of minimally invasive surgical tasks using a unified set of motion primitives (MPs) to enable more objective labeling and the aggregation of different datasets. Methods: We model dry-lab surgical tasks as finite state machines, representing how the execution of MPs as the basic surgical actions results in the change of surgical context, which characterizes the physical interactions among tools and objects in the surgical environment. We develop methods for labeling surgical context based on video data and for automatic translation of context to MP labels. We then use our framework to create the COntext and Motion Primitive Aggregate Surgical Set (COMPASS), including six dry-lab surgical tasks from three publicly available datasets (JIGSAWS, DESK, and ROSMA), with kinematic and video data and context and MP labels. Results: Our context labeling method achieves near-perfect agreement between consensus labels from crowd-sourcing and expert surgeons. Segmentation of tasks into MPs results in the creation of the COMPASS dataset, which nearly triples the amount of data for modeling and analysis and enables the generation of separate transcripts for the left and right tools. Conclusion: The proposed framework results in high-quality labeling of surgical data based on context and fine-grained MPs. Modeling surgical tasks with MPs enables the aggregation of different datasets and the separate analysis of left and right hands for bimanual coordination assessment. Our formal framework and aggregate dataset can support the development of explainable and multi-granularity models for improved surgical process analysis, skill assessment, error detection, and autonomy.
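As a toy illustration of the context-to-MP translation step, the sketch below runs a finite state machine over a drastically simplified surgical context (one grasp flag per tool) and emits an MP label whenever the context changes. The context encoding and label spellings are illustrative assumptions, not the COMPASS definitions.

```python
# Toy FSM translating context transitions into motion primitive labels.
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    left_holding: bool   # left tool currently grasping an object
    right_holding: bool  # right tool currently grasping an object

def translate(prev, curr):
    # Emit one MP label per changed tool-object interaction.
    mps = []
    for side in ("left", "right"):
        before = getattr(prev, f"{side}_holding")
        after = getattr(curr, f"{side}_holding")
        if after and not before:
            mps.append(f"Grasp({side})")
        elif before and not after:
            mps.append(f"Release({side})")
    return mps

print(translate(Context(False, True), Context(True, True)))  # ['Grasp(left)']
```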
  4. Augmented Reality (AR) is increasingly used in medical applications for visualizing medical information. In this paper, we present an AR-assisted surgical guidance system that aims to improve the accuracy of catheter placement in ventriculostomy, a common neurosurgical procedure. We build upon previous work on neurosurgical AR, which has focused on enabling the surgeon to visualize a patient’s ventricular anatomy, to additionally integrate surgical tool tracking and contextual guidance. Specifically, using accurate tracking of optical markers via an external multi-camera OptiTrack system, we enable Microsoft HoloLens 2-based visualizations of ventricular anatomy, catheter placement, and how far the catheter tip is from its target. We describe the system we developed, present initial hologram registration results, and comment on the next steps that will prepare our system for clinical evaluations.
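One concrete piece of the contextual guidance, the tip-to-target distance readout, reduces to a rigid-body transform plus a Euclidean norm. Below is a minimal sketch assuming an externally tracked marker pose and a calibrated tip offset; the frame conventions, names, and numbers are illustrative assumptions.

```python
# Tip-to-target distance from a tracked marker pose; illustrative only.
import numpy as np

def tip_to_target(T_marker, tip_offset, target):
    # T_marker  : (4, 4) homogeneous pose of the catheter's optical marker
    #             in the tracker frame
    # tip_offset: (3,) catheter tip position in the marker's local frame
    # target    : (3,) planned target position in the tracker frame
    tip = T_marker[:3, :3] @ tip_offset + T_marker[:3, 3]
    return float(np.linalg.norm(tip - target))

T = np.eye(4)
T[:3, 3] = [10.0, 0.0, 50.0]                      # dummy marker pose (mm)
d = tip_to_target(T, np.array([0.0, 0.0, 35.0]),  # tip 35 mm past the marker
                  np.array([12.0, 1.0, 88.0]))    # dummy target position
print(f"{d:.1f} mm to target")
```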
  5. The successful implementation of vision-based navigation in agricultural fields hinges upon two critical components: 1) accurate identification of key elements within the scene, and 2) identification of lanes through the detection of boundary lines that separate the crops from the traversable ground. We propose Agronav, an end-to-end vision-based autonomous navigation framework, which outputs the centerline from the input image by sequentially processing it through semantic segmentation and semantic line detection models. We also present Agroscapes, a pixel-level annotated dataset collected across six different crops, captured from varying heights and angles. This ensures that a framework trained on Agroscapes generalizes across both ground and aerial robotic platforms. Code, models, and the dataset will be publicly released.
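Deriving a centerline from two detected boundary lines can be as simple as midpointing their parameters. The sketch below assumes each line is reported as an (x-offset at the bottom image row, angle) pair, which is an illustrative parameterization, not Agronav's actual output format.

```python
# Centerline from two boundary lines; illustrative parameterization only.
import numpy as np

def centerline(left_line, right_line):
    # Each line: (x position where it crosses the bottom image row [px],
    #             angle from the horizontal axis [rad]).
    # Midpointing both parameters gives a line the robot can steer along.
    x_l, th_l = left_line
    x_r, th_r = right_line
    return (0.5 * (x_l + x_r), 0.5 * (th_l + th_r))

x_c, th_c = centerline((120.0, np.deg2rad(80.0)), (520.0, np.deg2rad(100.0)))
print(x_c, np.rad2deg(th_c))  # 320.0 px, 90.0 deg -> straight up the lane
```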