This paper presents a tool-pose-informed variable center morphological polar transform to enhance segmentation of endoscopic images. The representation, while not lossless, transforms rigid tool shapes into consistently more rectangular morphologies that may be more amenable to image segmentation networks. The proposed method was evaluated using the U-Net convolutional neural network, with input endoscopic images represented in one of four coordinate formats: (1) the original rectangular image representation; (2) the morphological polar coordinate transform; (3) the proposed variable center transform about the tool-tip pixel; and (4) the proposed variable center transform about the tool vanishing point pixel. Previous work relied on the observations that endoscopic images typically exhibit unused border regions with content in the shape of a circle (since the image sensor is designed to be larger than the image circle to maximize available visual information in the constrained environment) and that the region of interest (ROI) is ideally located near the endoscopic image center. That work sought an intelligent method for selecting, given an input image, between representations (1) and (2) to obtain the best image segmentation prediction. In this extension, the image-center reference constraint on the polar transformation in method (2) is relaxed through the development of a variable center morphological transformation. The choice of transform center leads to different spatial distributions of image loss, and the transform-center location can be informed by the robot kinematic model and endoscopic image data. In particular, this work examines the tool-tip and tool vanishing point on the image plane as candidate centers. Experiments were conducted for each of the four image representations using a data set of 8360 endoscopic images from real sinus surgery. Segmentation performance was evaluated with standard metrics, and insights about the effects of image loss and tool location on performance are provided. Overall, the results are promising, showing that selecting a transform center based on tool shape features using the proposed method can improve segmentation performance.
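As a concrete illustration of the core resampling step, the sketch below maps an image into polar coordinates about an arbitrary center using OpenCV's warpPolar; the center coordinates, output size, and radius choice are illustrative assumptions, not the paper's exact settings.

```python
import cv2
import numpy as np

def variable_center_polar(image: np.ndarray, center: tuple,
                          out_size: tuple = (512, 512)) -> np.ndarray:
    """Polar-resample `image` about an arbitrary center pixel.

    `center` would be, e.g., the tool-tip or tool-vanishing-point pixel
    obtained upstream from the robot kinematic model projected onto the
    image plane (assumed available); `out_size` is an arbitrary choice.
    """
    h, w = image.shape[:2]
    # Use the farthest image corner as the radius so that every ray from
    # the chosen center reaches the frame boundary.
    corners = np.array([[0, 0], [w, 0], [0, h], [w, h]], dtype=np.float32)
    max_radius = float(np.max(np.linalg.norm(
        corners - np.array(center, dtype=np.float32), axis=1)))
    return cv2.warpPolar(image, out_size, center, max_radius,
                         cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR)

# Example with a hypothetical tool-tip estimate:
# frame = cv2.imread("endoscope_frame.png")
# polar = variable_center_polar(frame, center=(410.0, 267.0))
```

With the center placed at a tool feature such as the vanishing point, the roughly radial tool silhouette unrolls toward the more rectangular morphology described above, although, as noted, the resampling is not lossless.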
Local Style Preservation in Improved GAN-Driven Synthetic Image Generation for Endoscopic Tool Segmentation
Accurate semantic image segmentation from medical imaging can enable intelligent vision-based assistance in robot-assisted minimally invasive surgery. The human body and surgical procedures are highly dynamic, and while machine vision presents a promising approach, sufficiently large training image sets for robust performance are either costly or unavailable. This work examines three novel generative adversarial network (GAN) methods of providing usable synthetic tool images using only surgical background images and a few real tool images. The best of these three novel approaches generates realistic tool textures while preserving local background content by incorporating both a style preservation and a content loss component into the proposed multi-level loss function. The approach is quantitatively evaluated, and the results suggest that the synthetically generated training tool images enhance U-Net tool segmentation performance. More specifically, on a random set of 100 cadaver and live endoscopic images from the University of Washington Sinus Dataset, the U-Net trained with synthetically generated images using the presented method achieved 35.7% and 30.6% improvements over training with purely real images in mean Dice coefficient and Intersection over Union scores, respectively. This study is a promising step toward using more widely available, routine screening endoscopy to preoperatively generate synthetic training tool images for intraoperative U-Net tool segmentation.
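The abstract names two ingredients of the multi-level loss: a style preservation term and a content loss term. One common way to realize such terms, sketched below under that assumption rather than as the paper's exact formulation, is a Gram-matrix style loss against real tool image features combined with a feature-reconstruction content loss against the background image, summed over several encoder levels.

```python
import torch
import torch.nn.functional as F

def gram(feat: torch.Tensor) -> torch.Tensor:
    """Gram matrix of a (B, C, H, W) feature map, normalized by size."""
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_content_loss(feats_fake, feats_real, feats_bg,
                       w_style: float = 1.0, w_content: float = 1.0):
    """Multi-level style + content loss over lists of encoder feature maps.

    feats_fake: features of the generated tool-on-background image
    feats_real: features of a real tool image (style target)
    feats_bg:   features of the input background image (content target)
    """
    # Style: match second-order feature statistics of real tool imagery.
    style = sum(F.mse_loss(gram(ff), gram(fr))
                for ff, fr in zip(feats_fake, feats_real))
    # Content: keep local background structure close to the input.
    content = sum(F.mse_loss(ff, fb)
                  for ff, fb in zip(feats_fake, feats_bg))
    return w_style * style + w_content * content
```

Summing over multiple feature levels lets the style term act on texture statistics while the content term anchors local background layout, matching the local-background-preservation goal stated above.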
- Award ID(s): 2101107
- NSF-PAR ID: 10326933
- Date Published: 2021
- Journal Name: Sensors
- Volume: 21
- Issue: 15
- ISSN: 1424-8220
- Page Range / eLocation ID: 5163
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Robot-assisted minimally invasive surgery combines the skills and techniques of highly trained surgeons with the robustness and precision of machines. Its advantages include precision beyond human dexterity alone, greater kinematic degrees of freedom at the surgical tool tip, and possibilities for remote surgical practice through teleoperation. Nevertheless, obtaining accurate force feedback during surgical operations remains a challenging hurdle. Though direct force sensing using tool-tip-mounted sensors is theoretically possible, it is not amenable to the required sterilization procedures. Vision-based force estimation from real-time analysis of tissue deformation serves as a promising alternative. In this application, as in much related research in robot-assisted minimally invasive surgery, segmentation of surgical instruments in endoscopic images is a prerequisite. Thus, a surgical tool segmentation algorithm robust to partial occlusion is proposed, using DFT shape matching of a robot-kinematics shape prior (u) fused with a log-likelihood mask (Q) in the opponent color space to generate the final mask (U). Implemented on the Raven II surgical robot system, it achieves real-time performance robust to tool-tip orientation at up to 6 fps without GPU acceleration.
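The opponent color conversion and the log-likelihood mask Q mentioned in this entry can be sketched as below; the histogram inputs are hypothetical (assumed estimated offline from labeled tool and background pixels), and the DFT shape matching of the kinematic prior u and the fusion into the final mask U are omitted for brevity.

```python
import numpy as np

def to_opponent(rgb: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 float RGB image to the opponent color space."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    o1 = (r - g) / np.sqrt(2.0)
    o2 = (r + g - 2.0 * b) / np.sqrt(6.0)
    o3 = (r + g + b) / np.sqrt(3.0)
    return np.stack([o1, o2, o3], axis=-1)

def log_likelihood_mask(opp: np.ndarray, tool_hist: np.ndarray,
                        bg_hist: np.ndarray, bins: int,
                        eps: float = 1e-9) -> np.ndarray:
    """Per-pixel log-likelihood ratio Q = log p(x|tool) - log p(x|bg).

    `tool_hist` and `bg_hist` are normalized (bins, bins, bins) histograms
    over quantized opponent values, assumed learned offline from labeled
    pixels; `opp` is assumed rescaled to [0, 1) per channel beforehand.
    """
    idx = np.clip((opp * bins).astype(int), 0, bins - 1)
    p_tool = tool_hist[idx[..., 0], idx[..., 1], idx[..., 2]]
    p_bg = bg_hist[idx[..., 0], idx[..., 1], idx[..., 2]]
    return np.log(p_tool + eps) - np.log(p_bg + eps)
```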
- Wearable Cognitive Assistance (WCA) applications use computer vision models that require thousands of labeled training images. Capturing and labeling these images requires a substantial amount of work. By using synthetically generated images for training, we avoided this labor-intensive step. The performance of these models was comparable to that of models trained on real images.
- Monitoring surgical instruments is an essential task in computer-assisted interventions and surgical robotics. It is also important for navigation, data analysis, skill assessment, and surgical workflow analysis in conventional surgery. However, there are no standard datasets and benchmarks for tool identification in neurosurgery. To this end, we are releasing a novel neurosurgical instrument segmentation dataset called NeuroID for advancing research in the field. Delineating surgical tools from the background requires accurate pixel-wise instrument segmentation. In this paper, we present a comparison between three encoder-decoder approaches to binary segmentation of neurosurgical instruments, where we classify each pixel in the image as either tool or background. A baseline performance was obtained by using heuristics to combine extracted features. We also extend the analysis to a publicly available robotic instrument segmentation dataset and include its results. The source code for our methods and the neurosurgical instrument dataset will be made publicly available to facilitate reproducibility.
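Binary tool/background segmentation like that compared in this entry is typically scored per pixel; below is a minimal sketch of the standard Dice coefficient and Intersection over Union, the metrics also cited in the GAN study above.

```python
import numpy as np

def dice_and_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """Dice coefficient and IoU between two binary (tool vs. background) masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return float(dice), float(iou)
```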
- Significant resources have been spent collecting and storing large and heterogeneous radar datasets during expensive Arctic and Antarctic fieldwork. The vast majority of the available data is unlabeled, and the labeling process is both time-consuming and expensive. One possible alternative to the labeling process is the use of synthetically generated data with artificial intelligence. Instead of labeling real images, we can generate synthetic data based on arbitrary labels; in this way, training data can be quickly augmented with additional images. In this research, we evaluated the performance of synthetic radar images generated by modified cycle-consistent adversarial networks. We conducted several experiments to test the quality of the generated radar imagery. We also tested the quality of a state-of-the-art contour detection algorithm on synthetic data and on different combinations of real and synthetic data. Our experiments show that synthetic radar images generated by a generative adversarial network (GAN) can be used in combination with real images for data augmentation and training of deep neural networks. However, the synthetic images generated by GANs cannot be used alone for training a neural network (training on synthetic and testing on real), as they cannot simulate all radar characteristics, such as noise or Doppler effects. To the best of our knowledge, this is the first work to create radar sounder imagery based on generative adversarial networks.
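The take-away that synthetic images help only alongside real ones suggests a simple augmentation recipe: concatenate the two pools into one training set rather than substituting for real data. A minimal PyTorch sketch with placeholder tensors standing in for the two image pools:

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Placeholder tensors standing in for real and GAN-generated training
# pairs (image, label); shapes and counts are illustrative only.
real_ds = TensorDataset(torch.randn(800, 3, 256, 256),
                        torch.zeros(800, 1, 256, 256))
synth_ds = TensorDataset(torch.randn(1600, 3, 256, 256),
                         torch.zeros(1600, 1, 256, 256))

# Augment rather than replace: keep real data in the mix, since training
# on synthetic data alone did not transfer to real test images above.
train_loader = DataLoader(ConcatDataset([real_ds, synth_ds]),
                          batch_size=16, shuffle=True)
```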