Title: Local Style Preservation in Improved GAN-Driven Synthetic Image Generation for Endoscopic Tool Segmentation
Accurate semantic image segmentation from medical imaging can enable intelligent vision-based assistance in robot-assisted minimally invasive surgery. The human body and surgical procedures are highly dynamic. While machine vision presents a promising approach, sufficiently large training image sets for robust performance are either costly or unavailable. This work examines three novel generative adversarial network (GAN) methods of providing usable synthetic tool images using only surgical background images and a few real tool images. The best of these three novel approaches generates realistic tool textures while preserving local background content by incorporating both a style preservation and a content loss component into the proposed multi-level loss function. The approach is quantitatively evaluated, and results suggest that the synthetically generated training tool images enhance UNet tool segmentation performance. More specifically, with a random set of 100 cadaver and live endoscopic images from the University of Washington Sinus Dataset, the UNet trained with synthetically generated images using the presented method resulted in 35.7% and 30.6% improvement over using purely real images in mean Dice coefficient and Intersection over Union scores, respectively. This study is a promising step towards the use of more widely available and routine screening endoscopy to preoperatively generate synthetic training tool images for intraoperative UNet tool segmentation.
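The two evaluation metrics named in the abstract, Dice coefficient and Intersection over Union (IoU), can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `dice_and_iou`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two same-shaped binary masks given as nested lists.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Flatten both masks to lists of 0/1 values.
    p = [1 if v else 0 for row in pred for v in row]
    t = [1 if v else 0 for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(a & b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    union = total - intersection
    dice = 2.0 * intersection / total   # 2|A∩B| / (|A| + |B|)
    iou = intersection / union          # |A∩B| / |A∪B|
    return dice, iou


# Toy 2x3 masks: 3 predicted tool pixels, 3 true tool pixels, 2 shared.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
dice, iou = dice_and_iou(pred_mask, true_mask)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For a single pair of masks the two scores are related by Dice = 2·IoU / (1 + IoU), so they rank that prediction the same way; averaged over many images, as in the mean scores reported above, they can differ in how strongly they penalize poor cases.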
Award ID(s):
2101107
NSF-PAR ID:
10326933
Journal Name:
Sensors
Volume:
21
Issue:
15
ISSN:
1424-8220
Page Range / eLocation ID:
5163
Sponsoring Org:
National Science Foundation
More Like this
  1. Global warming is one of the world’s most pressing issues. The study of its effects on the polar ice caps and other arctic environments, however, can be hindered by the often dangerous and difficult-to-navigate terrain found there. Multi-terrain autonomous vehicles can assist researchers by providing a mobile platform on which to collect data in these harsh environments while avoiding any risk to human life and speeding up the research process. The mechanical design and ultimate efficacy of these autonomous robotic vehicles depend largely on the specific missions they are deployed for, but terrain conditions can vary wildly geographically as well as seasonally, making mission planning for these unmanned vehicles more difficult. This paper proposes the use of various UNet-based neural network architectures to generate digital elevation maps from satellite images, and explores and compares their efficacy on a single set of training and validation datasets generated from satellite imagery. These digital elevation maps generated by the model could be used by researchers not only to track the change in arctic topography over time, but also to quickly provide autonomous exploratory research rovers with the topographical information necessary to decide on optimal paths during the mission. This paper analyzes different model architectures and training schemes: a traditional UNet, a traditional UNet with data augmentation, a UNet with a single active skip-layer vision transformer (ViT), and a UNet with multiple active skip-layer ViTs. Each model was trained on a dataset of satellite images and corresponding digital elevation maps of Ellesmere Island, Canada. Utilizing ViTs did not demonstrate a significant improvement in UNet performance, though this could change with longer training.
This paper proposes opportunities to improve performance for these neural networks, as well as next steps for further research, including improving the diversity of images in the dataset, generating a testing dataset from a completely different geographic location, and allowing the models more time to train. 
  2. Robot-assisted minimally invasive surgery combines the skills and techniques of highly trained surgeons with the robustness and precision of machines. Several advantages include precision beyond human dexterity alone, greater kinematic degrees of freedom at the surgical tool tip, and possibilities in remote surgical practices through teleoperation. Nevertheless, obtaining accurate force feedback during surgical operations remains a challenging hurdle. Though direct force sensing using tool-tip-mounted sensors is theoretically possible, it is not amenable to required sterilization procedures. Vision-based force estimation according to real-time analysis of tissue deformation serves as a promising alternative. In this application, along with numerous related research efforts in robot-assisted minimally invasive surgery, segmentation of surgical instruments in endoscopic images is a prerequisite. Thus, a surgical tool segmentation algorithm robust to partial occlusion is proposed using DFT shape matching of robot kinematics shape prior (u) fused with log likelihood mask (Q) in the Opponent color space to generate final mask (U). Implemented on the Raven II surgical robot system, the algorithm achieves real-time performance of up to 6 fps without GPU acceleration and is robust to tool tip orientation.
  3. Annotating medical images for the purposes of training computer vision models is an extremely laborious task that takes time and resources away from expert clinicians. Active learning (AL) is a machine learning paradigm that mitigates this problem by deliberately proposing data points that should be labeled in order to maximize model performance. We propose a novel AL algorithm for segmentation, ALGES, that utilizes gradient embeddings to effectively select laparoscopic images to be labeled by some external oracle while reducing annotation effort. Given any unlabeled image, our algorithm treats predicted segmentations as truth and computes gradients with respect to the model parameters of the last layer in a segmentation network. The norms of these per-pixel gradient vectors correspond to the magnitude of the induced change in model parameters and contain rich information about the model’s predictive uncertainty. Our algorithm then computes gradient embeddings in two ways, and we employ a center-finding algorithm with these embeddings to procure representative and diverse batches in each round of AL. An advantage of our approach is extensibility to any model architecture and differentiable loss scheme for semantic segmentation. We apply our approach to a public data set of laparoscopic cholecystectomy images and show that it outperforms current AL algorithms in selecting the most informative data points for improving the segmentation model. Our code is available at https://github.com/josaklil-ai/surg-active-learning.
  4. A recent discovery in neuroscience prompts the need for innovation in image analysis. Neuroscientists have discovered the existence of meningeal lymphatic vessels in the brain and have shown their importance in preventing cognitive decline in mouse models of Alzheimer’s disease. With age, lymphatic vessels narrow and poorly drain cerebrospinal fluid, leading to plaque accumulation, a marker for Alzheimer’s disease. In current practice, vessel boundaries and widths are detected by hand, a process that suffers from high error rates and potential observer bias. The existing vessel segmentation methods are dependent on user-defined initialization, which is time-consuming and difficult to achieve in practice due to high amounts of background clutter and noise. This work proposes a level set segmentation method featuring hierarchical matting, LyMPhi, to predetermine foreground and background regions. The level set force field is modulated by the foreground information computed by matting, while also constraining the segmentation contour to be smooth. Segmentation output from this method has a higher overall Dice coefficient and boundary F1-score compared to that of competing algorithms. The algorithms are tested on real and synthetic data generated by our novel shape deformation based approach. LyMPhi is also shown to be more stable under different initial conditions as compared to existing level set segmentation methods. Finally, statistical analysis on manual segmentation is performed to demonstrate the variation and disagreement between three annotators.
  5. This paper presents an approach to enhanced endoscopic tool segmentation combining separate pathways utilizing input images in two different coordinate representations. The proposed method examines U-Net convolutional neural networks with input endoscopic images represented via (1) the original rectangular coordinate format alongside (2) a morphological polar coordinate transformation. To maximize information and the breadth of the endoscope frustum, imaging sensors are oftentimes larger than the image circle. This results in unused border regions. Ideally, the region of interest is proximal to the image center. The above two observations formed the basis for the morphological polar transformation pathway as an augmentation to typical rectangular input image representations. Results indicate that neither of the two investigated coordinate representations consistently yielded better segmentation performance as compared to the other. Improved segmentation can be achieved with a hybrid approach that carefully selects which of the two pathways to use for individual input images. Towards that end, two binary classifiers were trained to identify, given an input endoscopic image, which of the two coordinate representation segmentation pathways (rectangular or polar) would result in better segmentation performance. Results are promising and suggest marked improvements using a hybrid pathway selection approach compared to either alone. The experiment used to evaluate the proposed hybrid method utilized a dataset consisting of 8360 endoscopic images from real surgery and evaluated segmentation performance with Dice coefficient and Intersection over Union. The results suggest that on-the-fly polar transformation for tool segmentation is useful when paired with the proposed hybrid tool-segmentation approach.
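The polar coordinate representation mentioned in item 5 can be illustrated with a generic sketch. This is a plain nearest-neighbor polar resampling about the image center, not the "morphological polar coordinate transformation" of that paper, whose details are not given in the abstract. The function name `to_polar`, its parameters, and the choice to limit the radius to the largest circle that fits inside the image are assumptions made for illustration.

```python
import math


def to_polar(image, n_radii, n_angles):
    """Resample a 2-D nested-list image onto a (radius, angle) grid.

    Hypothetical helper: output row i holds samples at radius r_i and
    output column j holds samples at angle theta_j, measured about the
    image center with nearest-neighbor lookup.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    # Largest radius that stays inside the image (the inscribed circle),
    # so border regions outside that circle are never sampled.
    max_r = min(cx, cy)

    polar = []
    for i in range(n_radii):
        r = max_r * i / (n_radii - 1) if n_radii > 1 else 0.0
        row = []
        for j in range(n_angles):
            theta = 2.0 * math.pi * j / n_angles
            y = int(round(cy + r * math.sin(theta)))
            x = int(round(cx + r * math.cos(theta)))
            row.append(image[y][x])
        polar.append(row)
    return polar


# 5x5 test image whose pixel value encodes its position as 10*row + col.
image = [[10 * y + x for x in range(5)] for y in range(5)]
polar = to_polar(image, n_radii=3, n_angles=4)
for row in polar:
    print(row)
```

In this layout, pixels near the image center occupy the first rows of the output and each ring of the original image becomes one row, which is one way a region of interest near the center can be given more of the network's input area than the unused corners.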