Accurate semantic image segmentation from medical imaging can enable intelligent vision-based assistance in robot-assisted minimally invasive surgery. The human body and surgical procedures are highly dynamic. While machine vision is a promising approach, sufficiently large training image sets for robust performance are either costly or unavailable. This work examines three novel generative adversarial network (GAN) methods of providing usable synthetic tool images using only surgical background images and a few real tool images. The best of these three novel approaches generates realistic tool textures while preserving local background content by incorporating both a style preservation and a content loss component into the proposed multi-level loss function. The approach is quantitatively evaluated, and results suggest that the synthetically generated training tool images enhance UNet tool segmentation performance. More specifically, with a random set of 100 cadaver and live endoscopic images from the University of Washington Sinus Dataset, the UNet trained with synthetically generated images using the presented method resulted in 35.7% and 30.6% improvement over using purely real images in mean Dice coefficient and Intersection over Union scores, respectively. This study is a promising step towards the use of more widely available and routine screening endoscopy to preoperatively generate synthetic training tool images for intraoperative UNet tool segmentation.
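The abstract above reports results as mean Dice coefficient and Intersection over Union (IoU) scores. As a minimal sketch of how these two overlap metrics are commonly computed for binary segmentation masks, the following may help; the function name `dice_and_iou` and the smoothing term `eps` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-7):
    """Return (Dice, IoU) for two binary masks of the same shape.

    `eps` is an assumed smoothing term that avoids division by zero
    when both masks are empty; it is not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()

    dice = (2.0 * intersection + eps) / (total + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice_and_iou(predicted, reference))
```

A mean score over a test set would then be the average of the per-image values; whether the paper averages per image or over pooled pixels is not stated in the abstract.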
This content will become publicly available on October 1, 2025
Reducing annotating load: Active learning with synthetic images in surgical instrument segmentation
Accurate instrument segmentation in the endoscopic vision of minimally invasive surgery is challenging due to complex instruments and environments. Deep learning techniques have shown competitive performance in recent years. However, deep learning usually requires a large amount of labeled data to achieve accurate prediction, which poses a significant annotation workload. To alleviate this workload, we propose an active learning-based framework to generate synthetic images for efficient neural network training. In each active learning iteration, a small number of informative unlabeled images are first queried by active learning and manually labeled. Next, synthetic images are generated based on these selected images. The instruments and backgrounds are cropped out and randomly combined with blending and fusion near the boundary. The proposed method leverages the advantages of both active learning and synthetic images. The effectiveness of the proposed method is validated on two sinus surgery datasets and one intraabdominal surgery dataset. The results indicate a considerable performance improvement, especially when the size of the annotated dataset is small. All the code is open-sourced at: https://github.com/HaonanPeng/active_syn_generator
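The synthesis step described above (cropped instruments randomly combined with backgrounds and blended near the boundary) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation, which lives in the repository linked above: the function names `feather_mask`, `composite`, and `random_synthetic` are hypothetical, and the box-filter feathering stands in for whatever blending and fusion the paper actually uses.

```python
import numpy as np


def feather_mask(mask, width=3):
    """Soften a binary mask's edge by repeated 3x3 box averaging.

    Assumed stand-in for the paper's boundary blending.
    """
    soft = np.asarray(mask, dtype=np.float64)
    h, w = soft.shape
    for _ in range(width):
        padded = np.pad(soft, 1, mode="edge")
        # Average each pixel with its 8 neighbours.
        soft = sum(
            padded[dy:dy + h, dx:dx + w]
            for dy in range(3)
            for dx in range(3)
        ) / 9.0
    return soft


def composite(background, instrument, mask, width=3):
    """Paste an instrument onto a background with a blended boundary.

    background, instrument: float arrays of shape (H, W, 3).
    mask: binary array of shape (H, W) marking instrument pixels.
    Returns the synthetic image and its segmentation label.
    """
    alpha = feather_mask(mask, width)[..., None]
    image = alpha * instrument + (1.0 - alpha) * background
    label = np.asarray(mask).astype(np.uint8)
    return image, label


def random_synthetic(backgrounds, instruments, rng, width=3):
    """Randomly pair one background with one (instrument, mask) crop."""
    bg = backgrounds[rng.integers(len(backgrounds))]
    inst, mask = instruments[rng.integers(len(instruments))]
    return composite(bg, inst, mask, width)
```

Because the label is taken directly from the pasted mask, each synthetic image comes with a segmentation label at no extra annotation cost, which is the property the framework relies on.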
- Award ID(s): 2101107
- PAR ID: 10552163
- Publisher / Repository: Elsevier
- Date Published:
- Journal Name: Medical Image Analysis
- Edition / Version: 97
- Volume: 97
- Issue: C
- ISSN: 1361-8415
- Page Range / eLocation ID: 103246
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Traumatic brain injury (TBI) is a massive public health problem worldwide. Accurate and fast automatic brain hematoma segmentation is important for TBI diagnosis, treatment and outcome prediction. In this study, we developed a fully automated system to detect and segment hematoma regions in head Computed Tomography (CT) images of patients with acute TBI. We first over-segmented brain images into superpixels and then extracted statistical and textural features to capture characteristics of superpixels. To overcome the shortage of annotated data, an uncertainty-based active learning strategy was designed to adaptively and iteratively select the most informative unlabeled data to be annotated for training a Support Vector Machine (SVM) classifier. Finally, the coarse segmentation from the SVM classifier was incorporated into an active contour model to improve the accuracy of the segmentation. In our experiments, the proposed active learning strategy achieved a comparable result with 5 times less labeled data than regular machine learning. Our proposed automatic hematoma segmentation system achieved an average Dice coefficient of 0.60 on our dataset, where patients are from multiple health centers and at multiple levels of injury. Our results show that the proposed method can effectively overcome the challenge of a limited and highly varied dataset.
-
The rolling bearing is a critical component of machinery that has been widely applied in the manufacturing, transportation, aerospace, and power and energy industries. Timely and accurate bearing fault detection is thus of vital importance. Computational data-driven deep learning has recently become a prevailing approach for bearing fault detection. Despite the progress of the deep learning approach, its performance hinges on the size of the labeled data set, the acquisition of which is expensive in actual implementation. Unlabeled data, on the other hand, are inexpensive. In this research, we develop a new semi-supervised learning method built upon the autoencoder to fully utilize a large amount of unlabeled data together with limited labeled data to enhance fault detection performance. Compared with the state-of-the-art semi-supervised learning methods, this proposed method can be more conveniently implemented with fewer hyperparameters to be tuned. In this method, a joint loss is established to account for the effects of labeled and unlabeled data, which is subsequently used to direct the backpropagation training. Systematic case studies using the Case Western Reserve University (CWRU) rolling bearing dataset are carried out, in which the effectiveness of this new method is verified by comparing it with other well-established baseline methods. Specifically, nearly all emulation runs using the proposed methodology yield an accuracy increase of around 2%–5%, indicating its robustness in performance enhancement.
-
Novel machine learning algorithms that make the best use of significantly less data are of great interest. For example, active learning (AL) aims at addressing this problem by iteratively training a model using a small number of labeled data, testing the whole data on the trained model, and then querying the labels of some selected data, which then are used for training a new model. This paper presents a fast and accurate data selection method, in which the selected samples are optimized to span the subspace of all data. We propose a new selection algorithm, referred to as iterative projection and matching (IPM), with linear complexity w.r.t. the number of data, and without any parameters to be tuned. In our algorithm, at each iteration, the maximum information from the structure of the data is captured by one selected sample, and the captured information is neglected in the next iterations by projection on the null-space of previously selected samples. The computational efficiency and the selection accuracy of our proposed algorithm outperform those of the conventional methods. Furthermore, the superiority of the proposed algorithm is shown on active learning for video action recognition on the UCF-101 dataset; learning using representatives on ImageNet; training a generative adversarial network (GAN) to generate multi-view images from a single-view input on the CMU Multi-PIE dataset; and video summarization on the UTE Egocentric dataset.
-
Puyol Anton, E; Pop, M; Sermesant, M; Campello, V; Lalande, A; Lekadir, K; Suinesiaputra, A; Camara, O; Young, A (Ed.) Cardiac cine magnetic resonance imaging (CMRI) is the reference standard for assessing cardiac structure as well as function. However, CMRI data presents large variations among different centers, vendors, and patients with various cardiovascular diseases. Since typical deep-learning-based segmentation methods are usually trained using a limited number of ground truth annotations, they may not generalize well to unseen MR images, due to the variations between the training and testing data. In this study, we proposed an approach towards building a generalizable deep-learning-based model for cardiac structure segmentation from multi-vendor, multi-center and multi-disease CMRI data. We used a novel combination of image augmentation and a consistency loss function to improve model robustness to typical variations in CMRI data. The proposed image augmentation strategy leverages unlabeled data by a) using CycleGAN to generate images in different styles and b) exchanging the low-frequency features of images from different vendors. Our model architecture was based on an attention-gated U-Net model that learns to focus on cardiac structures of varying shapes and sizes while suppressing irrelevant regions. The proposed augmentation and consistency training method demonstrated improved performance on CMRI images from new vendors and centers. When evaluated using CMRI data from 4 vendors and 6 clinical centers, our method was generally able to produce accurate segmentations of cardiac structures.