Title: JOINT SEGMENTATION AND FINE-GRAINED CLASSIFICATION OF NUCLEI IN HISTOPATHOLOGY IMAGES
Nuclei segmentation and classification are two important tasks in histopathology image analysis, because the morphological features of nuclei and the spatial distributions of different types of nuclei are highly related to cancer diagnosis and prognosis. Existing methods handle the two problems independently and therefore cannot capture the features and the spatial heterogeneity of different types of nuclei at the same time. In this paper, we propose a novel deep learning based method which solves both tasks in a unified framework. It can segment individual nuclei and classify them into tumor, lymphocyte and stroma nuclei. Perceptual loss is utilized to enhance the segmentation of details. We also take advantage of transfer learning to promote the training of deep neural networks on a relatively small lung cancer dataset. Experimental results demonstrate the effectiveness of the proposed method. The code is publicly available.
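The abstract says perceptual loss is used to sharpen segmentation details but does not specify how. As a rough illustration only, the sketch below compares two images in a feature space rather than pixel by pixel. Three hand-crafted 3x3 filters stand in for the pretrained network that a perceptual loss would normally use. The filter bank and the names `extract_features` and `perceptual_loss` are assumptions made for this example, not the authors' implementation.

```python
# Hypothetical stand-in for a pretrained feature extractor: three fixed 3x3 filters.
FILTERS = [
    [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],    # horizontal gradient (Sobel x)
    [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],    # vertical gradient (Sobel y)
    [[0, 1, 0], [1, -4, 1], [0, 1, 0]],      # Laplacian
]


def extract_features(image):
    """Return ReLU-activated filter responses as a flat list.

    `image` is a 2D list of numbers with at least 3 rows and 3 columns.
    """
    height, width = len(image), len(image[0])
    features = []
    for kernel in FILTERS:
        for r in range(height - 2):
            for c in range(width - 2):
                response = sum(
                    kernel[i][j] * image[r + i][c + j]
                    for i in range(3)
                    for j in range(3)
                )
                features.append(max(response, 0.0))  # ReLU
    return features


def perceptual_loss(pred, target):
    """Mean squared difference between the feature responses of two images."""
    f_pred = extract_features(pred)
    f_target = extract_features(target)
    return sum((a - b) ** 2 for a, b in zip(f_pred, f_target)) / len(f_pred)


if __name__ == "__main__":
    # A small square mask and the same mask shifted one pixel to the right.
    mask = [[1 if 2 <= r <= 3 and 2 <= c <= 3 else 0 for c in range(6)] for r in range(6)]
    shifted = [[1 if 2 <= r <= 3 and 3 <= c <= 4 else 0 for c in range(6)] for r in range(6)]
    print(perceptual_loss(mask, mask), perceptual_loss(mask, shifted))
```

Because the filters respond to edges, a prediction whose boundaries are misplaced is penalized through its feature responses. This is the general intuition behind using a feature-space loss for fine detail.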
Award ID(s):
1747778
NSF-PAR ID:
10105310
Journal Name:
2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Advances in visual perceptual tasks have been mainly driven by the amount, and types, of annotations of large-scale datasets. Researchers have focused on fully-supervised settings to train models using offline epoch-based schemes. Despite the evident advancements, limitations and cost of manually annotated datasets have hindered further development for event perceptual tasks, such as detection and localization of objects and events in videos. The problem is more apparent in zoological applications due to the scarcity of annotations and the length of videos; most videos are at most ten minutes long. Inspired by cognitive theories, we present a self-supervised perceptual prediction framework to tackle the problem of temporal event segmentation by building a stable representation of event-related objects. The approach is simple but effective. We rely on LSTM predictions of high-level features computed by a standard deep learning backbone. For spatial segmentation, the stable representation of the object is used by an attention mechanism to filter the input features before the prediction step. The self-learned attention maps effectively localize the object as a side effect of perceptual prediction. We demonstrate our approach on long videos from continuous wildlife video monitoring, spanning multiple days at 25 FPS. We aim to facilitate automated ethogramming by detecting and localizing events without the need for labels. Our approach is trained in an online manner on streaming input and requires only a single pass through the video, with no separate training set. Given the lack of long, realistic datasets (ones that include real-world challenges), we introduce a new wildlife video dataset, nest monitoring of the Kagu (a flightless bird from New Caledonia), to benchmark our approach. Our dataset features a video from 10 days (over 23 million frames) of continuous monitoring of the Kagu in its natural habitat. We annotate every frame with bounding boxes and event labels. Additionally, each frame is annotated with time-of-day and illumination conditions. We will make the dataset, which is the first of its kind, and the code available to the research community. We find that the approach significantly outperforms other self-supervised, traditional (e.g., Optical Flow, Background Subtraction) and NN-based (e.g., PA-DPC, DINO, iBOT) baselines and performs on par with supervised boundary detection approaches (i.e., PC). At a recall rate of 80%, our best performing model detects one false positive activity every 50 min of training. On average, we at least double the performance of self-supervised approaches for spatial segmentation. Additionally, we show that our approach is robust to various environmental conditions (e.g., moving shadows). We also benchmark the framework on other datasets (i.e., Kinetics-GEBD, TAPOS) from different domains to demonstrate its generalizability. The data and code are available on our project page: https://aix.eng.usf.edu/research_automated_ethogramming.html

     
  2. Abstract

    Prostate cancer treatment decisions rely heavily on subjective visual interpretation [assigning Gleason patterns or International Society of Urological Pathology (ISUP) grade groups] of limited numbers of two-dimensional (2D) histology sections. Under this paradigm, interobserver variance is high, with ISUP grades not correlating well with outcome for individual patients, and this contributes to the over- and undertreatment of patients. Recent studies have demonstrated improved prognostication of prostate cancer outcomes based on computational analyses of glands and nuclei within 2D whole slide images. Our group has also shown that the computational analysis of three-dimensional (3D) glandular features, extracted from 3D pathology datasets of whole intact biopsies, can allow for improved recurrence prediction compared to corresponding 2D features. Here we seek to expand on these prior studies by exploring the prognostic value of 3D shape-based nuclear features in prostate cancer (e.g. nuclear size, sphericity). 3D pathology datasets were generated using open-top light-sheet (OTLS) microscopy of 102 cancer-containing biopsies extracted ex vivo from the prostatectomy specimens of 46 patients. A deep learning-based workflow was developed for 3D nuclear segmentation within the glandular epithelium versus stromal regions of the biopsies. 3D shape-based nuclear features were extracted, and a nested cross-validation scheme was used to train a supervised machine classifier based on 5-year biochemical recurrence (BCR) outcomes. Nuclear features of the glandular epithelium were found to be more prognostic than stromal cell nuclear features (area under the ROC curve [AUC] = 0.72 versus 0.63). 3D shape-based nuclear features of the glandular epithelium were also more strongly associated with the risk of BCR than analogous 2D features (AUC = 0.72 versus 0.62). The results of this preliminary investigation suggest that 3D shape-based nuclear features are associated with prostate cancer aggressiveness and could be of value for the development of decision-support tools. © 2023 The Pathological Society of Great Britain and Ireland.

     
  3. Nuclei segmentation is a fundamental task in histopathological image analysis. Typically, such segmentation tasks require significant effort to manually generate pixel-wise annotations for fully supervised training. To alleviate the manual effort, in this paper we propose a novel approach using points-only annotation. Two types of coarse labels with complementary information are derived from the points annotation, and are then utilized to train a deep neural network. The fully-connected conditional random field loss is utilized to further refine the model without introducing extra computational complexity during inference. Experimental results on two nuclei segmentation datasets reveal that the proposed method is able to achieve competitive performance compared to the fully supervised counterpart and the state-of-the-art methods while requiring significantly less annotation effort. Our code is publicly available.
  4. Abstract

    Introduction

    Remote military operations require rapid response times for effective relief and critical care. Yet the military theater operates under austere conditions, so communication links are unreliable and subject to physical and virtual attacks and degradation at unpredictable times. Immediate medical care at these austere locations requires semi-autonomous teleoperated systems, which enable the completion of medical procedures even under interrupted networks while isolating the medics from the dangers of the battlefield. However, to achieve autonomy for complex surgical and critical care procedures, robots require extensive programming or massive libraries of surgical skill demonstrations to learn effective policies using machine learning algorithms. Although such datasets are achievable for simple tasks, providing a large number of demonstrations for surgical maneuvers is not practical. This article presents a method for learning from demonstration, combining knowledge from demonstrations to eliminate reward shaping in reinforcement learning (RL). In addition to reducing the data required for training, the self-supervised nature of RL, in conjunction with expert knowledge-driven rewards, produces more generalizable policies tolerant to dynamic environment changes. A multimodal representation for interaction enables learning complex contact-rich surgical maneuvers. The effectiveness of the approach is shown using the cricothyroidotomy task, as it is a standard procedure seen in critical care to open the airway. We also provide a method for segmenting the teleoperator’s demonstration into subtasks and classifying the subtasks using sequence modeling.

    Materials and Methods

    A database of demonstrations for the cricothyroidotomy task was collected, comprising six fundamental maneuvers referred to as surgemes. The dataset was collected by teleoperating a collaborative robotic platform, SuperBaxter, with modified surgical grippers. Then, two learning models were developed for processing the dataset: one for automatic segmentation of the task demonstrations into a sequence of surgemes and the second for classifying each segment into labeled surgemes. Finally, a multimodal off-policy RL with rewards learned from demonstrations was developed to learn the surgeme execution from these demonstrations.

    Results

    The task segmentation model has an accuracy of 98.2%. The surgeme classification model using the proposed interaction features achieved a classification accuracy of 96.25% averaged across all surgemes compared to 87.08% without these features and 85.4% using a support vector machine classifier. Finally, the robot execution achieved a task success rate of 93.5% compared to baselines of behavioral cloning (78.3%) and a twin-delayed deep deterministic policy gradient with shaped rewards (82.6%).

    Conclusions

    Results indicate that the proposed interaction features for the segmentation and classification of surgical tasks improve classification accuracy. The proposed method for learning surgemes from demonstrations exceeds popular methods for skill learning. The effectiveness of the proposed approach demonstrates the potential for future remote telemedicine on battlefields.

     
  5. As one of the popular deep learning methods, deep convolutional neural networks (DCNNs) have been widely adopted in segmentation tasks with good results. However, DCNN-based frameworks are known to handle global relations within imaging features poorly. Although several techniques have been proposed to enhance the global reasoning of DCNNs, these models either fail to match the performance of traditional fully-convolutional structures or cannot exploit the basic advantage of CNN-based networks (namely the ability of local reasoning). In this study, in contrast to current attempts to combine FCNs and global reasoning methods, we exploit self-attention more fully by designing a novel attention mechanism for 3D computation, and we propose a new segmentation framework (named 3DTU) for three-dimensional medical image segmentation tasks. This new framework processes images in an end-to-end manner and executes 3D computation on both the encoder side (which contains a 3D transformer) and the decoder side (which is based on a 3D DCNN). We tested our framework on two independent datasets that consist of 3D MRI and CT images. Experimental results clearly demonstrate that our method outperforms several state-of-the-art segmentation methods in various metrics.
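Item 2 in the list above names sphericity as an example of a 3D shape-based nuclear feature but does not define it. As a point of reference, the sketch below uses the standard geometric definition, $\Psi = \pi^{1/3}(6V)^{2/3}/A$, where $V$ is the volume and $A$ the surface area of the object. Whether that study computes the feature in exactly this way is an assumption, and the function name `sphericity` is chosen for this example only.

```python
import math


def sphericity(volume, surface_area):
    """Standard sphericity: surface area of the equal-volume sphere divided by
    the object's actual surface area. A perfect sphere gives 1; less compact
    shapes give smaller values."""
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface_area must be positive")
    return (math.pi ** (1.0 / 3.0)) * (6.0 * volume) ** (2.0 / 3.0) / surface_area


if __name__ == "__main__":
    # Unit sphere: V = 4/3 * pi, A = 4 * pi.
    print(sphericity(4.0 / 3.0 * math.pi, 4.0 * math.pi))
    # Unit cube: V = 1, A = 6; the closed form is (pi / 6) ** (1/3).
    print(sphericity(1.0, 6.0))
```

In practice the volume and surface area would have to be estimated from a segmented 3D nucleus mask, and the estimation method affects the resulting value.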