Advances in medical imaging technology have created opportunities for computer-aided diagnostic tools to assist human practitioners in identifying relevant patterns in massive, multiscale digital pathology slides. This work presents Hierarchical Linear Time Subset Scanning, a novel statistical method for pattern detection that exploits the hierarchical structure inherent in data produced through virtual microscopy to identify regions of interest for pathologists to review both accurately and quickly. We take a digital image at various resolution levels, identify the most anomalous regions at a coarse level, and continue to analyze the data at increasingly granular resolutions until we accurately identify its most anomalous subregions. We demonstrate the performance of our novel method in identifying cancerous locations on digital slides of prostate biopsy samples and show that it detects regions of cancer in minutes with high accuracy, as measured both by the ROC curve (ability to distinguish between benign and cancerous slides) and by the spatial precision-recall curve (ability to pick out the malignant areas on a slide that contains cancer). Existing methods require small-scale images (small areas of a slide preselected by the pathologist for analysis, e.g., 32 × 32 pixels) and may not work effectively …
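The coarse-to-fine search described above can be sketched as a simple recursive procedure: score coarse tiles of an anomaly map, keep the most anomalous ones, and recurse into them at finer tile sizes. This is only an illustrative toy — the paper's actual method optimizes a linear-time subset scan statistic over subsets of regions, not a tile mean, and the tile sizes and top-k cutoff here are invented for the example.

```python
import numpy as np

def coarse_to_fine_scan(scores, tile=8, top_k=2, min_tile=2):
    """Toy coarse-to-fine search over a 2-D anomaly-score map.

    Scores each coarse tile by its mean anomaly value, keeps the top_k
    tiles, and recurses into them at half the tile size until min_tile
    is reached. Illustrative only -- not the paper's LTSS statistic.
    Returns (row, col, tile_size, score) tuples in image coordinates.
    """
    H, W = scores.shape
    regions = []
    # Score every coarse tile by its mean anomaly value.
    tiles = []
    for r in range(0, H, tile):
        for c in range(0, W, tile):
            block = scores[r:r + tile, c:c + tile]
            tiles.append((block.mean(), r, c))
    tiles.sort(reverse=True)
    for score, r, c in tiles[:top_k]:
        if tile <= min_tile:
            regions.append((r, c, tile, score))
        else:
            # Recurse only inside promising tiles, at a finer resolution.
            sub = coarse_to_fine_scan(scores[r:r + tile, c:c + tile],
                                      tile // 2, top_k, min_tile)
            regions.extend((r + sr, c + sc, t, s) for sr, sc, t, s in sub)
    return regions
```

Because only the top-k tiles are expanded at each level, the search touches a small fraction of the finest-resolution pixels, which is the intuition behind analyzing a gigapixel slide in minutes rather than exhaustively.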
- Publication Date:
- NSF-PAR ID: 10060361
- Journal Name: Statistics in Medicine
- Volume: 37
- Issue: 25
- Page Range or eLocation-ID: p. 3599-3615
- ISSN: 0277-6715
- Publisher: Wiley Blackwell (John Wiley & Sons)
- Sponsoring Org: National Science Foundation
More Like this
-
Obeid, I. (Ed.) The Neural Engineering Data Consortium (NEDC) is developing the Temple University Digital Pathology Corpus (TUDP), an open source database of high-resolution images from scanned pathology samples [1], as part of its National Science Foundation-funded Major Research Instrumentation grant titled "MRI: High Performance Digital Pathology Using Big Data and Machine Learning" [2]. The long-term goal of this project is to release one million images. We have currently scanned over 100,000 images and are in the process of annotating breast tissue data for our first official corpus release, v1.0.0. This release contains 3,505 annotated images of breast tissue including 74 patients with cancerous diagnoses (out of a total of 296 patients). In this poster, we will present an analysis of this corpus and discuss the challenges we have faced in efficiently producing high-quality annotations of breast tissue. It is well known that state-of-the-art algorithms in machine learning require vast amounts of data. Fields such as speech recognition [3], image recognition [4] and text processing [5] are able to deliver impressive performance with complex deep learning models because they have developed large corpora to support training of extremely high-dimensional models (e.g., billions of parameters). Other fields that do not …
-
Obeid, Iyad ; Picone, Joseph ; Selesnick, Ivan (Ed.) The Neural Engineering Data Consortium (NEDC) is developing a large open source database of high-resolution digital pathology images known as the Temple University Digital Pathology Corpus (TUDP) [1]. Our long-term goal is to release one million images. We expect to release the first 100,000 image corpus by December 2020. The data is being acquired at the Department of Pathology at Temple University Hospital (TUH) using a Leica Biosystems Aperio AT2 scanner [2] and consists entirely of clinical pathology images. More information about the data and the project can be found in Shawki et al. [3]. We currently have a National Science Foundation (NSF) planning grant [4] to explore how the community can best leverage this resource. One goal of this poster presentation is to stimulate community-wide discussions about this project and determine how this valuable resource can best meet the needs of the public. The computing infrastructure required to support this database is extensive [5] and includes two HIPAA-secure computer networks, dual petabyte file servers, and Aperio's eSlide Manager (eSM) software [6]. We currently have digitized over 50,000 slides from 2,846 patients and 2,942 clinical cases. There is an average of 12.4 slides per patient and 10.5 slides per case …
-
Abstract Pollen is used to investigate a diverse range of ecological problems, from identifying plant–pollinator relationships to tracking flowering phenology. Pollen types are identified according to a set of distinctive morphological characters which are understood to capture taxonomic differences and phylogenetic relationships among taxa. However, categorizing morphological variation among hyperdiverse pollen samples represents a challenge even for an expert analyst.
We present an automated workflow for pollen analysis, from the automated scanning of pollen sample slides to the automated detection and identification of pollen taxa using convolutional neural networks (CNNs). We analysed aerial pollen samples from lowland Panama and used a microscope slide scanner to capture three‐dimensional representations of 150 sample slides. These pollen sample images were annotated by an expert using a virtual microscope. Metadata were digitally recorded for ~100 pollen grains per slide, including location, identification and the analyst's confidence of the given identification. We used these annotated images to train and test our detection and classification CNN models. Our approach is two‐part. We first compared three methods for training CNN models to detect pollen grains on a palynological slide. We next investigated approaches to training CNN models for pollen identification.
Because the diversity of pollen taxa in environmental and …
Pollen represents a challenging visual classification problem that can serve as a model for other areas of biology that rely on visual identification. Our results add to the body of research demonstrating the potential for a fully automated pollen classification system for environmental and palaeontological samples. Slide imaging, pollen detection and specimen identification can be automated to produce a streamlined workflow.
-
Automatic histopathological Whole Slide Image (WSI) analysis for cancer classification has gained attention alongside advances in microscopic imaging techniques, since manual examination and diagnosis of WSIs are time-consuming and costly. Recently, deep convolutional neural networks have succeeded in histopathological image analysis; however, there remain opportunities for further improvement. In this paper, we propose a novel cancer texture-based deep neural network (CAT-Net) that learns scalable morphological features from histopathological WSIs. The innovation of CAT-Net is twofold: (1) capturing invariant spatial patterns with dilated convolutional layers and (2) improving predictive performance while reducing model complexity. Moreover, CAT-Net can reveal the discriminative morphological (texture) patterns formed in cancerous regions of histopathological images compared with normal regions. We elucidate how our proposed method, CAT-Net, captures morphological patterns of interest at hierarchical levels within the model. The proposed method outperformed current state-of-the-art benchmark methods in accuracy, precision, recall, and F1 score.
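The dilated (atrous) convolutions mentioned in the abstract enlarge a layer's receptive field without adding parameters, by inserting gaps between kernel taps. A minimal numpy sketch of the operation (the cross-correlation form used in deep learning, valid padding only — not the CAT-Net implementation):

```python
import numpy as np

def dilated_conv2d(image, kernel, dilation=1):
    """Valid-mode 2-D cross-correlation with a dilation (atrous) rate.

    A dilation of d inserts d-1 gaps between kernel taps, so a k x k
    kernel covers an effective extent of (k-1)*d + 1 pixels in each
    direction -- a larger receptive field with the same parameter count.
    Illustrative sketch only, assuming a single-channel float image.
    """
    kh, kw = kernel.shape
    # Effective kernel extent once the gaps are inserted.
    eh = (kh - 1) * dilation + 1
    ew = (kw - 1) * dilation + 1
    H, W = image.shape
    out = np.zeros((H - eh + 1, W - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Strided slice picks the dilated taps from the input patch.
            patch = image[i:i + eh:dilation, j:j + ew:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out
```

With `dilation=1` this reduces to an ordinary valid-mode convolution layer; stacking layers with increasing dilation rates is a common way to capture texture at multiple spatial scales.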
-
Introduction: Computed tomography perfusion (CTP) imaging requires injection of an intravenous contrast agent and increased exposure to ionizing radiation. This process can be lengthy, costly, and potentially dangerous to patients, especially in emergency settings. We propose MAGIC, a multitask, generative adversarial network-based deep learning model to synthesize an entire CTP series from only a non-contrasted CT (NCCT) input. Materials and Methods: NCCT and CTP series were retrospectively retrieved from 493 patients at UF Health with IRB approval. The data were deidentified and all images were resized to 256x256 pixels. The collected perfusion data were analyzed using the RapidAI CT Perfusion analysis software (iSchemaView, Inc., CA) to generate each CTP map. For each subject, 10 CTP slices were selected. Each slice was paired with one NCCT slice at the same location and two NCCT slices at a predefined vertical offset, resulting in 4.3K CTP images and 12.9K NCCT images used for training. The incorporation of a spatial offset into the NCCT input allows MAGIC to more accurately synthesize cerebral perfusive structures, increasing the quality of the generated images. The studies included a variety of indications, including healthy tissue, mild infarction, and severe infarction. The proposed MAGIC model incorporates a novel multitask …
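The slice-pairing step described above (one NCCT slice at the same location plus two at a predefined vertical offset) can be sketched as a small indexing helper. This is a hypothetical reconstruction: the paper's exact offset value and its behavior at the top and bottom of the slice stack are not given in this excerpt, so clipping to the stack bounds is an assumption.

```python
def build_training_pairs(ctp_idx, num_slices, offset):
    """Return the NCCT slice indices paired with one CTP slice.

    Pairs the CTP slice with the NCCT slice at the same index plus the
    slices one vertical offset below and above, clipped to the stack.
    Hypothetical helper: the offset value and edge handling are
    assumptions, not details from the paper.
    """
    same = ctp_idx
    below = max(ctp_idx - offset, 0)   # clip at the bottom of the stack
    above = min(ctp_idx + offset, num_slices - 1)  # clip at the top
    return [same, below, above]
```

Note that this pairing yields three NCCT slices per CTP slice, consistent with the reported counts (4.3K CTP images and 12.9K NCCT images).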