Discovering anomalous patterns in large digital pathology images

Somanchi, Sriram (ORCID: 0000-0002-3153-1248); Neill, Daniel B.; Parwani, Anil V.

doi:10.1002/sim.7828

Advances in medical imaging technology have created opportunities for computer‐aided diagnostic tools to assist human practitioners in identifying relevant patterns in massive, multiscale digital pathology slides. This work presents Hierarchical Linear Time Subset Scanning, a novel statistical method for pattern detection. The method exploits the hierarchical structure inherent in data produced through virtual microscopy to identify regions of interest accurately and quickly for pathologists to review. We take a digital image at various resolution levels, identify the most anomalous regions at a coarse level, and continue to analyze the data at increasingly fine resolutions until we accurately identify its most anomalous subregions. We demonstrate the performance of the method in identifying cancerous locations on digital slides of prostate biopsy samples and show that it detects regions of cancer in minutes with high accuracy, as measured both by the ROC curve (the ability to distinguish between benign and cancerous slides) and by the spatial precision‐recall curve (the ability to pick out the malignant areas on a slide that contains cancer). Existing methods need small‐scale images (small areas of a slide preselected by the pathologist for analysis, e.g., 32 × 32 pixels) and may not work effectively on large, raw digitized images of size 100K × 100K pixels. In this work, we provide a methodology to fill this significant gap by analyzing large digitized images and identifying regions of interest that may be indicative of cancer.
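The coarse‐to‐fine idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Hierarchical Linear Time Subset Scanning: it swaps in a plain greedy top‐k search over block‐averaged anomaly scores, and it assumes that a per‐pixel anomaly score has already been computed. The function names `build_pyramid` and `coarse_to_fine` and the parameters `levels` and `keep` are hypothetical, chosen only for illustration.

```python
import numpy as np

def build_pyramid(scores, levels):
    """Return [finest, ..., coarsest]; each level averages 2x2 blocks of the one before."""
    pyramid = [np.asarray(scores, dtype=float)]
    for _ in range(levels - 1):
        a = pyramid[-1]
        h, w = a.shape[0] // 2 * 2, a.shape[1] // 2 * 2  # crop to even size
        pyramid.append(a[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyramid

def coarse_to_fine(scores, levels=3, keep=2):
    """Greedy search (an assumption, not the paper's subset scan):
    keep the `keep` highest-scoring cells at each level, then examine
    only their four children at the next finer level."""
    pyramid = build_pyramid(scores, levels)
    coarse = pyramid[-1]
    # Every cell of the coarsest level starts as a candidate.
    candidates = [(r, c) for r in range(coarse.shape[0])
                  for c in range(coarse.shape[1])]
    for level in range(levels - 1, -1, -1):
        grid = pyramid[level]
        candidates.sort(key=lambda rc: grid[rc], reverse=True)
        candidates = candidates[:keep]
        if level > 0:
            # Expand each surviving cell into its 2x2 children one level finer.
            candidates = [(2 * r + dr, 2 * c + dc)
                          for r, c in candidates
                          for dr in (0, 1) for dc in (0, 1)]
    return candidates  # (row, col) positions at the finest resolution

# Usage: a synthetic 8x8 score map with a single anomalous pixel.
scores = np.zeros((8, 8))
scores[5, 6] = 1.0
print(coarse_to_fine(scores, levels=3, keep=1))
```

Because only the children of retained cells are examined, the number of cells scored at each finer level is bounded by `4 * keep` rather than by the full image size, which is the kind of saving that makes very large slides tractable. A greedy search like this can miss anomalies that are diluted at coarse resolution, so it should be read as an illustration of the hierarchy only.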
