Title: Bayesian finite mixture of regression analysis for cancer based on histopathological imaging–environment interactions
Summary

Cancer is a heterogeneous disease. Finite mixture of regression (FMR), an important heterogeneity analysis technique when an outcome variable is present, has been extensively employed in cancer research, revealing important differences in the associations between a cancer outcome/phenotype and covariates. Cancer FMR analysis has been based on clinical, demographic, and omics variables. A relatively recent alternative source of data is histopathological images, which have long been used for cancer diagnosis and staging. Recently, it has been shown that high-dimensional histopathological image features, extracted using automated digital image processing pipelines, are effective for modeling cancer outcomes/phenotypes. Histopathological imaging–environment interaction analysis has been further developed to expand the scope of cancer modeling and histopathological imaging-based analysis. Motivated by the significance of cancer FMR analysis and a still strong demand for more effective methods, in this article, we take the natural next step and conduct cancer FMR analysis based on models that incorporate low-dimensional clinical/demographic/environmental variables, high-dimensional imaging features, and their interactions. Complementary to many of the existing studies, we develop a Bayesian approach for accommodating high dimensionality, screening out noise, identifying signals, and respecting the “main effects, interactions” variable selection hierarchy. An effective computational algorithm is developed, and simulation shows advantageous performance of the proposed approach. The analysis of The Cancer Genome Atlas data on lung squamous cell cancer leads to interesting findings that differ from those of the alternative approaches.
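
As a rough illustration of the model class the summary describes, a finite mixture of regression with main effects and interactions could be written as below. The notation is assumed here for illustration and is not taken from the article: K mixture components with weights \pi_k, q clinical/demographic/environmental variables E_j, p imaging features Z_l, and component-specific coefficients \alpha, \beta, \gamma. The Gaussian error, the omitted intercept, and the strong-hierarchy reading of the "main effects, interactions" constraint are likewise assumptions.

```latex
% Sketch only: symbols, Gaussian errors, and strong hierarchy are illustrative assumptions.
f(y \mid E, Z) = \sum_{k=1}^{K} \pi_k \, \phi\!\left(y;\ \mu_k(E, Z),\ \sigma_k^2\right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,

\mu_k(E, Z) = \sum_{j=1}^{q} \alpha_{kj} E_j + \sum_{l=1}^{p} \beta_{kl} Z_l
            + \sum_{j=1}^{q} \sum_{l=1}^{p} \gamma_{kjl} E_j Z_l,

% Strong hierarchy (one common convention): an interaction enters only with both main effects.
\gamma_{kjl} \neq 0 \ \Longrightarrow\ \alpha_{kj} \neq 0 \ \text{and} \ \beta_{kl} \neq 0 .
```

Here \phi(y; \mu, \sigma^2) denotes the normal density. Under this reading, heterogeneity shows up as coefficients that differ across the K components, and variable selection amounts to deciding which \alpha, \beta, and \gamma entries are nonzero within each component.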

 
Award ID(s):
1916251
NSF-PAR ID:
10380963
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biostatistics
Volume:
24
Issue:
2
ISSN:
1465-4644
Format(s):
Medium: X Size: p. 425-442
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    In cancer research, supervised heterogeneity analysis has important implications. Such analysis has traditionally been based on clinical/demographic/molecular variables. Recently, histopathological imaging features, which are generated as a byproduct of biopsy, have been shown to be effective for modeling cancer outcomes, and a handful of supervised heterogeneity analyses have been conducted based on such features. There are two types of histopathological imaging features, which are extracted based on specific biological knowledge and using automated imaging processing software, respectively. Using both types of histopathological imaging features, our goal is to conduct the first supervised cancer heterogeneity analysis that satisfies a hierarchical structure. That is, the first type of imaging features defines a rough structure, and the second type defines a nested and more refined structure. A penalization approach is developed, which has been motivated by but differs significantly from penalized fusion and sparse group penalization. It has satisfactory statistical and numerical properties. In the analysis of lung adenocarcinoma data, it identifies a heterogeneity structure significantly different from the alternatives and has satisfactory prediction and stability performance.

     
  2. Heterogeneity is a hallmark of cancer. For various cancer outcomes/phenotypes, supervised heterogeneity analysis has been conducted, leading to a deeper understanding of disease biology and customized clinical decisions. In the literature, such analysis has often been based on demographic, clinical, and omics measurements. Recent studies have shown that high-dimensional histopathological imaging features contain valuable information on cancer outcomes. However, heterogeneity analysis based on imaging features has been comparatively very limited. In this article, we conduct supervised cancer heterogeneity analysis using histopathological imaging features. The penalized fusion technique, which has notable advantages (such as greater flexibility) over finite mixture modeling and other techniques, is adopted. A sparse penalization is further imposed to accommodate high dimensionality and select relevant imaging features. To improve computational feasibility and generate more reliable estimation, we employ model averaging. Computational and statistical properties of the proposed approach are carefully investigated. Simulation demonstrates its favorable performance. The analysis of The Cancer Genome Atlas (TCGA) data may provide a new way of defining/examining breast cancer heterogeneity.
  3. Maize (Zea mays L.) is one of the three major cereal crops in the world. Leaf angle is an important architectural trait of crops due to its substantial role in light interception by the canopy and hence photosynthetic efficiency. Traditionally, leaf angle has been measured using a protractor, a process that is both slow and laborious. Efficiently measuring leaf angle under field conditions via imaging is challenging due to leaf density in the canopy and the resulting occlusions. However, advances in imaging technologies and machine learning have provided new tools for image acquisition and analysis that could be used to characterize leaf angle using three-dimensional (3D) models of field-grown plants. In this study, PhenoBot 3.0, a robotic vehicle designed to traverse between pairs of agronomically spaced rows of crops, was equipped with multiple tiers of PhenoStereo cameras to capture side-view images of maize plants in the field. PhenoStereo is a customized stereo camera module with integrated strobe lighting for high-speed stereoscopic image acquisition under variable outdoor lighting conditions. An automated image processing pipeline (AngleNet) was developed to measure leaf angles of nonoccluded leaves. In this pipeline, a novel representation form of leaf angle as a triplet of keypoints was proposed. The pipeline employs convolutional neural networks to detect each leaf angle in two-dimensional images and 3D modeling approaches to extract quantitative data from reconstructed models. Our study demonstrates the feasibility of using stereo vision to investigate the distribution of leaf angles in maize under field conditions. The proposed system is an efficient alternative to traditional leaf angle phenotyping and thus could accelerate breeding for improved plant architecture. 
  4. Abstract

    Heterogeneity is a hallmark of cancer, diabetes, cardiovascular diseases, and many other complex diseases. This study has been partly motivated by the unsupervised heterogeneity analysis for complex diseases based on molecular and imaging data, for which, network‐based analysis, by accommodating the interconnections among variables, can be more informative than that limited to mean, variance, and other simple distributional properties. In the literature, there has been very limited research on network‐based heterogeneity analysis, and a common limitation shared by the existing techniques is that the number of subgroups needs to be specified a priori or in an ad hoc manner. In this article, we develop a penalized fusion approach for heterogeneity analysis based on the Gaussian graphical model. It applies penalization to the mean and precision matrix parameters to generate regularized and interpretable estimates. More importantly, a fusion penalty is imposed to “automatedly” determine the number of subgroups and generate more concise, reliable, and interpretable estimation. Consistency properties are rigorously established, and an effective computational algorithm is developed. The heterogeneity analysis of non‐small‐cell lung cancer based on single‐cell gene expression data of the Wnt pathway and that of lung adenocarcinoma based on histopathological imaging data not only demonstrate the practical applicability of the proposed approach but also lead to interesting new findings.

     
  5. Automatic histopathological Whole Slide Image (WSI) analysis for cancer classification has been highlighted along with the advancements in microscopic imaging techniques, since manual examination and diagnosis with WSIs are time-consuming and costly. Recently, deep convolutional neural networks have succeeded in histopathological image analysis. However, despite this success, there are still opportunities for further enhancements. In this paper, we propose a novel cancer texture-based deep neural network (CAT-Net) that learns scalable morphological features from histopathological WSIs. The innovation of CAT-Net is twofold: (1) capturing invariant spatial patterns by dilated convolutional layers and (2) improving predictive performance while reducing model complexity. Moreover, CAT-Net can provide discriminative morphological (texture) patterns formed on cancerous regions of histopathological images compared to normal regions. We elucidated how our proposed method, CAT-Net, captures morphological patterns of interest at hierarchical levels in the model. The proposed method outperformed the current state-of-the-art benchmark methods on accuracy, precision, recall, and F1 score.