Title: Gaussian graphical model‐based heterogeneity analysis via penalized fusion
Abstract: Heterogeneity is a hallmark of cancer, diabetes, cardiovascular diseases, and many other complex diseases. This study has been partly motivated by the unsupervised heterogeneity analysis for complex diseases based on molecular and imaging data, for which network‐based analysis, by accommodating the interconnections among variables, can be more informative than that limited to mean, variance, and other simple distributional properties. In the literature, there has been very limited research on network‐based heterogeneity analysis, and a common limitation shared by the existing techniques is that the number of subgroups needs to be specified a priori or in an ad hoc manner. In this article, we develop a penalized fusion approach for heterogeneity analysis based on the Gaussian graphical model. It applies penalization to the mean and precision matrix parameters to generate regularized and interpretable estimates. More importantly, a fusion penalty is imposed to “automatedly” determine the number of subgroups and generate more concise, reliable, and interpretable estimation. Consistency properties are rigorously established, and an effective computational algorithm is developed. The heterogeneity analysis of non‐small‐cell lung cancer based on single‐cell gene expression data of the Wnt pathway and that of lung adenocarcinoma based on histopathological imaging data not only demonstrate the practical applicability of the proposed approach but also lead to interesting new findings.
Award ID(s):
1916251
PAR ID:
10368804
Author(s) / Creator(s):
 ;  ;  ;  
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biometrics
Volume:
78
Issue:
2
ISSN:
0006-341X
Size(s):
p. 524-535
Sponsoring Org:
National Science Foundation
More Like this
  1. Heterogeneity is a hallmark of cancer. For various cancer outcomes/phenotypes, supervised heterogeneity analysis has been conducted, leading to a deeper understanding of disease biology and customized clinical decisions. In the literature, such analysis has often been based on demographic, clinical, and omics measurements. Recent studies have shown that high-dimensional histopathological imaging features contain valuable information on cancer outcomes. By comparison, however, heterogeneity analysis based on imaging features has been very limited. In this article, we conduct supervised cancer heterogeneity analysis using histopathological imaging features. The penalized fusion technique, which has notable advantages (such as greater flexibility) over finite mixture modeling and other techniques, is adopted. A sparse penalization is further imposed to accommodate high dimensionality and select relevant imaging features. To improve computational feasibility and generate more reliable estimation, we employ model averaging. Computational and statistical properties of the proposed approach are carefully investigated. Simulation demonstrates its favorable performance. The analysis of The Cancer Genome Atlas (TCGA) data may provide a new way of defining/examining breast cancer heterogeneity.
  2. Heterogeneity is a hallmark of many complex diseases. There are multiple ways of defining heterogeneity, among which the heterogeneity in genetic regulations, for example, gene expressions (GEs) by copy number variations (CNVs), and methylation, has been suggested but little investigated. Heterogeneity in genetic regulations can be linked with disease severity, progression, and other traits and is biologically important. However, the analysis can be very challenging with the high dimensionality of both sides of regulation as well as sparse and weak signals. In this article, we consider the scenario where subjects form unknown subgroups, and each subgroup has unique genetic regulation relationships. Further, such heterogeneity is “guided” by a known biomarker. We develop a multivariate sparse fusion (MSF) approach, which innovatively applies the penalized fusion technique to simultaneously determine the number and structure of subgroups and regulation relationships within each subgroup. An effective computational algorithm is developed, and extensive simulations are conducted. The analysis of heterogeneity in the GE‐CNV regulations in melanoma and GE‐methylation regulations in stomach cancer using the TCGA data leads to interesting findings. 
  3. In cancer research, supervised heterogeneity analysis has important implications. Such analysis has been traditionally based on clinical/demographic/molecular variables. Recently, histopathological imaging features, which are generated as a byproduct of biopsy, have been shown to be effective for modeling cancer outcomes, and a handful of supervised heterogeneity analyses have been conducted based on such features. There are two types of histopathological imaging features, which are extracted based on specific biological knowledge and using automated imaging processing software, respectively. Using both types of histopathological imaging features, our goal is to conduct the first supervised cancer heterogeneity analysis that satisfies a hierarchical structure. That is, the first type of imaging features defines a rough structure, and the second type defines a nested and more refined structure. A penalization approach is developed, which has been motivated by but differs significantly from penalized fusion and sparse group penalization. It has satisfactory statistical and numerical properties. In the analysis of lung adenocarcinoma data, it identifies a heterogeneity structure significantly different from the alternatives and has satisfactory prediction and stability performance.
  4. Wren, Jonathan (Ed.)
    Summary: Heterogeneity is a hallmark of many complex human diseases, and unsupervised heterogeneity analysis has been extensively conducted using high-throughput molecular measurements and histopathological imaging features. ‘Classic’ heterogeneity analysis has been based on simple statistics such as mean, variance and correlation. Network-based analysis takes interconnections as well as individual variable properties into consideration and can be more informative. Several Gaussian graphical model (GGM)-based heterogeneity analysis techniques have been developed, but friendly and portable software is still lacking. To facilitate more extensive usage, we develop the R package HeteroGGM, which conducts GGM-based heterogeneity analysis using advanced penalization techniques, can provide informative summary and graphical presentation, and is efficient and friendly. Availability and implementation: The package is available at https://CRAN.R-project.org/package=HeteroGGM. Supplementary information: Supplementary data are available at Bioinformatics online.
  5. Lung diseases such as cancer substantially alter the mechanical properties of the organ with direct impact on the development, progression, diagnosis, and treatment response of diseases. Despite significant interest in the lung’s material properties, measuring the stiffness of intact lungs at sub-alveolar resolution has not been possible. Recently, we developed the crystal ribcage to image functioning lungs at optical resolution while controlling physiological parameters such as air pressure. Here, we introduce a data-driven, multiscale network model that takes images of the lung at different distending pressures, acquired via the crystal ribcage, and produces corresponding absolute stiffness maps. Following validation, we report absolute stiffness maps of the functioning lung at microscale resolution in health and disease. For representative images of a healthy lung and a lung with primary cancer, we find that while the lung exhibits significant stiffness heterogeneity at the microscale, primary tumors introduce even greater heterogeneity into the lung’s microenvironment. Additionally, we observe that while the healthy alveoli exhibit strain-stiffening of ∼1.75 times, the tumor’s stiffness increases by a factor of six across the range of measured transpulmonary pressures. While the tumor stiffness is 1.4 times the lung stiffness at a transpulmonary pressure of three cmH2O, the tumor’s mean stiffness is nearly five times greater than that of the surrounding tissue at a transpulmonary pressure of 18 cmH2O. Finally, we report that the variance in both strain and stiffness increases with transpulmonary pressure in both the healthy and cancerous lungs. Our new method allows quantitative assessment of disease-induced stiffness changes in the alveoli with implications for mechanotransduction. 