skip to main content


Title: Principal curve approaches for inferring 3D chromatin architecture
Summary Three-dimensional (3D) genome spatial organization is critical for numerous cellular processes, including transcription, while certain conformation-driven structural alterations are frequently oncogenic. Genome architecture had been notoriously difficult to elucidate, but the advent of the suite of chromatin conformation capture assays, notably Hi-C, has transformed understanding of chromatin structure and provided downstream biological insights. Although many findings have flowed from direct analysis of the pairwise proximity data produced by these assays, there is added value in generating corresponding 3D reconstructions deriving from superposing genomic features on the reconstruction. Accordingly, many methods for inferring 3D architecture from proximity data have been advanced. However, none of these approaches exploit the fact that single chromosome solutions constitute a one-dimensional (1D) curve in 3D. Rather, this aspect has either been addressed by imposition of constraints, which is both computationally burdensome and cell type specific, or ignored with contiguity imposed after the fact. Here, we target finding a 1D curve by extending principal curve methodology to the metric scaling problem. We illustrate how this approach yields a sequence of candidate solutions, indexed by an underlying smoothness or degrees-of-freedom parameter, and propose methods for selection from this sequence. We apply the methodology to Hi-C data obtained on IMR90 cells and so are positioned to evaluate reconstruction accuracy by referencing orthogonal imaging data. The results indicate the utility and reproducibility of our principal curve approach in the face of underlying structural variation.  more » « less
Award ID(s):
2013736
NSF-PAR ID:
10226370
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Biostatistics
ISSN:
1465-4644
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The structure and dynamics of the eukaryotic genome are intimately linked to gene regulation and transcriptional activity. Many chromosome conformation capture experiments like Hi-C have been developed to detect genome-wide contact frequencies and quantify loop/compartment structures for different cellular contexts and time-dependent processes. However, a full understanding of these events requires explicit descriptions of representative chromatin and chromosome configurations. With the exponentially growing amount of data from Hi-C experiments, many methods for deriving 3D structures from contact frequency data have been developed. Yet, most reconstruction methods use polymer models with low resolution to predict overall genome structure. Here we present a Brownian Dynamics (BD) approach termed Hi-BDiSCO for producing 3D genome structures from Hi-C and Micro-C data using our mesoscale-resolution chromatin model based on the Discrete Surface Charge Optimization (DiSCO) model. Our approach integrates reconstruction with chromatin simulations at nucleosome resolution with appropriate biophysical parameters. Following a description of our protocol, we present applications to the NXN, HOXC, HOXA and Fbn2 mouse genes ranging in size from 50 to 100 kb. Such nucleosome-resolution genome structures pave the way for pursuing many biomedical applications related to the epigenomic regulation of chromatin and control of human disease.

     
    more » « less
  2. Abstract

    High-resolution reconstruction of spatial chromosome organizations from chromatin contact maps is highly demanded, but is hindered by extensive pairwise constraints, substantial missing data, and limited resolution and cell-type availabilities. Here, we present FLAMINGO, a computational method that addresses these challenges by compressing inter-dependent Hi-C interactions to delineate the underlying low-rank structures in 3D space, based on the low-rank matrix completion technique. FLAMINGO successfully generates 5 kb- and 1 kb-resolution spatial conformations for all chromosomes in the human genome across multiple cell-types, the largest resources to date. Compared to other methods using various experimental metrics, FLAMINGO consistently demonstrates superior accuracy in recapitulating observed structures with raises in scalability by orders of magnitude. The reconstructed 3D structures efficiently facilitate discoveries of higher-order multi-way interactions, imply biological interpretations of long-range QTLs, reveal geometrical properties of chromatin, and provide high-resolution references to understand structural variabilities. Importantly, FLAMINGO achieves robust predictions against high rates of missing data and significantly boosts 3D structure resolutions. Moreover, FLAMINGO shows vigorous cross cell-type structure predictions that capture cell-type specific spatial configurations via integration of 1D epigenomic signals. FLAMINGO can be widely applied to large-scale chromatin contact maps and expand high-resolution spatial genome conformations for diverse cell-types.

     
    more » « less
  3. Summary

    A gene may be controlled by distal enhancers and repressors, not merely by regulatory elements in its promoter. Spatial organization of chromosomes is the mechanism that brings genes and their distal regulatory elements into close proximity. Recent molecular techniques, coupled with Next Generation Sequencing (NGS) technology, enable genome-wide detection of physical contacts between distant genomic loci. In particular, Hi-C is an NGS-aided assay for the study of genome-wide spatial interactions. The availability of such data makes it possible to reconstruct the underlying three-dimensional (3D) spatial chromatin structure. In this article, we present the Poisson Random effect Architecture Model (PRAM) for such an inference. The main feature of PRAM that separates it from previous methods is that it addresses the issue of over-dispersion and takes correlations among contact counts into consideration, thereby achieving greater consistency with observed data. PRAM was applied to Hi-C data to illustrate its performance and to compare the predicted distances with those measured by a Fluorescence In Situ Hybridization (FISH) validation experiment. Further, PRAM was compared to other methods in the literature based on both real and simulated data.

     
    more » « less
  4. Abstract Background

    B-type lamins are critical nuclear envelope proteins that interact with the three-dimensional genomic architecture. However, identifying the direct roles of B-lamins on dynamic genome organization has been challenging as their joint depletion severely impacts cell viability. To overcome this, we engineered mammalian cells to rapidly and completely degrade endogenous B-type lamins using Auxin-inducible degron technology.

    Results

    Using live-cell Dual Partial Wave Spectroscopic (Dual-PWS) microscopy, Stochastic Optical Reconstruction Microscopy (STORM), in situ Hi-C, CRISPR-Sirius, and fluorescence in situ hybridization (FISH), we demonstrate that lamin B1 and lamin B2 are critical structural components of the nuclear periphery that create a repressive compartment for peripheral-associated genes. Lamin B1 and lamin B2 depletion minimally alters higher-order chromatin folding but disrupts cell morphology, significantly increases chromatin mobility, redistributes both constitutive and facultative heterochromatin, and induces differential gene expression both within and near lamin-associated domain (LAD) boundaries. Critically, we demonstrate that chromatin territories expand as upregulated genes within LADs radially shift inwards. Our results indicate that the mechanism of action of B-type lamins comes from their role in constraining chromatin motion and spatial positioning of gene-specific loci, heterochromatin, and chromatin domains.

    Conclusions

    Our findings suggest that, while B-type lamin degradation does not significantly change genome topology, it has major implications for three-dimensional chromatin conformation at the single-cell level both at the lamina-associated periphery and the non-LAD-associated nuclear interior with concomitant genome-wide transcriptional changes. This raises intriguing questions about the individual and overlapping roles of lamin B1 and lamin B2 in cellular function and disease.

     
    more » « less
  5. Abstract Motivation

    High-throughput conformation capture experiments, such as Hi-C provide genome-wide maps of chromatin interactions, enabling life scientists to investigate the role of the three-dimensional structure of genomes in gene regulation and other essential cellular functions. A fundamental problem in the analysis of Hi-C data is how to compare two contact maps derived from Hi-C experiments. Detecting similarities and differences between contact maps are critical in evaluating the reproducibility of replicate experiments and for identifying differential genomic regions with biological significance. Due to the complexity of chromatin conformations and the presence of technology-driven and sequence-specific biases, the comparative analysis of Hi-C data is analytically and computationally challenging.

    Results

    We present a novel method called Selfish for the comparative analysis of Hi-C data that takes advantage of the structural self-similarity in contact maps. We define a novel self-similarity measure to design algorithms for (i) measuring reproducibility for Hi-C replicate experiments and (ii) finding differential chromatin interactions between two contact maps. Extensive experimental results on simulated and real data show that Selfish is more accurate and robust than state-of-the-art methods.

    Availability and implementation

    https://github.com/ucrbioinfo/Selfish

     
    more » « less