Title: DiffGR: Detecting Differentially Interacting Genomic Regions from Hi-C Contact Maps
Recent advances in high-throughput chromosome conformation capture (Hi-C) techniques have allowed us to map genome-wide chromatin interactions and uncover higher-order chromatin structures, thereby shedding light on the principles of genome architecture and functions. However, statistical methods for detecting changes in large-scale chromatin organization such as topologically associating domains (TADs) are still lacking. Here, we proposed a new statistical method, DiffGR, for detecting differentially interacting genomic regions at the TAD level between Hi-C contact maps. We utilized the stratum-adjusted correlation coefficient to measure similarity of local TAD regions. We then developed a nonparametric approach to identify statistically significant changes of genomic interacting regions. Through simulation studies, we demonstrated that DiffGR can robustly and effectively discover differential genomic regions under various conditions. Furthermore, we successfully revealed cell type-specific changes in genomic interacting regions in both human and mouse Hi-C datasets, and illustrated that DiffGR yielded consistent and advantageous results compared with state-of-the-art differential TAD detection methods. The DiffGR R package is published under the GNU General Public License (GPL) ≥ 2 license and is publicly available at https://github.com/wmalab/DiffGR.
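As a rough illustration of the similarity measure mentioned in the abstract, the following Python sketch computes a simplified stratum-adjusted correlation coefficient between two contact matrices. It is not the DiffGR implementation (DiffGR is an R package, and its details are not given here). The function name, the choice to skip the main diagonal, and the weights (stratum size times the product of the two standard deviations) are assumptions made for illustration; published SCC implementations typically also smooth the maps and use rank-based variance weights.

```python
import numpy as np


def stratum_adjusted_correlation(a, b, max_offset=None):
    """Simplified SCC between two square contact matrices of equal shape.

    Hypothetical helper, not part of DiffGR. Contacts are grouped into
    strata by genomic distance (diagonal offset k). A Pearson correlation
    is computed per stratum, and strata are combined with weights
    N_k * sd_a_k * sd_b_k (an assumed, simplified weighting).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("inputs must be square matrices of the same shape")
    n = a.shape[0]
    if max_offset is None:
        max_offset = n - 1

    numerator = 0.0
    denominator = 0.0
    # Assumption: start at offset 1, i.e. skip self-contacts on the diagonal.
    for k in range(1, max_offset + 1):
        x = np.diagonal(a, offset=k)
        y = np.diagonal(b, offset=k)
        if x.size < 2:
            continue  # a correlation needs at least two points
        sx, sy = x.std(), y.std()
        if sx == 0 or sy == 0:
            continue  # constant stratum carries no correlation information
        rho = np.mean((x - x.mean()) * (y - y.mean())) / (sx * sy)
        weight = x.size * sx * sy
        numerator += weight * rho
        denominator += weight
    return numerator / denominator if denominator > 0 else float("nan")
```

Applied to the sub-matrices covering one candidate TAD in two Hi-C maps, a value near 1 would suggest similar local contact patterns, while lower values would flag the region for the significance test described in the abstract.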
Award ID(s):
1751317
PAR ID:
10541459
Author(s) / Creator(s):
;
Editor(s):
Zhang, Zhihua
Publisher / Repository:
Oxford Academic
Date Published:
Journal Name:
Genomics, Proteomics & Bioinformatics
Volume:
22
Issue:
2
ISSN:
1672-0229
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation: High-throughput conformation capture experiments, such as Hi-C, provide genome-wide maps of chromatin interactions, enabling life scientists to investigate the role of the three-dimensional structure of genomes in gene regulation and other essential cellular functions. A fundamental problem in the analysis of Hi-C data is how to compare two contact maps derived from Hi-C experiments. Detecting similarities and differences between contact maps is critical in evaluating the reproducibility of replicate experiments and in identifying differential genomic regions with biological significance. Due to the complexity of chromatin conformations and the presence of technology-driven and sequence-specific biases, the comparative analysis of Hi-C data is analytically and computationally challenging. Results: We present a novel method called Selfish for the comparative analysis of Hi-C data that takes advantage of the structural self-similarity in contact maps. We define a novel self-similarity measure to design algorithms for (i) measuring reproducibility for Hi-C replicate experiments and (ii) finding differential chromatin interactions between two contact maps. Extensive experimental results on simulated and real data show that Selfish is more accurate and robust than state-of-the-art methods. Availability and implementation: https://github.com/ucrbioinfo/Selfish
  2. Dunbrack, Roland L (Ed.)
    Chromatin is a polymer complex of DNA and proteins that regulates gene expression. The three-dimensional (3D) structure and organization of chromatin controls DNA transcription and replication. High-throughput chromatin conformation capture techniques generate Hi-C maps that can provide insight into the 3D structure of chromatin. Hi-C maps can be represented as a symmetric matrix $A_{ij}$, where each element represents the average contact probability or number of contacts between chromatin loci $i$ and $j$. Previous studies have detected topologically associating domains (TADs), or self-interacting regions in $A_{ij}$ within which the contact probability is greater than that outside the region. Many algorithms have been developed to identify TADs within Hi-C maps. However, most TAD identification algorithms are unable to identify nested or overlapping TADs, and for a given Hi-C map there is significant variation in the location and number of TADs identified by different methods. We develop a novel method to identify TADs, KerTAD, using a kernel-based technique from computer vision and image processing that is able to accurately identify nested and overlapping TADs. We benchmark this method against state-of-the-art TAD identification methods on both synthetic and experimental data sets. We find that the new method consistently has higher true positive rates (TPR) and lower false discovery rates (FDR) than all tested methods for both synthetic and manually annotated experimental Hi-C maps. The TPR for KerTAD is also largely insensitive to increasing noise and sparsity, in contrast to the other methods. We also find that KerTAD is consistent in the number and size of TADs identified across replicate experimental Hi-C maps for several organisms. Thus, KerTAD will improve automated TAD identification and enable researchers to better correlate changes in TADs to biological phenomena, such as enhancer-promoter interactions and disease states.
  3. Abstract The organization of chromatin into self-interacting domains is universal among eukaryotic genomes, though how and why they form varies considerably. Here we report a chromosome-scale reference genome assembly of pepper (Capsicum annuum) and explore its 3D organization through integrating high-resolution Hi-C maps with epigenomic, transcriptomic, and genetic variation data. Chromatin folding domains in pepper are as prominent as TADs in mammals but exhibit unique characteristics. They tend to coincide with heterochromatic regions enriched with retrotransposons and are frequently embedded in loops, which may correlate with transcription factories. Their boundaries are hotspots for chromosome rearrangements but are otherwise depleted for genetic variation. While chromatin conformation broadly affects transcription variance, it does not predict differential gene expression between tissues. Our results suggest that pepper genome organization is explained by a model of heterochromatin-driven folding promoted by transcription factories and that such spatial architecture is under structural and functional constraints.
  4. Abstract Background: Inhomogeneous patterns of chromatin-chromatin contacts within 10–100-kb-sized regions of the genome are a generic feature of chromatin spatial organization. These features, termed topologically associating domains (TADs), have led to the loop extrusion factor (LEF) model. Currently, our ability to model TADs relies on the observation that in vertebrates TAD boundaries are correlated with DNA sequences that bind CTCF, which therefore is inferred to block loop extrusion. However, although TADs feature prominently in their Hi-C maps, non-vertebrate eukaryotes either do not express CTCF or show few TAD boundaries that correlate with CTCF sites. In all of these organisms, the counterparts of CTCF remain unknown, frustrating comparisons between Hi-C data and simulations. Results: To extend the LEF model across the tree of life, here, we propose the conserved-current loop extrusion (CCLE) model that interprets loop-extruding cohesin as a nearly conserved probability current. From cohesin ChIP-seq data alone, we derive a position-dependent loop extrusion rate, allowing for a modified paradigm for loop extrusion that goes beyond solely localized barriers to also include loop extrusion rates that vary continuously. We show that CCLE accurately predicts the TAD-scale Hi-C maps of interphase Schizosaccharomyces pombe, as well as those of meiotic and mitotic Saccharomyces cerevisiae, demonstrating its utility in organisms lacking CTCF. Conclusions: The success of CCLE in yeasts suggests that loop extrusion by cohesin is indeed the primary mechanism underlying TADs in these systems. CCLE allows us to obtain loop extrusion parameters such as the LEF density and processivity, which compare well to independent estimates.
  5. Abstract The recently developed Hi-C technique has been widely applied to map genome-wide chromatin interactions. However, current methods for analyzing diploid Hi-C data cannot fully distinguish between homologous chromosomes. Consequently, the existing diploid Hi-C analyses are based on sparse and inaccurate allele-specific contact matrices, which might lead to incorrect modeling of diploid genome architecture. Here we present ASHIC, a hierarchical Bayesian framework to model allele-specific chromatin organizations in diploid genomes. We developed two models under the Bayesian framework: the Poisson-multinomial (ASHIC-PM) model and the zero-inflated Poisson-multinomial (ASHIC-ZIPM) model. The proposed ASHIC methods impute allele-specific contact maps from diploid Hi-C data and simultaneously infer allelic 3D structures. Through simulation studies, we demonstrated that ASHIC methods outperformed existing approaches, especially under low coverage and low SNP density conditions. Additionally, in the analyses of diploid Hi-C datasets in mouse and human, our ASHIC-ZIPM method produced fine-resolution diploid chromatin maps and 3D structures and provided insights into the allelic chromatin organizations and functions. To summarize, our work provides a statistically rigorous framework for investigating fine-scale allele-specific chromatin conformations. The ASHIC software is publicly available at https://github.com/wmalab/ASHIC.