Title: Quantifying the similarity of topological domains across normal and cancer human cell types
Abstract

Motivation

Three-dimensional chromosome structure has been increasingly shown to influence various levels of cellular and genomic functions. Through Hi-C data, which maps contact frequency on chromosomes, it has been found that structural elements termed topologically associating domains (TADs) are involved in many regulatory mechanisms. However, we have little understanding of the level of similarity or variability of chromosome structure across cell types and disease states. In this study, we present a method to quantify resemblance and identify structurally similar regions between any two sets of TADs.


Results

We present an analysis of 23 human Hi-C samples representing various tissue types in normal and cancer cell lines. We quantify global and chromosome-level structural similarity, and compare the relative similarity between cancer and non-cancer cells. We find that cancer cells show higher structural variability around commonly mutated pan-cancer genes than normal cells at these same locations.

Availability and implementation

Software for the methods and analysis can be found at

Publisher / Repository:
Oxford University Press
Page Range / eLocation ID:
p. i475-i483
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Spatial positioning is a fundamental principle governing nuclear processes. Chromatin is organized as a hierarchy from nucleosomes to Mbp chromatin domains (CD) or topologically associating domains (TADs) to higher level compartments culminating in chromosome territories (CT). Microscopic and sequencing techniques have substantiated chromatin organization as a critical factor regulating gene expression. For example, enhancers loop back to interact with their target genes almost exclusively within TADs, distally located coregulated genes reposition into common transcription factories upon activation, and Mbp CDs exhibit dynamic motion and configurational changes in vivo. A longstanding question in the nucleus field is whether an interactive nuclear matrix provides a direct link between structure and function. The findings of nonrandom radial positioning of CT within the nucleus suggest the possibility of preferential interaction patterns among populations of CT. Sequential labeling of up to 10 CT, followed by application of computer imaging and geometric graph mining algorithms, revealed cell‐type specific interchromosomal networks (ICN) of CT that are altered during the cell cycle, differentiation, and cancer progression. It is proposed that the ICN correlate with the global level of genome regulation. These approaches also demonstrated that the large scale 3‐D topology of CT is specific for each CT. The cell‐type specific proximity of certain chromosomal regions in normal cells may explain the propensity of distinct translocations in cancer subtypes. Understanding how genes are dysregulated upon disruption of the normal “wiring” of the nucleus by translocations, deletions, and amplifications that are hallmarks of cancer should enable more targeted therapeutic strategies.

  2. Abstract

    Motivation

    High throughput chromosome conformation capture (Hi-C) contact matrices are used to predict 3D chromatin structures in eukaryotic cells. High-resolution Hi-C data are less available than low-resolution Hi-C data due to sequencing costs but provide greater insight into the intricate details of 3D chromatin structures such as enhancer–promoter interactions and sub-domains. To provide a cost-effective solution to high-resolution Hi-C data collection, deep learning models are used to predict high-resolution Hi-C matrices from existing low-resolution matrices across multiple cell types.


    Results

    Here, we present two Cascading Residual Networks called HiCARN-1 and HiCARN-2, a convolutional neural network and a generative adversarial network, that use a novel framework of cascading connections throughout the network for Hi-C contact matrix prediction from low-resolution data. Shown by image evaluation and Hi-C reproducibility metrics, both HiCARN models, overall, outperform state-of-the-art Hi-C resolution enhancement algorithms in predictive accuracy for both human and mouse 1/16, 1/32, 1/64 and 1/100 downsampled high-resolution Hi-C data. Also, validation by extracting topologically associating domains, chromosome 3D structure and chromatin loop predictions from the enhanced data shows that HiCARN can proficiently reconstruct biologically significant regions.

    Availability and implementation

    HiCARN can be accessed and utilized as an open-sourced software at: and is also available as a containerized application that can be run on any platform.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

  3. Infiltration of CD8+ T lymphocytes into solid tumors is associated with good prognosis in various types of cancer, including triple-negative breast cancer (TNBC). However, the mechanisms underlying different infiltration levels are largely unknown. Here, we have characterized the spatial profile of CD8+ T cells around tumor cell clusters (tightly connected tumor cells) in the core and margin regions in TNBC patient samples. We found that in some patients, the CD8+ T cell density first decreases when moving in from the boundary of the tumor cell clusters and then rises again when approaching the center. To explain various infiltration profiles, we modeled the dynamics of T cell density via partial differential equations. We spatially modulated the diffusion/chemotactic coefficients of T cells (to mimic physical barriers) or introduced the localized secretion of a diffusing T cell chemorepellent. Combining the spatial-profile analysis and the modeling led to support for the second idea; i.e., there exists a possible chemorepellent inside tumor cell clusters, which prevents CD8+ T cells from infiltrating into tumor cell clusters. This conclusion was consistent with an investigation into the properties of collagen fibers, which suggested that variations in desmoplastic elements do not limit infiltration of CD8+ T lymphocytes, as we did not observe significant correlations between the level of T cell infiltration and fiber properties. Our work provides evidence that CD8+ T cells can cross typical fibrotic barriers and thus their infiltration into tumor clusters is governed by other mechanisms possibly involving a local repellent.

  4. Abstract

    High-resolution reconstruction of spatial chromosome organizations from chromatin contact maps is in high demand, but is hindered by extensive pairwise constraints, substantial missing data, and limited resolution and cell-type availabilities. Here, we present FLAMINGO, a computational method that addresses these challenges by compressing inter-dependent Hi-C interactions to delineate the underlying low-rank structures in 3D space, based on the low-rank matrix completion technique. FLAMINGO successfully generates 5 kb- and 1 kb-resolution spatial conformations for all chromosomes in the human genome across multiple cell-types, the largest resources to date. Compared to other methods using various experimental metrics, FLAMINGO consistently demonstrates superior accuracy in recapitulating observed structures with gains in scalability by orders of magnitude. The reconstructed 3D structures efficiently facilitate discoveries of higher-order multi-way interactions, imply biological interpretations of long-range QTLs, reveal geometrical properties of chromatin, and provide high-resolution references to understand structural variabilities. Importantly, FLAMINGO achieves robust predictions against high rates of missing data and significantly boosts 3D structure resolutions. Moreover, FLAMINGO shows vigorous cross cell-type structure predictions that capture cell-type specific spatial configurations via integration of 1D epigenomic signals. FLAMINGO can be widely applied to large-scale chromatin contact maps and expand high-resolution spatial genome conformations for diverse cell-types.

  5. Abstract

    Motivation

    High-throughput conformation capture experiments, such as Hi-C, provide genome-wide maps of chromatin interactions, enabling life scientists to investigate the role of the three-dimensional structure of genomes in gene regulation and other essential cellular functions. A fundamental problem in the analysis of Hi-C data is how to compare two contact maps derived from Hi-C experiments. Detecting similarities and differences between contact maps is critical for evaluating the reproducibility of replicate experiments and for identifying differential genomic regions with biological significance. Due to the complexity of chromatin conformations and the presence of technology-driven and sequence-specific biases, the comparative analysis of Hi-C data is analytically and computationally challenging.


    Results

    We present a novel method called Selfish for the comparative analysis of Hi-C data that takes advantage of the structural self-similarity in contact maps. We define a novel self-similarity measure to design algorithms for (i) measuring reproducibility for Hi-C replicate experiments and (ii) finding differential chromatin interactions between two contact maps. Extensive experimental results on simulated and real data show that Selfish is more accurate and robust than state-of-the-art methods.

    Availability and implementation
