Abstract. Motivation: Single-cell Hi-C (scHi-C) technologies have significantly advanced our understanding of the 3D genome organization. However, scHi-C data are often sparse and noisy, leading to substantial computational challenges in downstream analyses. Results: In this study, we introduce SHICEDO, a novel deep-learning model specifically designed to enhance scHi-C contact matrices by imputing missing or sparsely captured chromatin contacts through a generative adversarial framework. SHICEDO leverages the unique structural characteristics of scHi-C matrices to derive customized features that enable effective data enhancement. Additionally, the model incorporates a channel-wise attention mechanism to mitigate the over-smoothing issue commonly associated with scHi-C enhancement methods. Through simulations and real-data applications, we demonstrate that SHICEDO outperforms the state-of-the-art methods, achieving superior quantitative and qualitative results. Moreover, SHICEDO enhances key structural features in scHi-C data, thus enabling more precise delineation of chromatin structures such as A/B compartments, TAD-like domains, and chromatin loops. Availability and implementation: SHICEDO is publicly available at https://github.com/wmalab/SHICEDO.
Unicorn: enhancing single-cell Hi-C data with blind super-resolution for 3D genome structure reconstruction
Abstract. Motivation: Single-cell Hi-C (scHi-C) data provide critical insights into chromatin interactions at individual cell levels, uncovering unique genomic 3D structures. However, scHi-C datasets are characterized by sparsity and noise, complicating efforts to accurately reconstruct high-resolution chromosomal structures. In this study, we present ScUnicorn, a novel blind super-resolution framework for scHi-C data enhancement. ScUnicorn uses an iterative degradation kernel optimization process, unlike traditional super-resolution approaches, which rely on downsampling, predefined degradation ratios, or constant assumptions about the input data to reconstruct high-resolution interaction matrices. Hence, our approach more reliably preserves critical biological patterns and minimizes noise. Additionally, we propose 3DUnicorn, a maximum likelihood algorithm that leverages the enhanced scHi-C data to infer precise 3D chromosomal structures. Results: Our evaluation demonstrates that ScUnicorn achieves superior performance over the state-of-the-art methods in terms of Peak Signal-to-Noise Ratio, Structural Similarity Index Measure, and GenomeDisco scores. Moreover, 3DUnicorn’s reconstructed structures align closely with experimental 3D-FISH data, underscoring its biological relevance. Together, ScUnicorn and 3DUnicorn provide a robust framework for advancing genomic research by enhancing scHi-C data fidelity and enabling accurate 3D genome structure reconstruction. Availability and implementation: Unicorn implementation is publicly accessible at https://github.com/OluwadareLab/Unicorn.
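As an illustration of one of the evaluation metrics named in the abstract, the following is a minimal sketch of Peak Signal-to-Noise Ratio between a reference contact matrix and an enhanced one. It is not taken from the Unicorn repository: the function name `psnr`, the nested-list matrix representation, and the default choice of the reference maximum as the peak value are all assumptions made for this example.

```python
import math


def psnr(reference, enhanced, max_value=None):
    """Peak Signal-to-Noise Ratio (in dB) between two equally shaped matrices.

    `reference` and `enhanced` are lists of rows (hypothetical representation).
    `max_value` is the peak signal; by assumption it defaults to the largest
    entry of the reference matrix.
    """
    same_shape = len(reference) == len(enhanced) and all(
        len(r) == len(e) for r, e in zip(reference, enhanced)
    )
    if not same_shape:
        raise ValueError("matrices must have the same shape")

    ref_flat = [v for row in reference for v in row]
    enh_flat = [v for row in enhanced for v in row]
    if max_value is None:
        max_value = max(ref_flat)

    # Mean squared error over all matrix entries.
    mse = sum((r - e) ** 2 for r, e in zip(ref_flat, enh_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical matrices
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [[4.0, 0.0], [0.0, 4.0]]
enhanced = [[4.0, 1.0], [1.0, 4.0]]
score = psnr(reference, enhanced)
```

Higher values indicate that the enhanced matrix is closer to the reference. Published comparisons may normalize matrices or pick the peak value differently, so scores from this sketch are not directly comparable to reported numbers.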
- Award ID(s): 2153205
- PAR ID: 10615841
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Bioinformatics
- Volume: 41
- Issue: Supplement_1
- ISSN: 1367-4803
- Format(s): Medium: X
- Size(s): p. i475-i483
- Sponsoring Org: National Science Foundation
More Like this
Abstract. Single‐cell chromatin conformation capture (scHi‐C) techniques have evolved to provide significant insights into the structural organization and regulatory mechanisms in individual cells. Although many scHi‐C protocols have been developed, they often involve intricate procedures and the resulting data are sparse, leading to computational challenges for systematic data analysis and limited applicability. This review provides a comprehensive overview and quantitative evaluation of thirteen protocols, along with practical guidance on computational topics. It first assesses the efficiency of these protocols based on the total number of contacts recovered per cell and the cis/trans ratio. It then provides systematic considerations for scHi‐C quality control and data imputation. Additionally, the capabilities and implementations of various analysis methods, covering cell clustering, A/B compartment calling, topologically associating domain (TAD) calling, loop calling, 3D reconstruction, scHi‐C data simulation and differential interaction analysis, are summarized. Finally, it highlights key computational challenges associated with the specific complexities of scHi‐C data and proposes potential solutions.
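The two efficiency measures mentioned in this abstract, contacts per cell and the cis/trans ratio, can be sketched as follows. This is only an illustrative example and not the review's own pipeline: the function name `contact_summary` and the `(chrom1, pos1, chrom2, pos2)` tuple layout are assumptions.

```python
def contact_summary(contacts):
    """Count cis and trans contacts for one cell and return their ratio.

    `contacts` is an iterable of (chrom1, pos1, chrom2, pos2) tuples
    (hypothetical layout). A contact is cis when both ends lie on the
    same chromosome and trans otherwise.
    """
    cis = 0
    trans = 0
    for chrom1, _pos1, chrom2, _pos2 in contacts:
        if chrom1 == chrom2:
            cis += 1
        else:
            trans += 1

    # The ratio is undefined without trans contacts; report None in that case.
    ratio = cis / trans if trans else None
    return {"total": cis + trans, "cis": cis, "trans": trans, "cis_trans_ratio": ratio}


cell_contacts = [
    ("chr1", 10_000, "chr1", 250_000),
    ("chr1", 40_000, "chr1", 90_000),
    ("chr2", 5_000, "chr2", 700_000),
    ("chr1", 30_000, "chr3", 120_000),
]
summary = contact_summary(cell_contacts)
```

In this toy cell, three of the four contacts are cis, giving a cis/trans ratio of 3.0.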
Abstract. Motivation: Recent experiments have provided Hi-C data at resolution as high as 1 kbp. However, 3D structural inference from high-resolution Hi-C datasets is often computationally unfeasible using existing methods. Results: We have developed miniMDS, an approximation of multidimensional scaling (MDS) that partitions a Hi-C dataset, performs high-resolution MDS separately on each partition, and then reassembles the partitions using low-resolution MDS. miniMDS is faster, more accurate, and uses less memory than existing methods for inferring the human genome at high resolution (10 kbp). Availability and implementation: A Python implementation of miniMDS is available on GitHub: https://github.com/seqcode/miniMDS. Supplementary information: Supplementary data are available at Bioinformatics online.
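For readers unfamiliar with the building block that miniMDS approximates, below is a minimal sketch of classical (Torgerson) MDS, which recovers coordinates from a matrix of pairwise distances. It does not implement miniMDS's partitioning or reassembly and is not taken from the miniMDS repository; the function name `classical_mds` and the assumption that distances have already been derived from contact frequencies are ours.

```python
import numpy as np


def classical_mds(distances, n_dims=3):
    """Embed points in `n_dims` dimensions from a symmetric distance matrix.

    Classical MDS: double-center the squared distances to obtain a Gram
    matrix, then use its top eigenvectors scaled by the square roots of
    their eigenvalues as coordinates.
    """
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering

    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_dims]
    # Clip tiny negative eigenvalues caused by floating-point error.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


# Toy example: five points with known 3D positions.
points = np.array(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]]
)
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(dist)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

The recovered coordinates match the originals only up to rotation, reflection, and translation, so the comparison is made on pairwise distances rather than on the coordinates themselves. The cost of the eigendecomposition grows quickly with the number of bins, which is the scaling problem the abstract describes.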
Abstract. High-resolution reconstruction of spatial chromosome organizations from chromatin contact maps is in high demand, but is hindered by extensive pairwise constraints, substantial missing data, and limited resolution and cell-type availability. Here, we present FLAMINGO, a computational method that addresses these challenges by compressing inter-dependent Hi-C interactions to delineate the underlying low-rank structures in 3D space, based on the low-rank matrix completion technique. FLAMINGO successfully generates 5 kb- and 1 kb-resolution spatial conformations for all chromosomes in the human genome across multiple cell-types, the largest resources to date. Compared to other methods using various experimental metrics, FLAMINGO consistently demonstrates superior accuracy in recapitulating observed structures while improving scalability by orders of magnitude. The reconstructed 3D structures efficiently facilitate discoveries of higher-order multi-way interactions, imply biological interpretations of long-range QTLs, reveal geometrical properties of chromatin, and provide high-resolution references to understand structural variabilities. Importantly, FLAMINGO achieves robust predictions against high rates of missing data and significantly boosts 3D structure resolutions. Moreover, FLAMINGO shows vigorous cross cell-type structure predictions that capture cell-type specific spatial configurations via integration of 1D epigenomic signals. FLAMINGO can be widely applied to large-scale chromatin contact maps and expand high-resolution spatial genome conformations for diverse cell-types.
Abstract. Background: To address the limitations of large-scale high quality microscopy image acquisition, PSSR (Point-Scanning Super-Resolution) was introduced to enhance easily acquired low quality microscopy data to a higher quality using deep learning-based methods. However, while PSSR was released as open-source, it was difficult for users to implement into their workflows due to an outdated codebase, limiting its usage by prospective users. Additionally, while the data enhancements provided by PSSR were significant, there was still potential for further improvement. Methods: To overcome this, we introduce PSSR2, a redesigned implementation of PSSR workflows and methods built to put state-of-the-art technology into the hands of the general microscopy and biology research community. PSSR2 enables user-friendly implementation of super-resolution workflows for simultaneous super-resolution and denoising of undersampled microscopy data, especially through its integrated Command Line Interface and Napari plugin. PSSR2 improves and expands upon previously established PSSR algorithms, mainly through improvements in the semi-synthetic data generation (“crappification”) and training processes. Results: In benchmarking PSSR2 on a test dataset of paired high and low resolution electron microscopy images, PSSR2 super-resolves high-resolution images from low-resolution images to a significantly higher accuracy than PSSR. The super-resolved images are also more visually representative of real-world high-resolution images. Discussion: The improvements in PSSR2, in providing higher quality images, should improve the performance of downstream analyses. We note that for accurate super-resolution, PSSR2 models should only be applied to super-resolve data sufficiently similar to training data and should be validated against real-world ground truth data.
