This content will become publicly available on July 31, 2026

Title: Hi-reComb: constructing recombination maps from bulk gamete Hi-C sequencing
Abstract Recombination is central to genetics and to evolution of sexually reproducing organisms. However, obtaining accurate estimates of recombination rates, and of how they vary along chromosomes, continues to be challenging. To advance our ability to estimate recombination rates, we present Hi-reComb, a new method and software for estimation of recombination maps from bulk gamete chromosome conformation capture sequencing (Hi-C). Simulations show that Hi-reComb produces robust, accurate recombination landscapes. With empirical data from sperm of five fish species we show the advantages of this approach, including joint assessment of recombination maps and large structural variants, map comparisons using bootstrap, and workflows with trio phasing vs. Hi-C phasing. With off-the-shelf library construction and a straightforward rapid workflow, our approach will facilitate routine recombination landscape estimation for a broad range of studies and model organisms in genetics and evolutionary biology. Hi-reComb is open-source and freely available at https://github.com/millanek/Hi-reComb.
Award ID(s):
2133740 2243076 2207980
PAR ID:
10646995
Editor(s):
Lohse, K
Publisher / Repository:
Genetics
Journal Name:
GENETICS
ISSN:
1943-2631
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Motivation: The exploration of the 3D organization of DNA within the nucleus in relation to various stages of cellular development has led to experiments generating spatiotemporal Hi-C data. However, there is limited spatiotemporal Hi-C data for many organisms, impeding the study of 3D genome dynamics. To overcome this limitation and advance our understanding of genome organization, it is crucial to develop methods for forecasting Hi-C data at future time points from existing time-series Hi-C data. Results: In this work, we designed a novel framework named HiCForecast, adopting a dynamic voxel flow algorithm to forecast future spatiotemporal Hi-C data. We evaluated how well our method generalizes forecasting data across different species and systems, ensuring performance in homogeneous, heterogeneous, and general contexts. Using both computational and biological evaluation metrics, our results show that HiCForecast outperforms the current state-of-the-art algorithm, emerging as an efficient and powerful tool for forecasting future spatiotemporal Hi-C datasets. Availability and implementation: HiCForecast is publicly available at https://github.com/OluwadareLab/HiCForecast.
  2. Abstract Hi-C characterizes three-dimensional chromatin organization, facilitates haplotype phasing, and enables genome-assembly scaffolding, but encounters difficulties across complex regions. By coupling chromosome conformation capture (3C) with PacBio HiFi long-read sequencing, here we develop a method (CiFi) that enables analysis of genomic interactions across repetitive regions. Starting with as little as 60,000 cells (sub-microgram DNA), the method produces multi-kilobasepair HiFi reads that contain multiple interacting, concatenated segments (~350 bp to 2 kbp). This multiplicity and increase in segment length versus standard short-read-based Hi-C improves read-mapping efficiency and coverage in repetitive regions and enhances haplotype phasing. CiFi pairwise interactions are largely concordant with Hi-C from a human lymphoblastoid cell line, with gains in assigning topologically associating domains across centromeres, segmental duplications, and human disease-associated genomic hotspots. As CiFi requires less input versus established methods, we apply the approach to characterize single small insects: assaying chromatin interactions across the genome from an Anopheles coluzzii mosquito and producing a chromosome-scale scaffolded assembly from a Ceratitis capitata Mediterranean fruit fly. Together, CiFi enables assessment of chromosome-scale interactions of previously recalcitrant low-complexity loci, low-input samples, and small organisms.
  3. Abstract Motivation: High-throughput conformation capture experiments, such as Hi-C, provide genome-wide maps of chromatin interactions, enabling life scientists to investigate the role of the three-dimensional structure of genomes in gene regulation and other essential cellular functions. A fundamental problem in the analysis of Hi-C data is how to compare two contact maps derived from Hi-C experiments. Detecting similarities and differences between contact maps is critical in evaluating the reproducibility of replicate experiments and for identifying differential genomic regions with biological significance. Due to the complexity of chromatin conformations and the presence of technology-driven and sequence-specific biases, the comparative analysis of Hi-C data is analytically and computationally challenging. Results: We present a novel method called Selfish for the comparative analysis of Hi-C data that takes advantage of the structural self-similarity in contact maps. We define a novel self-similarity measure to design algorithms for (i) measuring reproducibility for Hi-C replicate experiments and (ii) finding differential chromatin interactions between two contact maps. Extensive experimental results on simulated and real data show that Selfish is more accurate and robust than state-of-the-art methods. Availability and implementation: https://github.com/ucrbioinfo/Selfish
  4. Abstract The introduction of high-throughput chromosome conformation capture (Hi-C) into metagenomics enables reconstructing high-quality metagenome-assembled genomes (MAGs) from microbial communities. Despite recent advances in recovering eukaryotic, bacterial, and archaeal genomes using Hi-C contact maps, few Hi-C-based methods are designed to retrieve viral genomes. Here we introduce ViralCC, a publicly available tool to recover complete viral genomes and detect virus-host pairs using Hi-C data. Compared to other Hi-C-based methods, ViralCC leverages the virus-host proximity structure as a complementary information source for the Hi-C interactions. Using mock and real metagenomic Hi-C datasets from several different microbial ecosystems, including the human gut, cow fecal, and wastewater, we demonstrate that ViralCC outperforms existing Hi-C-based binning methods as well as state-of-the-art tools specifically dedicated to metagenomic viral binning. ViralCC can also reveal the taxonomic structure of viruses and virus-host pairs in microbial communities. When applied to a real wastewater metagenomic Hi-C dataset, ViralCC constructs a phage-host network, which is further validated using CRISPR spacer analyses. ViralCC is an open-source pipeline available at https://github.com/dyxstat/ViralCC.
  5. Abstract Background: Inhomogeneous patterns of chromatin-chromatin contacts within 10–100-kb-sized regions of the genome are a generic feature of chromatin spatial organization. These features, termed topologically associating domains (TADs), have led to the loop extrusion factor (LEF) model. Currently, our ability to model TADs relies on the observation that in vertebrates TAD boundaries are correlated with DNA sequences that bind CTCF, which therefore is inferred to block loop extrusion. However, although TADs feature prominently in their Hi-C maps, non-vertebrate eukaryotes either do not express CTCF or show few TAD boundaries that correlate with CTCF sites. In all of these organisms, the counterparts of CTCF remain unknown, frustrating comparisons between Hi-C data and simulations. Results: To extend the LEF model across the tree of life, here, we propose the conserved-current loop extrusion (CCLE) model that interprets loop-extruding cohesin as a nearly conserved probability current. From cohesin ChIP-seq data alone, we derive a position-dependent loop extrusion rate, allowing for a modified paradigm for loop extrusion, that goes beyond solely localized barriers to also include loop extrusion rates that vary continuously. We show that CCLE accurately predicts the TAD-scale Hi-C maps of interphase Schizosaccharomyces pombe, as well as those of meiotic and mitotic Saccharomyces cerevisiae, demonstrating its utility in organisms lacking CTCF. Conclusions: The success of CCLE in yeasts suggests that loop extrusion by cohesin is indeed the primary mechanism underlying TADs in these systems. CCLE allows us to obtain loop extrusion parameters such as the LEF density and processivity, which compare well to independent estimates.