Title: The Nucleome Data Bank: web-based resources to simulate and analyze the three-dimensional genome
Abstract We introduce the Nucleome Data Bank (NDB), a web-based platform to simulate and analyze the three-dimensional (3D) organization of genomes. The NDB enables physics-based simulation of chromosomal structural dynamics through the MEGABASE + MiChroM computational pipeline. The input of the pipeline consists of epigenetic information sourced from the ENCODE database; the output consists of the trajectories of chromosomal motions that accurately predict Hi-C and fluorescence in situ hybridization data, as well as multiple observations of chromosomal dynamics in vivo. As an intermediate step, users can also generate chromosomal sub-compartment annotations directly from the same epigenetic input, without the use of any DNA–DNA proximity ligation data. Additionally, the NDB freely hosts both experimental and computational structural genomics data. Besides being able to perform their own genome simulations and download the hosted data, users can also analyze and visualize the same data through custom-designed web-based tools. In particular, the one-dimensional genetic and epigenetic data can be overlaid onto accurate 3D structures of chromosomes, to study the spatial distribution of genetic and epigenetic features. The NDB aims to be a shared resource for biologists, biophysicists and all genome scientists. The NDB is available at https://ndb.rice.edu.
Award ID(s):
2019745
PAR ID:
10233282
Journal Name:
Nucleic Acids Research
Volume:
49
Issue:
D1
ISSN:
0305-1048
Page Range / eLocation ID:
D172 to D182
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Numerous intra- and inter-chromosomal contacts have been mapped in eukaryotic genomes, but it remains challenging to link these 3D structures to their regulatory functions. To establish the causal relationships between chromosome conformation and genome functions, we develop a method, Chemically Induced Chromosomal Interaction (CICI), to selectively perturb the chromosome conformation at targeted loci. Using this method, long-distance chromosomal interactions can be induced dynamically between intra- or inter-chromosomal loci pairs, including the ones with very low Hi-C contact frequencies. Measurement of CICI formation time allows us to probe chromosome encounter dynamics between different loci pairs across the cell cycle. We also conduct two functional tests of CICI. We perturb the chromosome conformation near a DNA double-strand break and observe altered donor preference in homologous recombination; we force interactions between early and late-firing DNA replication origins and find no significant changes in replication timing. These results suggest that chromosome conformation plays a deterministic role in homology-directed DNA repair, but not in the establishment of replication timing. Overall, our study demonstrates that CICI is a powerful tool to study chromosome dynamics and 3D genome function.
  2. Abstract Mobile genetic elements (MGEs) are transient genetic material that can move either within a single organism's genome or between individuals or species. While historically considered “junk” DNA (i.e., deleterious or at best neutral), more recent studies reveal the potential adaptive advantages MGEs provide in lineages across the tree of life. Ciliates, a group of single‐celled microbial eukaryotes characterized by nuclear dimorphism, exemplify how epigenetic influences from MGEs shape genome architecture and patterns of molecular evolution. Ciliate nuclear dimorphism may have evolved as a response to transposon invasion and ciliates have since co‐opted transposons to carry out programmed DNA deletion. Another example of the effect of MGEs is in providing mechanisms for lateral gene transfer (LGT) from bacteria, which introduces genetic diversity and, in several cases, may drive ecological specialization in ciliates. As a third example, the integration of viral DNA, likely through transduction, provides new genetic materials and can change the way host cells defend themselves against other viral pathogens. We argue that the acquisition of MGEs through non‐Mendelian patterns of inheritance, coupled with their effects on ciliate genome architecture and persistence throughout evolutionary history, exemplifies how the transmission of mobile elements should be considered a mechanism of transgenerational epigenetic inheritance.
  3. Nielsen, Rasmus (Ed.)
    Abstract Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome data sets from natural populations of this species have been published over the last years. A major challenge is the integration of disparate data sets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 previously published and unpublished population samples from over 100 locations in >20 countries on four continents. Several of these locations have been sampled at different seasons across multiple years. This data set, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental metadata. A web-based genome browser and web portal provide easy access to the SNP data set. We further provide guidelines on how to use Pool-Seq data for model-based demographic inference. Our aim is to provide this scalable platform as a community resource which can be easily extended via future efforts for an even more extensive cosmopolitan data set. Our resource will enable population geneticists to analyze spatiotemporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail. 
  4. Abstract The HIV reservoir consists of infected cells in which the HIV-1 genome persists as provirus despite effective antiretroviral therapy (ART). Studies exploring HIV cure therapies often measure intact proviral DNA levels, time to rebound after ART interruption, or ex vivo stimulation assays of latently infected cells. This study utilizes barcoded HIV to analyze the reservoir in humanized mice. Using bulk PCR and deep sequencing methodologies, we retrieve 890 viral RNA barcodes and 504 proviral barcodes linked to 15,305 integration sites at the single RNA or DNA molecule in vivo. We track viral genetic diversity throughout early infection, ART, and rebound. The proviral reservoir retains genetic diversity despite cellular clonal proliferation and viral seeding by rebounding virus. Non-proliferated cell clones are likely the result of elimination of proviruses associated with transcriptional activation and viremia. Elimination of proviruses associated with viremia is less prominent among proliferated cell clones. Proliferated, but not massively expanded, cell clones contribute to proviral expansion and viremia, suggesting they fuel viral persistence. This approach enables comprehensive assessment of viral levels, lineages, integration sites, clonal proliferation and proviral epigenetic patterns in vivo. These findings highlight complex reservoir dynamics and the role of proliferated cell clones in viral persistence. 
  5. Abstract Increased ecological disturbances, species invasions, and climate change are creating severe conservation problems for several plant species that are widespread and foundational. Understanding the genetic diversity of these species and how it relates to adaptation to these stressors is necessary for guiding conservation and restoration efforts. This need is particularly acute for big sagebrush (Artemisia tridentata; Asteraceae), which was once the dominant shrub over 1,000,000 km² in western North America but has since retracted by half and thus has become the target of one of the largest restoration seeding efforts globally. Here, we present the first reference-quality genome assembly for an ecologically important subspecies of big sagebrush (A. tridentata subsp. tridentata) based on short and long reads, as well as chromatin proximity ligation data analyzed using the HiRise pipeline. The final 4.2 Gb assembly consists of 5,492 scaffolds, with nine pseudo-chromosomal scaffolds (nine scaffolds comprising at least 90% of the assembled genome; n = 9). The assembly contains an estimated 43,377 genes based on ab initio gene discovery and transcriptional data analyzed using the MAKER pipeline, with 91.37% of BUSCOs being completely assembled. The final assembly was highly repetitive, with repeat elements comprising 77.99% of the genome, making the Artemisia tridentata subsp. tridentata genome one of the most highly repetitive plant genomes to be sequenced and assembled. This genome assembly advances studies on plant adaptation to drought and heat stress and provides a valuable tool for future genomic research.