Title: Modeling chromatin state from sequence across angiosperms using recurrent convolutional neural networks
Core Ideas: Cross‐species models of chromatin state from sequence are comparable or superior to within‐species models. Model performance is highest on accessible regions open in many tissues. Transcription factor motifs can be ranked by importance to each species and chromatin state.
Award ID(s):
1934384
PAR ID:
10464258
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
The Plant Genome
Volume:
15
Issue:
3
ISSN:
1940-3372
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Wittkopp, Patricia (Ed.)
    Abstract In Drosophila melanogaster and D. simulans head tissue, 60% of orthologous genes show evidence of sex-biased expression in at least one species. Of these, ∼39% (2,192) are conserved in direction. We hypothesize enrichment of open chromatin in the sex where we see expression bias and closed chromatin in the opposite sex. Male-biased orthologs are significantly enriched for H3K4me3 marks in males of both species (∼89% of male-biased orthologs vs. ∼76% of unbiased orthologs). Similarly, female-biased orthologs are significantly enriched for H3K4me3 marks in females of both species (∼90% of female-biased orthologs vs. ∼73% of unbiased orthologs). The sex-bias ratio in female-biased orthologs was similar in magnitude between the two species, regardless of the closed chromatin (H3K27me2me3) marks in males. However, in male-biased orthologs, the presence of H3K27me2me3 in both species significantly reduced the correlation between D. melanogaster sex-bias ratio and the D. simulans sex-bias ratio. Male-biased orthologs are enriched for evidence of positive selection in the D. melanogaster group. There are more male-biased genes than female-biased genes in both species. For orthologs with gains/losses of sex-bias between the two species, there is an excess of male-bias compared to female-bias, but there is no consistent pattern in the relationship between H3K4me3 or H3K27me2me3 chromatin marks and expression. These data suggest chromatin state is a component of the maintenance of sex-biased expression and divergence of sex-bias between species is reflected in the complexity of the chromatin status. 
  2. Abstract We can now analyze 3D physical interactions of chromatin regions with chromatin conformation capture technologies, in addition to the 1D chromatin state annotations, but methods to integrate this information are lacking. We propose a method to integrate the chromatin state of interacting regions into a vector representation through the contact-weighted sum of chromatin states. Unsupervised clustering on integrated chromatin states and Micro-C contacts reveals common patterns of chromatin interaction signatures. This provides an integrated view of the complex dynamics of concurrent change occurring in chromatin state and in chromatin interaction, adding another layer of annotation beyond chromatin state or Hi-C contact separately. 
  3. Abstract A large-scale application of the “stacked modeling” approach for chromatin state discovery previously provided a single “universal” chromatin state annotation of the human genome based jointly on data from many cell and tissue types. Here, we produce an analogous chromatin state annotation for mouse based on 901 datasets assaying 14 chromatin marks in 26 cell or tissue types. To characterize each chromatin state, we relate the states to external annotations and compare them to analogously defined human states. We expect the universal chromatin state annotation for mouse to be a useful resource for studying this key model organism’s genome.
  4. Abstract Motivation: Genome-wide maps of epigenetic modifications are powerful resources for non-coding genome annotation. Maps of multiple epigenetics marks have been integrated into cell or tissue type-specific chromatin state annotations for many cell or tissue types. With the increasing availability of multiple chromatin state maps for biologically similar samples, there is a need for methods that can effectively summarize the information about chromatin state annotations within groups of samples and identify differences across groups of samples at a high resolution. Results: We developed CSREP, which takes as input chromatin state annotations for a group of samples. CSREP then probabilistically estimates the state at each genomic position and derives a representative chromatin state map for the group. CSREP uses an ensemble of multi-class logistic regression classifiers that predict the chromatin state assignment of each sample given the state maps from all other samples. The difference in CSREP’s probability assignments for the two groups can be used to identify genomic locations with differential chromatin state assignments. Using groups of chromatin state maps of a diverse set of cell and tissue types, we demonstrate the advantages of using CSREP to summarize chromatin state maps and identify biologically relevant differences between groups at a high resolution. Availability and implementation: The CSREP source code and generated data are available at http://github.com/ernstlab/csrep. Supplementary information: Supplementary data are available at Bioinformatics online.
  5. ABSTRACT The physical organization of DNA within the nucleus is fundamental to a wide range of biological processes. The experimental investigation of the structure of genomic DNA remains challenging due to its large size and hierarchical arrangement. These challenges present considerable opportunities for combined experimental and modeling approaches. Physics‐based computational models, in particular, have emerged as essential tools for probing chromatin structure and dynamics across a wide range of length scales. Such models must necessarily be capable of bridging scales, and each scale presents its own subtleties and intricacies. This review discusses recent methodological advances in genomic structural modeling, emphasizing the need for multiscale integration to capture the hierarchical organization and molecular mechanisms that underlie chromatin structure and function. We present an analysis of state‐of‐the‐art methods, as well as a perspective on challenges and future opportunities across length scales ranging from bare DNA to nucleosomes and chromatin fibers, up to TAD and chromosome‐scale models. We emphasize models that connect genome organization to gene expression, models that leverage emerging machine learning capabilities, and models that develop multiscale approaches. We examine gaps in experimental data that computational models are poised to address and propose directions for future research that bridge theory and experiment in DNA structural biology. 