Title: Interpretable online network dictionary learning for inferring long-range chromatin interactions
Dictionary learning (DL), implemented via matrix factorization (MF), is commonly used in computational biology to tackle ubiquitous clustering problems. The method is favored due to its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability in terms of real biological data. Additionally, they are not optimized for graph-structured data and hence often fail to handle them in a scalable manner. In order to address these limitations, we propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL). Unlike classical DL algorithms, online cvxNDL is implemented via MF and designed to handle extremely large datasets by virtue of its online nature. Importantly, it enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to data with a network structure by incorporating specialized subnetwork sampling techniques. To demonstrate the utility of our approach, we apply cvxNDL on 3D-genome RNAPII ChIA-Drop data with the goal of identifying important long-range interaction patterns (long-range dictionary elements). ChIA-Drop probes higher-order interactions, and produces data in the form of hypergraphs whose nodes represent genomic fragments. The hyperedges represent observed physical contacts. Our hypergraph model analysis has the objective of creating an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Through the use of dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions. To accomplish the task at hand, we focus on RNAPII-enriched ChIA-Drop data from Drosophila melanogaster S2 cell lines. Our results offer two key insights.
First, we demonstrate that online cvxNDL retains the accuracy of classical DL (MF) methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology (GO) enrichment analysis and perform multiple RNA coexpression studies.
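The interpretability property described in the abstract, dictionary elements expressed as convex combinations of real measurements, can be illustrated with a small sketch. This is not the authors' online cvxNDL implementation: it only shows the convexity constraint on its own, using a standard sort-based Euclidean projection onto the probability simplex. The function names `project_to_simplex` and `convex_atom`, the toy `measurements`, and the `raw_weights` are all hypothetical and chosen purely for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex
    (nonnegative entries summing to 1), via the sort-based method."""
    u = sorted(v, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - 1.0) / j
        if uj - t > 0:
            # Keep the threshold from the largest index that passes the test.
            theta = t
    return [max(x - theta, 0.0) for x in v]


def convex_atom(samples, weights):
    """Form one dictionary element as a weighted sum of real samples.
    With weights on the simplex this is a convex combination."""
    dim = len(samples[0])
    return [sum(w * s[k] for w, s in zip(weights, samples)) for k in range(dim)]


# Toy "measurements" (hypothetical data, one row per sample).
measurements = [
    [2.0, 0.0, 1.0],
    [0.0, 4.0, 1.0],
    [1.0, 1.0, 3.0],
]

# Unconstrained weights, e.g. after a gradient step (hypothetical values).
raw_weights = [0.5, 1.2, -0.3]

weights = project_to_simplex(raw_weights)
atom = convex_atom(measurements, weights)

print("weights on simplex:", weights)
print("dictionary element:", atom)
```

Because the weights are nonnegative and sum to one, each coordinate of the resulting element lies between the smallest and largest value of that coordinate among the samples, which is what allows a dictionary element to be read as a mixture of observed data points.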
Award ID(s):
1956384 2023239 2206296
PAR ID:
10525188
Author(s) / Creator(s):
; ; ; ; ; ;
Editor(s):
Schlick, Tamar
Publisher / Repository:
PLOS Computational Biology
Date Published:
Journal Name:
PLOS Computational Biology
Volume:
20
Issue:
5
ISSN:
1553-7358
Page Range / eLocation ID:
e1012095
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. van Steensel, Bas (Ed.)
    Extra-chromosomal selfish DNA elements can evade the risk of being lost at every generation by behaving as chromosome appendages, thereby ensuring high fidelity segregation and stable persistence in host cell populations. The yeast 2-micron plasmid and episomes of the mammalian gammaherpes and papilloma viruses that tether to chromosomes and segregate by hitchhiking on them exemplify this strategy. We document for the first time the utilization of a SWI/SNF-type chromatin remodeling complex as a conduit for chromosome association by a selfish element. One principal mechanism for chromosome tethering by the 2-micron plasmid is the bridging interaction of the plasmid partitioning proteins (Rep1 and Rep2) with the yeast RSC2 complex and the plasmid partitioning locus STB. We substantiate this model by multiple lines of evidence derived from genomics, cell biology and interaction analyses. We describe a Rep-STB bypass system in which a plasmid engineered to non-covalently associate with the RSC complex mimics segregation by chromosome hitchhiking. Given the ubiquitous prevalence of SWI/SNF family chromatin remodeling complexes among eukaryotes, it is likely that the 2-micron plasmid paradigm or analogous ones will be encountered among other eukaryotic selfish elements.
  2. Frogs are an ecologically diverse and phylogenetically ancient group of anuran amphibians that include important vertebrate cell and developmental model systems, notably the genus Xenopus. Here we report a high-quality reference genome sequence for the western clawed frog, Xenopus tropicalis, along with draft chromosome-scale sequences of three distantly related emerging model frog species, Eleutherodactylus coqui, Engystomops pustulosus, and Hymenochirus boettgeri. Frog chromosomes have remained remarkably stable since the Mesozoic Era, with limited Robertsonian (i.e., arm-preserving) translocations and end-to-end fusions found among the smaller chromosomes. Conservation of synteny includes conservation of centromere locations, marked by centromeric tandem repeats associated with Cenp-a binding surrounded by pericentromeric LINE/L1 elements. This work explores the structure of chromosomes across frogs, using a dense meiotic linkage map for X. tropicalis and chromatin conformation capture (Hi-C) data for all species. Abundant satellite repeats occupy the unusually long (~20 megabase) terminal regions of each chromosome that coincide with high rates of recombination. Both embryonic and differentiated cells show reproducible associations of centromeric chromatin and of telomeres, reflecting a Rabl-like configuration. Our comparative analyses reveal 13 conserved ancestral anuran chromosomes from which contemporary frog genomes were constructed.
  3. Kelleher, Erin S (Ed.)
    Centromeres reside in rapidly evolving, repeat-rich genomic regions, despite their essential function in chromosome segregation. Across organisms, centromeres are rich in selfish genetic elements such as transposable elements and satellite DNAs that can bias their transmission through meiosis. However, these elements still need to cooperate at some level and contribute to, or avoid interfering with, centromere function. To gain insight into the balance between conflict and cooperation at centromeric DNA, we take advantage of the close evolutionary relationships within the Drosophila simulans clade (D. simulans, D. sechellia, and D. mauritiana) and their relative, D. melanogaster. Using chromatin profiling combined with high-resolution fluorescence in situ hybridization on stretched chromatin fibers, we characterize all centromeres across these species. We discovered dramatic centromere reorganization involving recurrent shifts between retroelements and satellite DNAs over short evolutionary timescales. We also reveal the recent origin (<240 Kya) of telocentric chromosomes in D. sechellia, where the X and fourth centromeres now sit on telomere-specific retroelements. Finally, the Y chromosome centromeres, which are the only chromosomes that do not experience female meiosis, do not show dynamic cycling between satDNA and TEs. The patterns of rapid centromere turnover in these species are consistent with genetic conflicts in the female germline and have implications for centromeric DNA function and karyotype evolution. Regardless of the evolutionary forces driving this turnover, the rapid reorganization of centromeric sequences over short evolutionary timescales highlights their potential as hotspots for evolutionary innovation.
  4. Evolutionarily conserved DEK domain-containing proteins have been implicated in multiple chromatin-related processes, mRNA splicing and transcriptional regulation in eukaryotes. Here, we show that two DEK proteins, DEK3 and DEK4, control the floral transition in Arabidopsis. DEK3 and DEK4 directly associate with chromatin of related flowering repressors, FLOWERING LOCUS C (FLC), and its two homologs, MADS AFFECTING FLOWERING4 (MAF4) and MAF5, to promote their expression. The binding of DEK3 and DEK4 to a histone octamer in vivo affects histone modifications at FLC, MAF4 and MAF5 loci. In addition, DEK3 and DEK4 interact with RNA polymerase II and promote the association of RNA polymerase II with FLC, MAF4 and MAF5 chromatin to promote their expression. Our results show that DEK3 and DEK4 directly interact with chromatin to facilitate the transcription of key flowering repressors and thus prevent precocious flowering in Arabidopsis.
  5. RNA polymerase elongation along the gene body is tightly regulated to ensure proper transcription and alternative splicing events. Understanding the mechanism and factors critical in regulating the rate of RNA polymerase II elongation and processivity is clearly important. Recently we showed that PARP1, a well-known DNA repair protein, when bound to chromatin, regulates RNA polymerase II elongation. However, the mechanism by which it does so is not known. In the current study, we aimed to tease out how PARP1 regulates RNAPII elongation. We show, both in vivo and in vitro, that PARP1 binds directly to the Integrator subunit 3 (IntS3), a member of the elongation Integrator complex. The association between the two proteins is mediated via the C-terminal domain of PARP1 to the C-terminal domain of IntS3. Interestingly, the occupancy of IntS3 along two PARP1 target genes mimicked that of PARP1, suggesting a role in its recruitment/assembly of elongation factors. Indeed, the knockdown of PARP1 resulted in differential chromatin association and gene occupancy of IntS3 and other key elongation factors. Most of these PARP1-mediated effects were due to the physical presence of PARP1 rather than its PARylation activity. These studies argue that PARP1 controls the progressive RNAPII elongation complexes. In summary, we present a platform to begin to decipher PARP1's role in recruiting/scaffolding elongation factors along the gene body regions during RNA polymerase II elongation and gene regulation.