Title: Using recurrent neural networks to detect supernumerary chromosomes in fungal strains causing blast diseases
Abstract The genomes of the fungus Magnaporthe oryzae that causes blast diseases on diverse grass species, including major crops, have indispensable core-chromosomes and may contain supernumerary chromosomes, also known as mini-chromosomes. These mini-chromosomes are speculated to provide effector gene mobility, and may transfer between strains. To understand the biology of mini-chromosomes, it is valuable to be able to detect whether a M. oryzae strain possesses a mini-chromosome. Here, we applied recurrent neural network models for classifying DNA sequences as arising from core- or mini-chromosomes. The models were trained with sequences from available core- and mini-chromosome assemblies, and then used to predict the presence of mini-chromosomes in a global collection of M. oryzae isolates using short-read DNA sequences. The model predicted that mini-chromosomes were prevalent in M. oryzae isolates. Interestingly, at least one mini-chromosome was present in all recent wheat isolates, but no mini-chromosomes were found in early isolates collected before 1991, indicating a preferential selection for strains carrying mini-chromosomes in recent years. The model was also used to identify assembled contigs derived from mini-chromosomes. In summary, our study has developed a reliable method for categorizing DNA sequences and showcases an application of recurrent neural networks in predictive genomics.
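The general idea of scoring a DNA sequence with a recurrent network can be sketched as follows. This is only an illustration, not the authors' model: the abstract does not describe the architecture, so the single tanh recurrent layer, the hidden size, the random (untrained) weights, the 0.5 threshold, and the helper names `one_hot`, `TinyRNN`, and `mini_fraction` are all assumptions made for this sketch.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as 4-element vectors; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


class TinyRNN:
    """Hypothetical single-layer recurrent classifier with random, untrained weights."""

    def __init__(self, hidden_size=8, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.hidden_size = hidden_size
        self.w_in = init(hidden_size, 4)             # input-to-hidden weights
        self.w_rec = init(hidden_size, hidden_size)  # hidden-to-hidden weights
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]

    def predict(self, seq):
        """Return a score in (0, 1); here read as 'probability of mini-chromosome origin'."""
        h = [0.0] * self.hidden_size
        for x in one_hot(seq):
            # The recurrent step: the new state depends on the current base and the previous state.
            h = [
                math.tanh(
                    sum(w * xi for w, xi in zip(self.w_in[j], x))
                    + sum(w * hi for w, hi in zip(self.w_rec[j], h))
                )
                for j in range(self.hidden_size)
            ]
        logit = sum(w * hi for w, hi in zip(self.w_out, h))
        return 1.0 / (1.0 + math.exp(-logit))


def mini_fraction(model, reads, threshold=0.5):
    """Fraction of reads scored above the threshold (an assumed way to summarise a strain)."""
    if not reads:
        return 0.0
    return sum(model.predict(r) > threshold for r in reads) / len(reads)


if __name__ == "__main__":
    model = TinyRNN()
    reads = ["ACGTACGTAC", "TTGACCNNGA", "GGGCCCATAT"]
    print(mini_fraction(model, reads))
```

In practice the weights would be learned from labelled core- and mini-chromosome sequences, as the abstract describes; the per-read aggregation shown here is just one plausible way to turn read-level scores into a strain-level call.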
Award ID(s):
2311738 2011500
PAR ID:
10535801
Publisher / Repository:
Oxford
Journal Name:
NAR Genomics and Bioinformatics
Volume:
6
Issue:
3
ISSN:
2631-9268
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary The fungal pathogen, Magnaporthe oryzae Triticum pathotype, causing wheat blast disease was first identified in South America and recently spread across continents to South Asia and Africa. Here, we studied the genetic relationship among isolates found on the three continents. Magnaporthe oryzae strains closely related to a South American field isolate B71 were found to have caused the wheat blast outbreaks in South Asia and Africa. Genomic variation among isolates from the three continents was examined using an improved B71 reference genome and whole‐genome sequences. We found strong evidence to support that the outbreaks in Bangladesh and Zambia were caused by the introductions of genetically separated isolates, although they were all close to B71 and, therefore, collectively referred to as the B71 branch. In addition, B71 branch strains carried at least one supernumerary mini‐chromosome. Genome assembly of a Zambian strain revealed that its mini‐chromosome was similar to the B71 mini‐chromosome but with a high level of structural variation. Our findings show that while core genomes of the multiple introductions are highly similar, the mini‐chromosomes have undergone marked diversification. The maintenance of the mini‐chromosome and rapid genomic changes suggest the mini‐chromosomes may serve important virulence or niche adaptation roles under diverse environmental conditions.
  2. Microbes (bacteria, yeasts, molds), in addition to plants and animals, were domesticated for their roles in food preservation, nutrition and flavor. Aspergillus oryzae is a domesticated filamentous fungal species traditionally used during fermentation of Asian foods and beverages, such as sake, soy sauce, and miso. To date, little is known about the extent of genome and phenotypic variation of A. oryzae isolates from different clades. Here, we used long-read Oxford Nanopore and short-read Illumina sequencing to produce a highly accurate and contiguous genome assembly of A. oryzae 14160, an industrial strain from China. To understand the relationship of this isolate, we performed phylogenetic analysis with 90 A. oryzae isolates and 1 isolate of the A. oryzae progenitor, Aspergillus flavus. This analysis showed that A. oryzae 14160 is a member of clade A, in comparison to the RIB 40 type strain, which is a member of clade F. To explore genome variation between isolates from distinct A. oryzae clades, we compared the A. oryzae 14160 genome with the complete RIB 40 genome. Our results provide evidence of independent evolution of the alpha-amylase gene duplication, which is one of the major adaptive mutations resulting from domestication. Synteny analysis revealed that both genomes have three copies of the alpha-amylase gene, but only one copy on chromosome 2 was conserved. While the RIB 40 genome had additional copies of the alpha-amylase gene on chromosomes III and V, 14160 had a second copy on chromosome II and a third copy on chromosome VI. Additionally, we identified hundreds of lineage-specific genes, and putative high-impact mutations in genes involved in secondary metabolism, including several of the core biosynthetic genes. Finally, to examine the functional effects of genome variation between strains, we measured amylase activity, proteolytic activity, and growth rate on several different substrates. RIB 40 produced significantly higher levels of amylase compared to 14160 when grown on rice and starch. Accordingly, RIB 40 grew faster on rice, while 14160 grew faster on soy. Taken together, our analyses reveal substantial genome and phenotypic variation within A. oryzae.
  3. Abstract The fungus Magnaporthe oryzae causes devastating diseases of crops, including rice and wheat, and in various grasses. Strains from ryegrasses have highly unstable chromosome ends that undergo frequent rearrangements, and this has been associated with the presence of retrotransposons (Magnaporthe oryzae Telomeric Retrotransposons—MoTeRs) inserted in the telomeres. The objective of the present study was to determine the mechanisms by which MoTeRs promote telomere instability. Targeted cloning, mapping, and sequencing of parental and novel telomeric restriction fragments (TRFs), along with MinION sequencing of genomic DNA allowed us to document the precise molecular alterations underlying 109 newly-formed TRFs. These included truncations of subterminal rDNA sequences; acquisition of MoTeR insertions by ‘plain’ telomeres; insertion of the MAGGY retrotransposons into MoTeR arrays; MoTeR-independent expansion and contraction of subtelomeric tandem repeats; and a variety of rearrangements initiated through breaks in interstitial telomere tracts that are generated during MoTeR integration. Overall, we estimate that alterations occurred in approximately sixty percent of chromosomes (one in three telomeres) analyzed. Most importantly, we describe an entirely new mechanism by which transposons can promote genomic alterations at exceptionally high frequencies, and in a manner that can promote genome evolution while minimizing collateral damage to overall chromosome architecture and function. 
  4. Komeili, Arash (Ed.)
    ABSTRACT Multipartite bacterial genome organization can confer advantages, including coordinated gene regulation and faster genome replication, but is challenging to maintain. Agrobacterium tumefaciens lineages often contain a circular chromosome (Ch1), a linear chromosome (Ch2), and multiple plasmids. We previously observed that in some stocks of the C58 lab model, Ch1 and Ch2 were fused into a linear dicentric chromosome. Here we analyzed Agrobacterium natural isolates from the French Collection for Plant-Associated Bacteria and identified two strains distinct from C58 with fused chromosomes. Chromosome conformation capture identified integration junctions that were different from the C58 fusion strain. Genome-wide DNA replication profiling showed that both replication origins remained active. Transposon sequencing revealed that partitioning systems of both chromosome centromeres were essential. Importantly, the site-specific recombinase XerCD is required for the survival of the strains containing the fusion chromosome. Our findings show that replicon fusion occurs in natural environments and that balanced replication arm sizes and proper resolution systems enable the survival of such strains. IMPORTANCE Most bacterial genomes are monopartite with a single, circular chromosome. However, some species, like Agrobacterium tumefaciens, carry multiple chromosomes. Emergence of multipartite genomes is often related to adaptation to specific niches, including pathogenesis or symbiosis. Multipartite genomes confer certain advantages; however, maintaining this complex structure can present significant challenges. We previously reported a laboratory-propagated lineage of A. tumefaciens strain C58 in which the circular and linear chromosomes fused to form a single dicentric chromosome. Here we discovered two geographically separated environmental isolates of A. tumefaciens containing fused chromosomes with integration junctions different from the C58 fusion chromosome, revealing the constraints and diversification of this process. We found that balanced replication arm sizes and the repurposing of multimer resolution systems enable the survival and stable maintenance of dicentric chromosomes. These findings reveal how multipartite genomes function across different bacterial species and the role of genomic plasticity in bacterial genetic diversification.
  5. Telomeres form the ends of linear chromosomes and usually comprise protein complexes that bind to simple repeated sequence motifs that are added to the 3′ ends of DNA by the telomerase reverse transcriptase (TERT). One of the primary functions attributed to telomeres is to solve the “end-replication problem” which, if left unaddressed, would cause gradual, inexorable attrition of sequences from the chromosome ends and, eventually, loss of viability. Telomere-binding proteins also protect the chromosome from 5′ to 3′ exonuclease action, and disguise the chromosome ends from the double-strand break repair machinery whose illegitimate action potentially generates catastrophic chromosome aberrations. Telomeres are of special interest in the blast fungus, Pyricularia, because the adjacent regions are enriched in genes controlling interactions with host plants, and the chromosome ends show enhanced polymorphism and genetic instability. Previously, we showed that telomere instability in some P. oryzae strains is caused by novel retrotransposons (MoTeRs) that insert in telomere repeats, generating interstitial telomere sequences that drive frequent, break-induced rearrangements. Here, we sought to gain further insight on telomeric involvement in shaping Pyricularia genome architecture by characterizing sequence polymorphisms at chromosome ends, and surrounding internalized MoTeR loci (relics) and interstitial telomere repeats. This provided evidence that telomere dynamics have played historical, and likely ongoing, roles in shaping the Pyricularia genome. We further demonstrate that even telomeres lacking MoTeR insertions are poorly preserved, such that the telomere-adjacent sequences exhibit frequent presence/absence polymorphism, as well as exchanges with the genome interior. Using TERT knockout experiments, we characterized chromosomal responses to failed telomere maintenance which suggested that much of the MoTeR relic-/interstitial telomere-associated polymorphism could be driven by compromised telomere function. Finally, we describe three possible examples of a phenomenon known as “Adaptive Telomere Failure,” where spontaneous losses of telomere maintenance drive rapid accumulation of sequence polymorphism with possible adaptive advantages. Together, our data suggest that telomere maintenance is frequently compromised in Pyricularia but the chromosome alterations resulting from telomere failure are not as catastrophic as prior research would predict, and may, in fact, be potent drivers of adaptive polymorphism.