Search for: All records

Award ID contains: 1740874


  1. Abstract Centromeres in most multicellular eukaryotes are composed of long arrays of repetitive DNA sequences. Interestingly, several transposable elements, including the well-known long terminal repeat centromeric retrotransposon of maize (CRM), were found to be enriched in functional centromeres marked by the centromeric histone H3 (CENH3). Here, we report a centromeric long interspersed nuclear element (LINE), Celine, in Populus species. Celine has preferentially colonized the CENH3-associated chromatin of every poplar chromosome, with 84% of the Celine elements localized in the CENH3-binding domains. In contrast, only 51% of the CRM elements were bound to CENH3 domains in Populus trichocarpa. These results suggest that Celine and CRM elements employ different centromere targeting mechanisms. Nevertheless, the high target specificity seems to be detrimental to further amplification of the Celine elements, leading to a shorter life span and patchy distribution among plant species compared with the CRM elements. Using a phylogenetically guided approach, we were able to identify Celine-like LINE elements in tea plant (Camellia sinensis) and green ash tree (Fraxinus pennsylvanica). The centromeric localization of these Celine-like LINEs was confirmed in both species. We demonstrate that the centromere targeting property of Celine-like LINEs is of primitive origin and has been conserved among distantly related plant species.
  2. Abstract Subgenome dominance has been reported in diverse allopolyploid species, where genes from one subgenome are preferentially retained and are more highly expressed than those from other subgenome(s). However, the molecular mechanisms responsible for subgenome dominance remain poorly understood. Here, we develop a genome-wide map of accessible chromatin regions (ACRs) in cultivated strawberry (2n = 8x = 56, with A, B, C, D subgenomes). Each ACR is identified as an MNase hypersensitive site (MHS). We discover that the dominant subgenome A contains a greater number of total MHSs and MHSs per gene than the submissive B/C/D subgenomes. Subgenome A suffers fewer losses of MHS-related DNA sequences and fewer MHS fragmentations caused by insertions of transposable elements. We also discover that genes and MHSs related to stress response have been preferentially retained in subgenome A. We conclude that preservation of genes and their cognate ACRs, especially those related to stress responses, plays a major role in the establishment of subgenome dominance in octoploid strawberry.
  3. Abstract Short interspersed nuclear elements (SINEs) are a widespread type of small transposable element (TE). With increasing evidence for their impact on gene function and genome evolution in plants, accurate genome-scale SINE annotation becomes a fundamental step for studying the regulatory roles of SINEs and their relationship with other components in the genomes. Despite the overall promising progress made in TE annotation, SINE annotation remains a major challenge. Unlike some other TEs, SINEs are short and heterogeneous, and they usually lack well-conserved sequence or structural features. Thus, current SINE annotation tools have either low sensitivity or high false discovery rates. Given the demand and challenges, we aimed to provide a more accurate and efficient SINE annotation tool for plant genomes. The pipeline starts with maximizing the pool of SINE candidates via profile hidden Markov model-based homology search and de novo SINE search using structural features. Then, it excludes the false positives by integrating all known features of SINEs and the features of other types of TEs that can often be misannotated as SINEs. As a result, the pipeline substantially improves the tradeoff between sensitivity and accuracy, with both values close to or over 90%. We tested our tool in Arabidopsis thaliana and rice (Oryza sativa), and the results show that our tool competes favorably against existing SINE annotation tools. The simplicity and effectiveness of this tool would potentially be useful for generating more accurate SINE annotations for other plant species. The pipeline is freely available at https://github.com/yangli557/AnnoSINE. 
  4. Tribble, C (Ed.)
    Abstract The majority of sequenced genomes in the monocots are from species belonging to Poaceae, which include many commercially important crops. Here, we expand the number of sequenced genomes from the monocots to include the genomes of 4 related cyperids: Carex cristatella and Carex scoparia from Cyperaceae and Juncus effusus and Juncus inflexus from Juncaceae. The high-quality, chromosome-scale genome sequences from these 4 cyperids were assembled by combining whole-genome shotgun sequencing of Nanopore long reads, Illumina short reads, and Hi-C sequencing data. Some members of the Cyperaceae and Juncaceae are known to possess holocentric chromosomes. We examined the repeat landscapes in our sequenced genomes to search for potential repeats associated with centromeres. Several large satellite repeat families, comprising 3.2–9.5% of our sequenced genomes, showed dispersed distribution of large satellite repeat clusters across all Carex chromosomes, with few instances of these repeats clustering in the same chromosomal regions. In contrast, most large Juncus satellite repeats were clustered in a single location on each chromosome, with sporadic instances of large satellite repeats throughout the Juncus genomes. Recognizable transposable elements account for about 20% of each of the 4 genome assemblies, with the Carex genomes containing more DNA transposons than retrotransposons while the converse is true for the Juncus genomes. These genome sequences and annotations will facilitate better comparative analysis within monocots. 
  5. The canonical CRISPR-Cas9 genome editing technique has profoundly impacted the fields of plant biology, biotechnology, and crop improvement. Since non-homologous end joining (NHEJ) is usually considered to generate random indels, its high mutation efficiency is generally not pertinent to precise editing. Homology-directed repair (HDR) can mediate precise editing with supplied donor DNA, but it suffers from extremely low efficiency in higher plants. Therefore, precision editing in plants will be facilitated by the ability to predict NHEJ repair outcomes and to improve HDR efficiency. Here, we report that NHEJ-mediated single nucleotide insertion at different rice genes is predictable based on DNA sequences at the target loci. Three mutation prediction tools (inDelphi, FORECasT, and SPROUT) have been validated in the rice plant system. We also evaluated the chimeric guide RNA (cgRNA) and Cas9-Retron precISe Parallel Editing via homologY (CRISPEY) strategies to facilitate donor template supply for improving HDR efficiency in Nicotiana benthamiana and rice. However, neither cgRNA nor CRISPEY improved plant HDR editing efficiency in this study. Interestingly, our data indicate that tethering a 200–250-nucleotide-long sequence to either the 5′ or 3′ end of the guide RNA did not significantly affect Cas9 cleavage activity.
  6. Abstract Motivation: The standard bootstrap method is used throughout science and engineering to perform general-purpose non-parametric resampling and re-estimation. Among the most widely cited and widely used such applications is the phylogenetic bootstrap method, which Felsenstein proposed in 1985 as a means to place statistical confidence intervals on an estimated phylogeny (or estimate ‘phylogenetic support’). A key simplifying assumption of the bootstrap method is that input data are independent and identically distributed (i.i.d.). However, the i.i.d. assumption is an over-simplification for biomolecular sequence analysis, as Felsenstein noted. Results: In this study, we introduce a new sequence-aware non-parametric resampling technique, which we refer to as RAWR (‘RAndom Walk Resampling’). RAWR consists of random walks that synthesize and extend the standard bootstrap method and the ‘mirrored inputs’ idea of Landan and Graur. We apply RAWR to the task of phylogenetic support estimation. RAWR’s performance is compared to the state-of-the-art using synthetic and empirical data that span a range of dataset sizes and evolutionary divergence. We show that RAWR support estimates offer comparable or typically superior type I and type II error compared to phylogenetic bootstrap support. We also conduct a re-analysis of large-scale genomic sequence data from a recent study of Darwin’s finches. Our findings clarify phylogenetic uncertainty in a charismatic clade that serves as an important model for complex adaptive evolution. Availability and implementation: Data and software are publicly available under open-source software and open data licenses at: https://gitlab.msu.edu/liulab/RAWR-study-datasets-and-scripts.
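For readers unfamiliar with the baseline that the last abstract compares against, the following is a minimal sketch of the standard phylogenetic bootstrap resampling step: alignment columns (sites) are drawn with replacement, which is where the i.i.d. assumption on sites enters. This is not RAWR, whose random-walk procedure is not specified in the abstract. The function names and the toy alignment are hypothetical and chosen only for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of a multiple sequence alignment.

    `alignment` maps taxon name -> aligned sequence (all the same length).
    Sites are sampled with replacement, treating columns as i.i.d.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_sites = lengths.pop()
    # Draw n_sites column indices uniformly at random, with replacement.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same column sample to every taxon so columns stay intact.
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Generate `n_replicates` resampled alignments (seeded for repeatability)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment (hypothetical data, illustration only).
toy = {"taxonA": "ACGTAC", "taxonB": "ACGTTC", "taxonC": "AGGTAC"}
replicates = bootstrap_replicates(toy, n_replicates=5, seed=42)
```

In a full support analysis, a tree would be re-estimated from each replicate, and the support for a branch would be the fraction of replicate trees that contain it. That step depends on the chosen tree-inference software and is omitted here.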