Abstract. Background: The tobacco thrips (Frankliniella fusca Hinds; family Thripidae; order Thysanoptera) is an important pest that can transmit viruses such as the tomato spotted wilt orthotospovirus to numerous economically important agricultural row crops and vegetables. Structural and functional genomics within the order Thysanoptera have only begun to be explored. Among the >7000 known thysanopteran species, the melon thrips (Thrips palmi Karny) and the western flower thrips (Frankliniella occidentalis Pergande) are the only two with assembled genomes. Results: A genome of F. fusca was assembled by long-read sequencing of DNA from an inbred line. The final assembly size was 370 Mb, with a single-copy ortholog completeness of ~99% with respect to Insecta. The annotated genome of F. fusca was compared with the genome of its congener, F. occidentalis. Results revealed many instances of lineage-specific differences in gene content. Analyses of sequence divergence between the two Frankliniella genomes revealed substitution patterns consistent with positive selection in ~5% of the protein-coding genes with 1:1 orthologs. Further, gene content related to pest status, such as xenobiotic detoxification and response to infection by an ambisense, tripartite RNA virus (orthotospovirus), was compared with that of F. occidentalis. Several F. fusca genes related to virus infection possessed signatures of positive selection. Estimation of CpG depletion, a mutational consequence of DNA methylation, revealed that F. fusca genes that were downregulated and alternatively spliced in response to virus infection were preferentially targeted by DNA methylation. As in many other insects, DNA methylation was enriched in exons in Frankliniella, but gene copies with homology to DNA methyltransferase 3 were numerous and fragmented. This phenomenon appears to be largely restricted to thrips among insect groups. Conclusions: The F. fusca genome assembly provides an important resource for comparative genomic analyses of thysanopterans. This genomic foundation allows for insights into molecular evolution, gene regulation, and loci important to agricultural pest status.
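The CpG depletion estimate mentioned in the abstract above is commonly expressed as an observed/expected CpG ratio. The sketch below is only an illustration of that idea: the function name `cpg_obs_exp` and the simple length normalization are assumptions, and the abstract does not state which exact formula or software the authors used.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio for one DNA sequence.

    Assumed normalization: (CpG count * length) / (C count * G count).
    Values well below 1 suggest CpG depletion, which is often read as a
    footprint of historical germline DNA methylation.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return None  # ratio is undefined without both C and G
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)
```

Applied per gene (for example to coding sequences), such a ratio gives one number per gene that can then be compared between gene sets, such as genes that do or do not respond to virus infection.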
Identification of the expressome by machine learning on omics data
Accurate annotation of plant genomes remains complex due to the presence of many pseudogenes arising from whole-genome duplication-generated redundancy or the capture and movement of gene fragments by transposable elements. Machine learning on genome-wide epigenetic marks, informed by transcriptomic and proteomic training data, could be used to improve annotations through classification of all putative protein-coding genes as either constitutively silent or able to be expressed. Expressed genes were subclassified as able to express both mRNAs and proteins or only RNAs, and CG gene body methylation was associated only with the former subclass. More than 60,000 protein-coding genes have been annotated in the reference genome of maize inbred B73. About two-thirds of these genes are transcribed and are designated the filtered gene set (FGS). Classification of genes by our trained random forest algorithm was accurate and relied only on histone modifications or DNA methylation patterns within the gene body; promoter methylation was unimportant. Other inbred lines are known to transcribe significantly different sets of genes, indicating that the FGS is specific to B73. We accurately classified the sets of transcribed genes in additional inbred lines, arising from inbred-specific DNA methylation patterns. This approach highlights the potential of using chromatin information to improve annotations of functional genes.
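To make the classification idea in the abstract above concrete, here is a deliberately simplified sketch in pure Python. It is not the authors' model: it swaps a full random forest for bagged one-split trees (stumps) with random feature subsets, which keeps the bootstrap-and-vote idea in a few lines. The three features, the toy data generator `make_toy_genes`, and all parameter values are hypothetical; the toy data are constructed so that two gene-body features are informative and a promoter feature is not.

```python
import random

def make_toy_genes(n=200, seed=1):
    """Synthetic genes: [gene-body active mark, gene-body non-CG methylation,
    promoter methylation]. Purely illustrative, not real measurements."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        expressed = 1 if rng.random() < 0.5 else 0
        rows.append([
            rng.gauss(2.0 if expressed else 0.0, 0.5),  # informative by construction
            rng.gauss(0.0 if expressed else 2.0, 0.5),  # informative by construction
            rng.gauss(1.0, 0.5),                        # uninformative by construction
        ])
        labels.append(expressed)
    return rows, labels

def train_stump(rows, labels, feature_ids):
    """Return (feature, threshold, label_if_above) with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            for above in (0, 1):
                errors = sum(
                    (above if r[f] >= t else 1 - above) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    return best[1:]

def train_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Fit each stump on a bootstrap sample using a random subset of features."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest

def predict(forest, row):
    """Majority vote over stumps: 1 = able to be expressed, 0 = silent."""
    votes = sum(above if row[f] >= t else 1 - above for f, t, above in forest)
    return 1 if votes * 2 > len(forest) else 0
```

A typical use would train on one subset of genes with known expression status, for example `train_forest(rows[:150], labels[:150])`, and call `predict` on held-out genes. With real data, a library implementation of random forests and measured chromatin features would replace the toy pieces here.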
- Award ID(s): 1711662
- PAR ID: 10169746
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 116
- Issue: 36
- ISSN: 0027-8424
- Page Range / eLocation ID: 18119 to 18125
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract: Protein translation is tightly and precisely controlled by multiple mechanisms including upstream open reading frames (uORFs), but the origins of uORFs and their role in maize are largely unexplored. In this study, an active transposition event was identified during the propagation of maize inbred line B73. The transposon, which was named BTA for ‘B73 active transposable element hAT’, creates a novel dosage-dependent hypomorphic allele of the hexose transporter gene ZmSWEET4c through insertion within the coding sequence in the first exon, and results in reduced kernel size. The BTA insertion does not affect transcript abundance but reduces protein abundance of ZmSWEET4c, probably through the introduction of a uORF. Furthermore, the introduction of BTA sequence in the exon of other genes can regulate translation efficiency without affecting their mRNA levels. A transposon capture assay revealed 79 novel insertions for BTA and BTA-like elements. These insertion sites have typical euchromatin features, including low levels of DNA methylation and high levels of H3K27ac. A putative autonomous element that mobilizes BTA and BTA-like elements was identified. Together, our results suggest a transposon-based origin of uORFs and document a new role for transposable elements to influence protein abundance and phenotypic diversity by affecting the translation rate.
- In many plant species, a subset of transcribed genes are characterized by strictly CG-context DNA methylation, referred to as gene body methylation (gbM). The mechanisms that establish gbM are unclear, yet flowering plant species naturally without gbM lack the DNA methyltransferase, CMT3, which maintains CHG (H = A, C, or T) and not CG methylation at constitutive heterochromatin. Here, we identify the mechanistic basis for gbM establishment by expressing CMT3 in a species naturally lacking CMT3. CMT3 expression reconstituted gbM through a progression of de novo CHG methylation on expressed genes, followed by the accumulation of CG methylation that could be inherited even following loss of the CMT3 transgene. Thus, gbM likely originates from the simultaneous targeting of loci by pathways that promote euchromatin and heterochromatin, which primes genes for the formation of stably inherited epimutations in the form of CG DNA methylation.
- Bomblies, K (Ed.) Abstract: DNA methylation in plants is depleted from cis-regulatory elements in and near genes but is present in some gene bodies, including exons. Methylation in exons solely in the CG context is called gene body methylation (gbM). Methylation in exons in both CG and non-CG contexts is called TE-like methylation (teM). Assigning functions to both forms of methylation in genes has proven to be challenging. Toward that end, we utilized recent genome assemblies, gene annotations, transcription data, and methylome data to quantify common patterns of gene methylation and their relations to gene expression in maize. We found that gbM genes exist in a continuum of CG methylation levels without a clear demarcation between unmethylated genes and gbM genes. Analysis of expression levels across diverse maize stocks and tissues revealed a weak but highly significant positive correlation between gbM and gene expression except in endosperm. gbM epialleles were associated with an approximately 3% increase in steady-state expression level relative to unmethylated epialleles. In contrast to gbM genes, which were conserved and were broadly expressed across tissues, we found that teM genes, which make up about 12% of genes, are mainly silent, are poorly conserved, and exhibit evidence of annotation errors. We used these data to flag teM genes in the 26 NAM founder genome assemblies. While some teM genes are likely functional, these data suggest that the majority are not, and their inclusion can confound the interpretation of whole-genome studies.
- Summary: The ability of plant somatic cells to dedifferentiate, form somatic embryos and regenerate whole plants in vitro has been harnessed for both clonal propagation and as a key component of plant genetic engineering systems. Embryogenic culture response is significantly limited, however, by plant genotype in most species. This impedes advancements in both plant transformation-based functional genomics research and crop improvement efforts. We utilized natural variation among maize inbred lines to genetically map somatic embryo generation potential in tissue culture and identify candidate genes underlying totipotency. Using a series of maize lines derived from crosses involving the culturable parent A188 and the non-responsive parent B73, we identified a region on chromosome 3 associated with embryogenic culture response and focused on three candidate genes within the region based on genetic position and expression pattern. Two candidate genes showed no effect when ectopically expressed in B73, but the gene Wox2a was found to induce somatic embryogenesis and embryogenic callus proliferation. Transgenic B73 cells with strong constitutive expression of the B73 and A188 coding sequences of Wox2a were found to produce somatic embryos at similar frequencies, demonstrating that sufficient expression of either allele could rescue the embryogenic culture phenotype. Transgenic B73 plants were regenerated from the somatic embryos without chemical selection and no pleiotropic effects were observed in the Wox2a overexpression lines in the regenerated T0 plants or in the two independent events which produced T1 progeny. In addition to linking natural variation in tissue culture response to Wox2a, our data support the utility of Wox2a in enabling transformation of recalcitrant genotypes.
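The gbM and teM definitions given in the maize gene-methylation abstract earlier in this list (CG-only exon methylation versus methylation in both CG and non-CG contexts) can be sketched as a small rule-based labeler. The function name and both cutoffs are hypothetical placeholders. That abstract reports a continuum of CG methylation levels with no clear demarcation, so any fixed threshold is an arbitrary choice.

```python
def classify_gene_methylation(m_cg, m_chg, m_chh, cg_min=0.2, non_cg_min=0.1):
    """Label a gene from its mean exon methylation levels (fractions in 0..1).

    cg_min and non_cg_min are hypothetical cutoffs, not values from the source.
    """
    has_cg = m_cg >= cg_min
    has_non_cg = max(m_chg, m_chh) >= non_cg_min
    if has_cg and has_non_cg:
        return "teM"           # CG and non-CG methylation together
    if has_cg:
        return "gbM"           # methylation solely in the CG context
    if has_non_cg:
        return "ambiguous"     # non-CG without CG; not covered by the two definitions
    return "unmethylated"
```

For instance, `classify_gene_methylation(0.6, 0.02, 0.01)` falls in the gbM branch, while raising the CHG value above the non-CG cutoff moves the same gene to teM.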

