
Title: Novel genetic code and record-setting AT-richness in the highly reduced plastid genome of the holoparasitic plant Balanophora

Plastid genomes (plastomes) vary enormously in size and gene content among the many lineages of nonphotosynthetic plants, but key lineages remain unexplored. We therefore investigated plastome sequence and expression in the holoparasitic and morphologically bizarre Balanophoraceae. The two Balanophora plastomes examined are remarkable, exhibiting features rarely if ever seen before in plastomes or in any other genomes. At 15.5 kb in size and with only 19 genes, they are among the most reduced plastomes known. They have no tRNA genes for protein synthesis, a trait found in only three other plastid lineages, and thus Balanophora plastids must import all tRNAs needed for translation. Balanophora plastomes are exceptionally compact, with numerous overlapping genes, highly reduced spacers, loss of all cis-spliced introns, and shrunken protein genes. With A+T contents of 87.8% and 88.4%, the Balanophora genomes are the most AT-rich genomes known save for a single mitochondrial genome that is merely bloated with AT-rich spacer DNA. Most plastid protein genes in Balanophora consist of ≥90% AT, with several between 95% and 98% AT, resulting in the most biased codon usage in any genome described to date. A potential consequence of its radical compositional evolution is the novel genetic code used by Balanophora plastids, in which TAG has been reassigned from stop to tryptophan. Despite its many exceptional properties, the Balanophora plastome must be functional because all examined genes are transcribed, its only intron is correctly trans-spliced, and its protein genes, although highly divergent, are evolving under various degrees of selective constraint.
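
The two quantities the abstract relies on, A+T content and a genetic code in which TAG encodes tryptophan instead of stop, can be illustrated with a minimal sketch. The sequence below is an invented toy example, not data from the Balanophora plastome, and the helper names `at_content` and `translate` are hypothetical; the codon table is the standard one with the single TAG change described above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption for illustration: the only change is TAG, from stop to tryptophan (W).
TAG_TO_TRP_CODE = dict(STANDARD_CODE, TAG="W")


def at_content(seq: str) -> float:
    """Return the fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def translate(seq: str, code: dict) -> str:
    """Translate in frame 0 and stop at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = code[seq[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


toy_gene = "ATGAAATAGTTTTAA"  # invented toy sequence with an in-frame TAG
print(round(at_content(toy_gene), 3))        # 0.867
print(translate(toy_gene, STANDARD_CODE))    # MK   (TAG read as stop)
print(translate(toy_gene, TAG_TO_TRP_CODE))  # MKWF (TAG read as tryptophan)
```

Under the standard code the in-frame TAG truncates the toy protein, whereas under the reassigned code translation continues to the TAA stop.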

Publisher / Repository:
Proceedings of the National Academy of Sciences
Journal Name:
Proceedings of the National Academy of Sciences
Page Range / eLocation ID:
p. 934-943
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background

    In the past three decades, several studies have predominantly relied on a small sample of the plastome to infer deep phylogenetic relationships in the species-rich Melastomataceae. Here, we report the first full plastid sequences of this family, compare general features of the sampled plastomes to other sequenced Myrtales, and survey the plastomes for highly informative regions for phylogenetics.


    Genome skimming was performed for 16 species spread across the Melastomataceae. Plastomes were assembled, annotated and compared to eight sequenced plastids in the Myrtales. Phylogenetic inference was performed using Maximum Likelihood on six different data sets, where putative biases were taken into account. Summary statistics were generated for all introns and intergenic spacers with suitable size for polymerase chain reaction (PCR) amplification and used to rank the markers by phylogenetic information.


    The majority of the plastomes sampled are conserved in gene content and order, as well as in sequence length and GC content within plastid regions and sequence classes. Departures include the putative presence of rps16 and rpl2 pseudogenes in some plastomes. Phylogenetic analyses of the majority of the schemes analyzed resulted in the same topology with high values of bootstrap support. Although there is still uncertainty in some relationships, in the highest supported topologies only two nodes received bootstrap values lower than 95%.


    Melastomataceae plastomes are no exception to the general patterns observed in the genomic structure of land plant chloroplasts, being highly conserved and structurally similar to most other Myrtales. Although the full plastome phylogeny shares most of its clades with the previously widely used, reduced data set, some changes are still observed and bootstrap support is higher. The plastome data set presented here is a step towards phylogenomic analyses in the Melastomataceae and will be a useful resource for future studies.

  2. Abstract Background

    Plastid genomes (plastomes) have long been recognized as highly conserved in their overall structure, size, gene arrangement and content among land plants. However, recent studies have shown that some lineages present unusual variations in some of these features. Members of the cactus family are one of these lineages, with distinct plastome structures reported across disparate lineages, including gene losses, inversions, boundary movements or loss of the canonical inverted repeat (IR) region. However, only a small fraction of cactus diversity has been analysed so far.


    Here, we investigated plastome features of the tribe Opuntieae, the remarkable prickly pear cacti, which represent one of the most diverse and important lineages of Cactaceae. We assembled de novo the plastome of 43 species, representing a comprehensive sampling of the tribe, including all seven genera, and analysed their evolution in a phylogenetic comparative framework. Phylogenomic analyses with different datasets (full plastome sequences and genes only) were performed, followed by congruence analyses to assess signals underlying contentious nodes.

    Key Results

    Plastomes varied considerably in length, from 121 to 162 kbp, with striking differences in the content and size of the IR region (contraction and expansion events), including a lack of the canonical IR in some lineages and the pseudogenization or loss of some genes. Overall, nine different types of plastomes were reported, deviating in the presence of the IR region or the genes contained in the IR. Plastome sequences resolved phylogenetic relationships within major clades of Opuntieae with high bootstrap values but presented some contentious nodes depending on the dataset analysed (e.g. whole plastome vs. genes only). Congruence analyses revealed that most plastidial regions lack phylogenetic resolution, while few markers support the most likely topology. Likewise, alternative topologies are driven by a handful of plastome markers, suggesting recalcitrant nodes in the phylogeny.


    Our study reveals a dynamic nature of plastome evolution across closely related lineages, shedding light on peculiar features of plastomes. Variation of plastome types across Opuntieae is remarkable in size, structure and content and can be important for the recognition of species in some major clades. Unravelling connections between the causes of plastome variation and the consequences for species biology, physiology, ecology, diversification and adaptation is a promising and ambitious endeavour in cactus research. Although plastome data resolved major phylogenetic relationships, the generation of nuclear genomic data is necessary to confront these hypotheses and assess the recalcitrant nodes further.

  3. Summary

    The plastid genome (plastome), while surprisingly constant in gene order and content across most photosynthetic angiosperms, exhibits variability in several unrelated lineages. During the diversification history of the legume family Fabaceae, plastomes have undergone many rearrangements, including inversions, expansion, contraction and loss of the typical inverted repeat (IR), gene loss and repeat accumulation in both shared and independent events. While legume plastomes have been the subject of study for some time, most work has focused on agricultural species in the IR‐lacking clade (IRLC) and the plant model Medicago truncatula. The subfamily Papilionoideae, which contains virtually all of the agricultural legume species, also comprises most of the plastome variation detected thus far in the family. In this study three non‐papilionoids were included among 34 newly sequenced legume plastomes, along with 33 publicly available sequences, to assess plastome structural evolution in the subfamily. In an effort to examine plastome variation across the subfamily, approximately 20% of the sampling represents the IRLC with the remainder selected to represent the early‐branching papilionoid clades. A number of IR‐related and repeat‐mediated changes were identified and examined in a phylogenetic context. Recombination between direct repeats associated with ycf2 resulted in intraindividual plastome heteroplasmy. Although loss of the IR has not been reported in legumes outside of the IRLC, one genistoid taxon was found to completely lack the typical plastome IR. The role of the IR and non‐IR repeats in the progression of plastome change is discussed.

  4. Abstract

    Although plastid genome (plastome) structure is highly conserved across most seed plants, investigations during the past two decades have revealed several disparately related lineages that experienced substantial rearrangements. Most plastomes contain a large inverted repeat and two single‐copy regions, and a few dispersed repeats; however, the plastomes of some taxa harbour long repeat sequences (>300 bp). These long repeats make it challenging to assemble complete plastomes using short‐read data, leading to misassemblies and consensus sequences with spurious rearrangements. Single‐molecule, long‐read sequencing has the potential to overcome these challenges, yet there is no consensus on the most effective method for accurately assembling plastomes using long‐read data. We generated a pipeline, plastid Genome Assembly Using Long‐read data (ptGAUL), to address the problem of plastome assembly using long‐read data from Oxford Nanopore Technologies (ONT) or Pacific Biosciences platforms. We demonstrated the efficacy of the ptGAUL pipeline using 16 published long‐read data sets. We showed that ptGAUL quickly produces accurate and unbiased assemblies using only ~50× coverage of plastome data. Additionally, we deployed ptGAUL to assemble four new Juncus (Juncaceae) plastomes using ONT long reads. Our results revealed many long repeats and rearrangements in Juncus plastomes compared with basal lineages of Poales. The ptGAUL pipeline is available on GitHub:

  5. Comprising 501 genera and around 14,000 species, Papilionoideae is not only the largest subfamily of Fabaceae (Leguminosae; legumes), but also one of the most extraordinarily diverse clades among angiosperms. Papilionoids are a major source of food and forage, are ecologically successful in all major biomes, and display dramatic variation in both floral architecture and plastid genome (plastome) structure. Plastid DNA-based phylogenetic analyses have greatly improved our understanding of relationships among the major groups of Papilionoideae, yet the backbone of the subfamily phylogeny remains unresolved. In this study, we sequenced and assembled 39 new plastomes covering key genera that represent the morphological diversity in the subfamily. From 244 total taxa, we produced eight datasets for maximum likelihood (ML) analyses based on entire plastomes and/or concatenated sequences of 77 protein-coding sequences (CDS) and two datasets for multispecies coalescent (MSC) analyses based on individual gene trees. We additionally produced a combined nucleotide dataset comprising CDS plus matK gene sequences only, in which most papilionoid genera were sampled. A ML tree based on the entire plastome maximally supported all of the deep and most recent divergences of papilionoids (223 out of 236 nodes). The Swartzieae, ADA (Angylocalyceae, Dipterygeae, and Amburaneae), Cladrastis, Andira, and Exostyleae clades formed a grade to the remainder of the Papilionoideae, concordant with nine ML and two MSC trees. Phylogenetic relationships among the remaining five papilionoid lineages (Vataireoid, Dermatophyllum, Genistoid s.l., Dalbergioid s.l., and Baphieae + Non-Protein Amino Acid Accumulating or NPAAA clade) remained uncertain, because of insufficient support and/or conflicting relationships among trees. Our study fully resolved most of the deep nodes of Papilionoideae; however, some relationships require further exploration. More genome-scale data and rigorous analyses are needed to disentangle phylogenetic relationships among the five remaining lineages.