Title: Enabling evolutionary studies at multiple scales in Apocynaceae through Hyb‐Seq

Apocynaceae is the 10th largest flowering plant family and a focus for study of plant–insect interactions, especially as mediated by secondary metabolites. However, it has few genomic resources relative to its size. Target capture sequencing is a powerful approach for genome reduction that facilitates studies requiring data from the nuclear genome in non‐model taxa, such as Apocynaceae.


Transcriptomes were used to design probes for targeted sequencing of putatively single‐copy nuclear genes across Apocynaceae. The sequences obtained were used to assess the success of the probe design, the intrageneric and intraspecific variation in the targeted genes, and the utility of the genes for inferring phylogeny.


From 853 candidate nuclear genes, 835 were consistently recovered in single copy and were variable enough for phylogenomics. The inferred gene trees were useful for coalescent‐based species tree analysis, which showed all subfamilies of Apocynaceae as monophyletic, while also resolving relationships among species within the genus Apocynum. Intraspecific comparison of Elytropus chilensis individuals revealed numerous single‐nucleotide polymorphisms with potential for use in population‐level studies.


Community use of this Hyb‐Seq probe set will facilitate and promote progress in the study of Apocynaceae across scales from population genomics to phylogenomics.

Award ID(s):
1655553 1655223 1457473
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Applications in Plant Sciences
Medium: X
Sponsoring Org:
National Science Foundation
More Like this

    New sequencing technologies facilitate the generation of large‐scale molecular data sets for constructing the plant tree of life. We describe a new probe set for target enrichment sequencing to generate nuclear sequence data to build phylogenetic trees with any flagellate land plants, including hornworts, liverworts, mosses, lycophytes, ferns, and all gymnosperms.


    We leveraged existing transcriptome and genome sequence data to design the GoFlag 451 probes, a set of 56,989 probes for target enrichment sequencing of 451 exons that are found in 248 single‐copy or low‐copy nuclear genes across flagellate plant lineages.


    Our results indicate that target enrichment using the GoFlag 451 probe set can provide large nuclear data sets that can be used to resolve relationships among both distantly and closely related taxa across the flagellate land plants. We also describe the GoFlag 408 probes, an optimized probe set covering 408 of the 451 exons from the GoFlag 451 probe set that is commercialized by RAPiD Genomics.


    A target enrichment approach using the new probe set provides a relatively low‐cost solution to obtain large‐scale nuclear sequence data for inferring phylogenetic relationships across flagellate land plants.

  2. Premise

    Putatively single‐copy nuclear (SCN) loci, which are identified using genomic resources of closely related species, are ideal for phylogenomic inference. However, suitable genomic resources are not available for many clades, including Melastomataceae. We introduce a versatile approach to identify SCN loci for clades with few genomic resources and use it to develop probes for target enrichment in the distantly related Memecylon and Tibouchina (Melastomataceae).


    We present a two‐tiered pipeline. First, we identified putatively SCN loci using MarkerMiner and transcriptomes from distantly related species in Melastomataceae. Published loci and genes of functional significance were then added (384 total loci). Second, using HybPiper, we retrieved 689 homologous template sequences for these loci using genome‐skimming data from within the focal clades.


    We sequenced 193 loci common to Memecylon and Tibouchina. Probes designed from 56 template sequences successfully targeted sequences in both clades. Probes designed from genome‐skimming data within a focal clade were more successful than probes designed from other sources.


    Our pipeline successfully identified and targeted SCN loci in Memecylon and Tibouchina, enabling phylogenomic studies in both clades and potentially across Melastomataceae. This pipeline could be easily applied to other clades with few genomic resources.

  3. Abstract

    Understanding the genetics of biological diversification across micro‐ and macro‐evolutionary time scales is a vibrant field of research for molecular ecologists as rapid advances in sequencing technologies promise to overcome former limitations. In palms, an emblematic, economically and ecologically important plant family with high diversity in the tropics, studies of diversification at the population and species levels are still hampered by a lack of genomic markers suitable for the genotyping of large numbers of recently diverged taxa. To fill this gap, we used a whole genome sequencing approach to develop target sequencing for molecular markers in 4,184 genome regions, including 4,051 genes and 133 non‐genic putatively neutral regions. These markers were chosen to cover a wide range of evolutionary rates allowing future studies at the family, genus, species and population levels. Special emphasis was given to the avoidance of copy number variation during marker selection. In addition, a set of 149 well‐known sequence regions previously used as phylogenetic markers by the palm biological research community were included in the target regions, to open the possibility to combine and jointly analyse already available data sets with genomic data to be produced with this new toolkit. The bait set was effective for species belonging to all three palm sub‐families tested (Arecoideae, Ceroxyloideae and Coryphoideae), with high mapping rates, specificity and efficiency. The number of high‐quality single nucleotide polymorphisms (SNPs) detected at both the sub‐family and population levels facilitates efficient analyses of genomic diversity across micro‐ and macro‐evolutionary time scales.

  4. Premise

    Comprising five families that vastly differ in species richness—ranging from Gelsemiaceae with 13 species to the Rubiaceae with 13,775 species—members of the Gentianales are often among the most species‐rich and abundant plants in tropical forests. Despite considerable phylogenetic work within particular families and genera, several alternative topologies for family‐level relationships within Gentianales have been presented in previous studies.


    Here we present a phylogenomic analysis based on nuclear genes targeted by the Angiosperms353 probe set for approximately 150 species, representing all families and approximately 85% of the formally recognized tribes. We were able to retrieve partial plastomes from off‐target reads for most taxa and infer phylogenetic trees for comparison with the nuclear‐derived trees.


    We recovered high support for over 80% of all nodes. The plastid and nuclear data are largely in agreement, except for some weakly to moderately supported relationships. We discuss the implications of our results for the order’s classification, highlighting points of increased support for previously uncertain relationships. Rubiaceae is sister to a clade comprising (Gentianaceae + Gelsemiaceae) + (Apocynaceae + Loganiaceae).


    The higher‐level phylogenetic relationships within Gentianales are confidently resolved. In contrast to recent studies, our results support the division of Rubiaceae into two subfamilies: Cinchonoideae and Rubioideae. We do not formally recognize Coptosapelteae and Luculieae within any particular subfamily but treat them as incertae sedis. Our framework paves the way for further work on the phylogenetics, biogeography, morphological evolution, and macroecology of this important group of flowering plants.

  5. Premise

    Multiple transitions from insect to wind pollination are associated with polyploidy and unisexual flowers in Thalictrum (Ranunculaceae), yet the underlying genetics remains unknown. We generated a draft genome of Thalictrum thalictroides, a representative of a clade with ancestral floral traits (diploid, hermaphrodite, and insect pollinated) and a model for functional studies. Floral transcriptomes of T. thalictroides and of wind‐pollinated, andromonoecious T. hernandezii are presented as a resource to facilitate candidate gene discovery in flowers with different sexual and pollination systems.


    A draft genome of T. thalictroides and two floral transcriptomes of T. thalictroides and T. hernandezii were obtained from HiSeq 2000 Illumina sequencing and de novo assembly.


    The T. thalictroides de novo draft genome assembly consisted of 44,860 contigs (N50 = 12,761 bp, 243 Mbp total length) and contained 84.5% conserved embryophyte single‐copy genes. Floral transcriptomes contained representatives of most eukaryotic core genes, and most of their genes formed orthogroups.


    To validate the utility of these resources, potential candidate genes were identified for the different floral morphologies using stepwise data set comparisons. Single‐copy gene analysis and simple sequence repeat markers were also generated as a resource for population‐level and phylogenetic studies.
