
Title: A dedicated target capture approach reveals variable genetic markers across micro‐ and macro‐evolutionary time scales in palms

Understanding the genetics of biological diversification across micro‐ and macro‐evolutionary time scales is a vibrant field of research for molecular ecologists as rapid advances in sequencing technologies promise to overcome former limitations. In palms, an emblematic, economically and ecologically important plant family with high diversity in the tropics, studies of diversification at the population and species levels are still hampered by a lack of genomic markers suitable for the genotyping of large numbers of recently diverged taxa. To fill this gap, we used a whole genome sequencing approach to develop target sequencing for molecular markers in 4,184 genome regions, including 4,051 genes and 133 non‐genic putatively neutral regions. These markers were chosen to cover a wide range of evolutionary rates, allowing future studies at the family, genus, species and population levels. Special emphasis was given to the avoidance of copy number variation during marker selection. In addition, a set of 149 well‐known sequence regions previously used as phylogenetic markers by the palm biological research community was included in the target regions, making it possible to combine and jointly analyse already available data sets with genomic data to be produced with this new toolkit. The bait set was effective for species belonging to all three palm sub‐families tested (Arecoideae, Ceroxyloideae and Coryphoideae), with high mapping rates, specificity and efficiency. The number of high‐quality single nucleotide polymorphisms (SNPs) detected at both the sub‐family and population levels facilitates efficient analyses of genomic diversity across micro‐ and macro‐evolutionary time scales.

Journal Name:
Molecular Ecology Resources
Page Range / eLocation ID:
p. 221-234
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    Spirodela polyrhiza is a fast‐growing aquatic monocot with highly reduced morphology, genome size and number of protein‐coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158‐Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome‐wide physical maps combined with high‐coverage short‐read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela‐specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non‐essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large‐scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family.

  2. Abstract

    Islands are natural laboratories for studying patterns and processes of evolution. Research on island endemic birds has revealed elevated speciation rates and rapid phenotypic evolution in several groups (e.g. white-eyes, Darwin’s finches). However, understanding the evolutionary processes behind these patterns requires an understanding of how genotypes map to novel phenotypes. To date, there are few high-quality reference genomes for species found on islands. Here, we sequence the genome of one of Ernst Mayr’s “great speciators,” the collared kingfisher (Todiramphus chloris collaris). Utilizing high molecular weight DNA and linked-read sequencing technology, we assembled a draft high-quality genome with highly contiguous scaffolds (scaffold N50 = 19 Mb). Based on universal single-copy orthologs, we estimated a gene space completeness of 96.6% for the draft genome assembly. The population demographic history analyses reveal a distinct pattern of contraction and expansion in population size throughout the Pleistocene. Comparative genomic analysis of gene family evolution revealed that species-specific and rapidly expanding gene families in the collared kingfisher (relative to other Coraciiformes) are mainly involved in the ErbB signaling pathway and focal adhesion. Todiramphus kingfishers are a species-rich group that has become a focus of speciation research. This draft genome will be a platform for future taxonomic, phylogeographic, and speciation research in the group. For example, target genes will enable testing of changes in sensory structures associated with changes in vision and taste genes across kingfishers.

  3. Abstract

    High‐throughput DNA sequencing facilitates the analysis of large portions of the genome in nonmodel organisms, ensuring high accuracy of population genetic parameters. However, empirical studies evaluating the appropriate sample size for these kinds of studies are still scarce. In this study, we use double‐digest restriction‐associated DNA sequencing (ddRADseq) to recover thousands of single nucleotide polymorphisms (SNPs) for two physically isolated populations of Amphirrhox longifolia (Violaceae), a nonmodel plant species for which no reference genome is available. We used resampling techniques to construct simulated populations with a random subset of individuals and SNPs to determine how many individuals and biallelic markers should be sampled for accurate estimates of intra‐ and interpopulation genetic diversity. We identified 3646 and 4900 polymorphic SNPs for the two populations of A. longifolia, respectively. Our simulations show that, overall, a sample size greater than eight individuals has little impact on estimates of genetic diversity within A. longifolia populations, when 1000 SNPs or higher are used. Our results also show that even at a very small sample size (i.e. two individuals), accurate estimates of FST can be obtained with a large number of SNPs (≥1500). These results highlight the potential of high‐throughput genomic sequencing approaches to address questions related to evolutionary biology in nonmodel organisms. Furthermore, our findings also provide insights into the optimization of sampling strategies in the era of population genomics.

  4. Abstract

    Understanding the consequences of exotic diseases on native forests is important to evolutionary ecology and conservation biology because exotic pathogens have drastically altered US eastern deciduous forests. Cornus florida L. (flowering dogwood tree) is one such species facing heavy mortality. Characterizing the genetic structure of C. florida populations and identifying the genetic signature of adaptation to dogwood anthracnose (an exotic pathogen responsible for high mortality) remain vital for conservation efforts. By integrating genetic data from genotype by sequencing (GBS) of 289 trees across the host species range and distribution of disease, we evaluated the spatial patterns of genetic variation and population genetic structure of C. florida and compared the pattern to the distribution of dogwood anthracnose. Using genome‐wide association study and gradient forest analysis, we identified genetic loci under selection and associated with ecological and diseased regions. The results revealed signals of weak genetic differentiation of three or more subgroups nested within two clusters—explaining up to 2%–6% of genetic variation. The groups largely corresponded to the regions within and outside the eastern Hot‐Continental ecoregion, which also overlapped with areas within and outside the main distribution of dogwood anthracnose. The fungal sequences contained in the GBS data of sampled trees bolstered visual records of disease at sampled locations and were congruent with the reported range of Discula destructiva, suggesting that fungal sequences within‐host genomic data were informative for detecting or predicting disease. The genetic diversity between populations at diseased vs. disease‐free sites across the range of C. florida showed no significant difference. We identified 72 single‐nucleotide polymorphisms (SNPs) from 68 loci putatively under selection, some of which exhibited abrupt turnover in allele frequencies along the borders of the Hot‐Continental ecoregion and the range of dogwood anthracnose. One such candidate SNP was independently identified in two prior studies as a possible L‐type lectin‐domain containing receptor kinase. Although diseased and disease‐free areas do not significantly differ in genetic diversity, overall there are slight trends to indicate marginally smaller amounts of genetic diversity in disease‐affected areas. Our results were congruent with previous studies that were based on a limited number of genetic markers in revealing high genetic variation and weak population structure in C. florida.

  5. Abstract

    Understanding patterns of diversity across macro (e.g. species‐level) and micro (e.g. molecular‐level) scales can shed light on community function and stability by elucidating the abiotic and biotic drivers of diversity within ecological communities. We examined the relationships among taxonomic and genetic metrics of diversity in freshwater mussels (Bivalvia: Unionidae), an ecologically important and species‐rich group in the southeastern United States. Using quantitative community surveys and reduced‐representation genome sequencing across 22 sites in seven rivers and two river basins, we surveyed 68 mussel species and sequenced 23 of these species to characterize intrapopulation genetic variation. We tested for the presence of species diversity–abundance correlations (i.e. the more‐individuals hypothesis, MIH), species‐genetic diversity correlations (SGDCs) and abundance‐genetic diversity correlations (AGDCs) across all sites to evaluate relationships between different metrics of diversity. Sites with greater cumulative multispecies density (a standardized metric of abundance) had a greater number of species, consistent with the MIH. Intrapopulation genetic diversity was strongly associated with the density of most species, indicating the presence of AGDCs. However, there was no consistent evidence for SGDCs. Although sites with greater overall densities of mussels had greater species richness, sites with higher genetic diversity did not always exhibit positive correlations with species richness, suggesting that there are spatial and evolutionary scales at which the processes influencing community‐level diversity and intraspecific diversity differ. Our work reveals the importance of local abundance as an indicator (and possibly a driver) of intrapopulation genetic diversity.
