Title: From e-voucher to genomic data: Preserving archive specimens as demonstrated with medically important mosquitoes (Diptera: Culicidae) and kissing bugs (Hemiptera: Reduviidae)
Scientific collections such as the U.S. National Museum (USNM) are critical to filling knowledge gaps in molecular systematics studies. The global taxonomic impediment has resulted in a reduction of expert taxonomists generating new collections of rare or understudied taxa, and these large historic collections may be the only reliable source of material for some taxa. Integrated systematics studies using both morphological examinations and DNA sequencing are often required for resolving many taxonomic issues, but as DNA methods often require partial or complete destruction of a sample, there are many factors to consider before implementing destructive sampling of specimens within scientific collections. We present a methodology for the use of archive specimens that includes two crucial phases: 1) thoroughly documenting specimens destined for destructive sampling (a process called electronic vouchering), and 2) the pipeline used for whole genome sequencing of archived specimens, from extraction of genomic DNA to assembly of putative genomes with basic annotation. The process is presented for eleven specimens from two different insect subfamilies of medical importance to humans: Anophelinae (mosquitoes; Diptera: Culicidae) and Triatominae (kissing bugs; Hemiptera: Reduviidae). Assembly of whole mitochondrial genome sequences of all 11 specimens along with the results of an ortholog search and BLAST against the NCBI nucleotide database are also presented.
Award ID(s):
1754376
PAR ID:
10274852
Editor(s):
Oliveira, Pedro L.
Journal Name:
PLOS ONE
Volume:
16
Issue:
2
ISSN:
1932-6203
Page Range / eLocation ID:
e0247068
Sponsoring Org:
National Science Foundation
More Like this
  1. The flora and fauna of island systems, especially those in the Indo-Pacific, are renowned for their high diversification rates and outsized contribution to the development of evolutionary theories. The total diversity of geographic radiations of many Indo-Pacific fauna is often incompletely sampled in phylogenetic studies due to the difficulty in obtaining single island endemic forms across the Pacific and the relatively poor performance of degraded DNA when using museum specimens for inference of evolutionary relationships. New methods for production and analysis of genome-wide datasets sourced from degraded DNA are facilitating insights into the complex evolutionary histories of these influential island faunas. Here, we leverage whole genome resequencing (20X average coverage) and extensive sampling of all taxonomic diversity within Todiramphus kingfishers, a rapid radiation of largely island endemic Great Speciators. We find that whole genome datasets do not outright resolve the evolutionary relationships of this clade: four types of molecular markers (UCEs, BUSCOs, SNPs, and mtDNA) and tree building methods did not find a single well-supported and concordant species-level topology. We then uncover evidence of widespread incomplete lineage sorting and both ancient and contemporary gene flow and demonstrate how these factors contribute to conflicting evolutionary histories. Our complete taxonomic sampling allowed us to further identify a novel case of mitochondrial capture between two allopatric species, suggesting a potential historical (but since lost) hybrid zone as islands were successively colonized. Taken together, these results highlight how increased genomic and taxon sampling can reveal complex evolutionary patterns in rapid island radiations. 
  2. Over the past decade, museum genomics studies have focused on obtaining DNA of sufficient quality and quantity for sequencing from fluid-preserved natural history specimens, primarily to be used in systematic studies. While these studies have opened windows to evolutionary and biodiversity knowledge of many species worldwide, published works often focus on the success of these DNA sequencing efforts, which is undoubtedly less common than obtaining minimal or sometimes no DNA or unusable sequence data from specimens in natural history collections. Here, we attempt to obtain and sequence DNA extracts from 115 fresh and 41 degraded samples of homalopsid snakes, as well as from two degraded samples of a poorly known snake, Hydrablabes periops. Hydrablabes has been suggested to belong to at least two different families (Natricidae and Homalopsidae) and with no fresh tissues known to be available, intractable museum specimens currently provide the only opportunity to determine this snake’s taxonomic affinity. Although our aim was to generate a target-capture dataset for these samples, to be included in a broader phylogenetic study, results were less than ideal due to large amounts of missing data, especially using the same downstream methods as with standard, high-quality samples. However, rather than discount results entirely, we used mapping methods with references and pseudoreferences, along with phylogenetic analyses, to maximize any usable molecular data from our sequencing efforts, identify the taxonomic affinity of H. periops, and compare sequencing success between fresh and degraded tissue samples. This resulted in largely complete mitochondrial genomes for five specimens and hundreds to thousands of nuclear loci (ultra-conserved loci, anchored-hybrid enrichment loci, and a variety of loci frequently used in squamate phylogenetic studies) from fluid-preserved snakes, including a specimen of H. periops from the Field Museum of Natural History collection. We combined our H. periops data with previously published genomic and Sanger-sequenced datasets to confirm the familial designation of this taxon, reject previous taxonomic hypotheses, and make biogeographic inferences for Hydrablabes. A second H. periops specimen, despite being seemingly similar for initial raw sequencing results and after being put through the same protocols, resulted in little usable molecular data. We discuss the successes and failures of using different pipelines and methods to maximize the products from these data and provide expectations for others who are looking to use DNA sequencing efforts on specimens that likely have degraded DNA. Life Science Identifier (Hydrablabes periops): urn:lsid:zoobank.org:pub:F2AA44E2-D2EF-4747-972A-652C34C2C09D.
  3. Phylogenetic datasets are now commonly generated using short-read sequencing technologies unhampered by degraded DNA, such as that often extracted from herbarium specimens. The compatibility of these methods with herbarium specimens has precipitated an increase in broad sampling of herbarium specimens for inclusion in phylogenetic studies. Understanding which sample characteristics are predictive of sequencing success can guide researchers in the selection of tissues and specimens most likely to yield good results. Multiple recent studies have considered the relationship between sample characteristics and DNA yield and sequence capture success. Here we report an analysis of the relationship between sample characteristics and sequencing success for nearly 8,000 herbarium specimens. This study, the largest of its kind, is also the first to include a measure of specimen quality (“greenness”) as a predictor of DNA sequencing success. We found that taxonomic group and source herbarium are strong predictors of both DNA yield and sequencing success and that the most important specimen characteristics for predicting success differ for DNA yield and sequencing: greenness was the strongest predictor of DNA yield, and age was the strongest predictor of proportion-on-target reads recovered. Surprisingly, the relationship between age and proportion-on-target reads is the inverse of expectations; older specimens performed slightly better in our capture-based protocols. We also found that DNA yield itself is not a strong predictor of sequencing success. Most literature on DNA sequencing from herbarium specimens considers specimen selection for optimal DNA extraction success, which we find to be an inappropriate metric for predicting success using next-generation sequencing technologies. 
  4. The field of plant genome sequencing has grown rapidly in the past 20 years, leading to increases in the quantity and quality of publicly available genomic resources. The growing wealth of genomic data from an increasingly diverse set of taxa provides unprecedented potential to better understand the genome biology and evolution of land plants. Here we provide a contemporary view of land plant genomics, including analyses on assembly quality, taxonomic distribution of sequenced species and national participation. We show that assembly quality has increased dramatically in recent years, that substantial taxonomic gaps exist and that the field has been dominated by affluent nations in the Global North and China, despite a wide geographic distribution of study species. We identify numerous disconnects between the native range of focal species and the national affiliation of the researchers studying them, which we argue are rooted in colonialism, both past and present. Luckily, falling sequencing costs, widening availability of analytical tools and an increasingly connected scientific community provide key opportunities to improve existing assemblies, fill sampling gaps and empower a more global plant genomics community.
  5. The hagfishes (Myxiniformes) arose from agnathan (jawless vertebrate) lineages and they are one of only two extant cyclostome taxa, together with lampreys (Petromyzontiformes). Even though whole genome sequencing has been achieved for diverse vertebrate taxa, genome-wide sequence information has been highly limited for cyclostomes. Here we sequenced the genome of the inshore hagfish Eptatretus burgeri using DNA extracted from the testis, with a short-read sequencing platform, aiming to reconstruct a high-coverage protein-coding gene catalogue. The obtained genome assembly, scaffolded with mate-pair reads and paired RNA-seq reads, exhibited an N50 scaffold length of 293 Kbp, which allowed the genome-wide prediction of coding genes. This computation resulted in the gene models whose completeness was estimated at the complete coverage of more than 83 % and the partial coverage of more than 93 % by referring to evolutionarily conserved single-copy orthologs. The high contiguity of the assembly and completeness of the gene models promise a high utility in various comparative analyses including phylogenomics and phylome exploration. 