
Title: Population Genomics Training for the Next Generation of Conservation Geneticists: ConGen 2018 Workshop
Abstract The increasing availability and complexity of next-generation sequencing (NGS) data sets make ongoing training an essential component of conservation and population genetics research. A workshop entitled “ConGen 2018” was recently held to train researchers in conceptual and practical aspects of NGS data production and analysis for conservation and ecological applications. Sixteen instructors provided helpful lectures, discussions, and hands-on exercises regarding how to plan, produce, and analyze data for many important research questions. Lecture topics ranged from understanding probabilistic (e.g., Bayesian) genotype calling to the detection of local adaptation signatures from genomic, transcriptomic, and epigenomic data. We report on progress in addressing central questions of conservation genomics, advances in NGS data analysis, the potential for genomic tools to assess adaptive capacity, and strategies for training the next generation of conservation genomicists.
Authors:
Award ID(s):
1655809 1639014
Publication Date:
NSF-PAR ID:
10185993
Journal Name:
Journal of Heredity
Volume:
111
Issue:
2
Page Range or eLocation-ID:
227 to 236
ISSN:
0022-1503
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background Significant progress has been made in advancing and standardizing tools for human genomic and biomedical research. Yet, the field of next-generation sequencing (NGS) analysis for microorganisms (including multiple pathogens) remains fragmented, lacks accessible and reusable tools, is hindered by local computational resource limitations, and does not offer widely accepted standards. One such “problem area” is the analysis of Transposon Insertion Sequencing (TIS) data. TIS allows probing of almost the entire genome of a microorganism by introducing random insertions of transposon-derived constructs. The impact of the insertions on the survival and growth under specific conditions provides precise information about genes affecting specific phenotypic characteristics. A wide array of tools has been developed to analyze TIS data. Among the variety of options available, it is often difficult to identify which one can provide a reliable and reproducible analysis. Results Here we sought to understand the challenges and propose reliable practices for the analysis of TIS experiments. Using data from two recent TIS studies, we have developed a series of workflows that include multiple tools for data de-multiplexing, promoter sequence identification, transposon flank alignment, and read count repartition across the genome. Particular attention was paid to quality control procedures, such as determining the optimal tool parameters for the analysis and removal of contamination. Conclusions Our work provides an assessment of the currently available tools for TIS data analysis. It offers ready-to-use workflows that can be invoked by anyone in the world using our public Galaxy platform ( https://usegalaxy.org ). To lower the entry barriers, we have also developed interactive tutorials explaining details of TIS data analysis procedures at https://bit.ly/gxy-tis .
  2. Abstract Background Next-generation sequencing (NGS) is widely used for genome-wide identification and quantification of DNA elements involved in the regulation of gene transcription. Studies that generate multiple high-throughput NGS datasets require data integration methods for two general tasks: 1) generation of genome-wide data tracks representing an aggregate of multiple replicates of the same experiment; and 2) combination of tracks from different experimental types that provide complementary information regarding the location of genomic features such as enhancers. Results NGS-Integrator is a Java-based command line application, facilitating efficient integration of multiple genome-wide NGS datasets. NGS-Integrator first transforms all input data tracks using the complement of the minimum Bayes’ factor so that all values are expressed in the range [0,1] representing the probability of a true signal given the background noise. Then, NGS-Integrator calculates the joint probability for every genomic position to create an integrated track. We provide examples using real NGS data generated in our laboratory and from the mouse ENCODE database. Conclusions Our results show that NGS-Integrator is both time- and memory-efficient. Our examples show that NGS-Integrator can integrate information to facilitate downstream analyses that identify functional regulatory domains along the genome.
  3. Next-generation sequencing (NGS) technologies - Illumina RNA-seq, Pacific Biosciences isoform sequencing (PacBio Iso-seq), and Oxford Nanopore direct RNA sequencing (DRS) - have revealed the complexity of plant transcriptomes and their regulation at the co-/post-transcriptional level. Global analysis of mature mRNAs, transcripts from nuclear run-on assays, and nascent chromatin-bound mRNAs using short as well as full-length and single-molecule DRS reads has uncovered potential roles of different forms of RNA polymerase II during the transcription process, and the extent of co-transcriptional pre-mRNA splicing and polyadenylation. These tools have also allowed mapping of transcriptome-wide start sites in cap-containing RNAs, poly(A) site choice, poly(A) tail length, and RNA base modifications. The emerging theme from recent studies is that reprogramming of gene expression in response to developmental cues and stresses at the co-/post-transcriptional level likely plays a crucial role in eliciting appropriate responses for optimal growth and plant survival under adverse conditions. Although the mechanisms by which developmental cues and different stresses regulate co-/post-transcriptional splicing are largely unknown, a few recent studies indicate that the external cues target spliceosomal and splicing regulatory proteins to modulate alternative splicing. In this review, we provide an overview of recent discoveries on the dynamics and complexities of plant transcriptomes, mechanistic insights into splicing regulation, and discuss critical gaps in co-/post-transcriptional research that need to be addressed using diverse genomic and biochemical approaches.
  4. Abstract The Targeting Induced Local Lesions in Genomes (TILLING) technology is a reverse genetic strategy broadly applicable to every kind of genome and represents an attractive tool for functional genomic and agronomic applications. It consists of chemical random mutagenesis followed by high-throughput screening of point mutations in targeted genomic regions. Although multiple methods for mutation discovery in amplicons have been described, next-generation sequencing (NGS) is the tool of choice for mutation detection because it quickly allows for the analysis of a large number of amplicons. The aim of the present work was to screen a previously generated sunflower TILLING population and identify alterations in genes involved in several important and complex physiological processes. Twenty-one candidate sunflower genes were chosen as targets for the screening. The TILLING by sequencing strategy allowed us to identify multiple mutations in selected genes and we subsequently validated 16 mutations in 11 different genes through Sanger sequencing. In addition to addressing challenges posed by outcrossing, our detection and validation of mutations in multiple regulatory loci highlights the importance of this sunflower population as a genetic resource.
  5. Background: In spring of 2019, 2 positive sputum cases of Pseudomonas aeruginosa in the cardiac critical care unit (CCU) were reported to the UFHJ infection prevention (IP) department. The initial 2 cases, detected within 3 days of each other, were followed shortly by a third case. Epidemiological evidence was initially consistent with a hospital-acquired infection (HAI): 2 of the 3 patients roomed next to each other, and all 3 patients were ventilated, 2 of whom shared the same respiratory therapist. However, no other changes in routine or equipment were noted. The samples were cultured and processed using Illumina NGS technology, generating 1–2 million short (ie, 250-bp) reads across the P. aeruginosa genome. As an additional positive control, 8 P. aeruginosa NGS data sets, previously shown to be from a single outbreak in a UK facility, were included. Reads were mapped back to a reference sequence, and single-nucleotide polymorphisms (SNPs) between each sample and the reference were extracted. Genetic distances (ie, the number of unshared SNPs) between all UFHJ and UK samples were calculated. Genetic linkage was determined using hierarchical clustering, based on a commonly used threshold of 40 SNPs. All UFHJ patient samples were separated by >18,000 SNPs, indicating genetically distinct samples from separate sources. In contrast, UK samples were separated from each other by <16 SNPs, consistent with genetic linkage and a single outbreak. Furthermore, the UFHJ samples were separated from the UK samples by >17,000 SNPs, indicating a lack of geographical distinction of the UFHJ samples (Fig. 1). These results demonstrated that while the initial epidemiological evidence pointed towards a single HAI, the high-precision and relatively inexpensive (