Title: Recent advances in conservation and population genomics data analysis
Abstract

New computational methods and next‐generation sequencing (NGS) approaches have enabled the use of thousands or hundreds of thousands of genetic markers to address previously intractable questions. The methods and massive marker sets present both new data analysis challenges and opportunities to visualize, understand, and apply population and conservation genomic data in novel ways. The large scale and complexity of NGS data also increase the expertise and effort required to thoroughly and thoughtfully analyze and interpret data. To aid in this endeavor, a recent workshop entitled “Population Genomic Data Analysis,” also known as “ConGen 2017,” was held at the University of Montana. The ConGen workshop brought together 15 instructors with expertise in a wide range of topics including NGS data filtering, genome assembly, genomic monitoring of effective population size, migration modeling, detecting adaptive genomic variation, genomewide association analysis, inbreeding depression, and landscape genomics. Here, we summarize the major themes of the workshop and the important take‐home points that were offered to students throughout. We emphasize increasing participation by women in population and conservation genomics as a vital step for the advancement of science. Some important themes that emerged during the workshop included the need for data visualization and its importance in finding problematic data, the effects of data filtering choices on downstream population genomic analyses, the increasing availability of whole‐genome sequencing, and the new challenges it presents. Our goal here is to help motivate and educate a worldwide audience to improve population genomic data analysis and interpretation, and thereby advance the contribution of genomics to molecular ecology, evolutionary biology, and especially to the conservation of biodiversity.

 
Award ID(s):
1655809 1639014
NSF-PAR ID:
10245628
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Evolutionary Applications
Volume:
11
Issue:
8
ISSN:
1752-4571
Page Range / eLocation ID:
p. 1197-1211
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The increasing availability and complexity of next-generation sequencing (NGS) data sets make ongoing training an essential component of conservation and population genetics research. A workshop entitled “ConGen 2018” was recently held to train researchers in conceptual and practical aspects of NGS data production and analysis for conservation and ecological applications. Sixteen instructors provided helpful lectures, discussions, and hands-on exercises regarding how to plan, produce, and analyze data for many important research questions. Lecture topics ranged from understanding probabilistic (e.g., Bayesian) genotype calling to the detection of local adaptation signatures from genomic, transcriptomic, and epigenomic data. We report on progress in addressing central questions of conservation genomics, advances in NGS data analysis, the potential for genomic tools to assess adaptive capacity, and strategies for training the next generation of conservation genomicists. 
  2. Abstract

    The boom of massively parallel sequencing (MPS) technology and its applications in the conservation of natural and managed populations brings new opportunities and challenges for the scientific questions that can be addressed. Genomic conservation offers a wide range of approaches and analytical techniques, with their respective strengths and weaknesses that rely on several implicit assumptions. However, finding the most suitable approaches and analyses for a given scientific question is often difficult and time‐consuming. To address this gap, a recent workshop entitled ‘ConGen 2015’ was held at the University of Montana in order to bring together the knowledge accumulated in this field and to provide training in conceptual and practical aspects of data analysis applied to the field of conservation and evolutionary genomics. Here, we summarize the expertise shared by each instructor, which has led us to consider the importance of keeping the scientific question in mind from sampling to management practices, along with the selection of appropriate genomics tools and bioinformatics challenges.

     
  3. Abstract

    The Targeting Induced Local Lesions in Genomes (TILLING) technology is a reverse genetic strategy broadly applicable to every kind of genome and represents an attractive tool for functional genomic and agronomic applications. It consists of chemical random mutagenesis followed by high-throughput screening of point mutations in targeted genomic regions. Although multiple methods for mutation discovery in amplicons have been described, next-generation sequencing (NGS) is the tool of choice for mutation detection because it quickly allows for the analysis of a large number of amplicons. The aim of the present work was to screen a previously generated sunflower TILLING population and identify alterations in genes involved in several important and complex physiological processes. Twenty-one candidate sunflower genes were chosen as targets for the screening. The TILLING by sequencing strategy allowed us to identify multiple mutations in selected genes and we subsequently validated 16 mutations in 11 different genes through Sanger sequencing. In addition to addressing challenges posed by outcrossing, our detection and validation of mutations in multiple regulatory loci highlights the importance of this sunflower population as a genetic resource.

     
  4. Abstract

    Long-read sequencing is revolutionizing de-novo genome assemblies, with continued advancements making it more readily available for previously understudied, non-model organisms. Stony corals are one such example, with long-read de-novo genome assemblies now starting to be publicly available, opening the door for a wide array of ‘omics-based research. Here we present a new de-novo genome assembly for the endangered Caribbean star coral, Orbicella faveolata, using PacBio circular consensus reads. Our genome assembly improved the contiguity (51 versus 1,933 contigs) and complete and single copy BUSCO orthologs (93.6% versus 85.3%, database metazoa_odb10), compared to the currently available reference genome generated using short-read methodologies. Our new de-novo assembled genome also showed comparable quality metrics to other coral long-read genomes. Telomeric repeat analysis identified putative chromosomes in our scaffolded assembly, with these repeats at either one, or both ends, of scaffolded contigs. We identified 32,172 protein coding genes in our assembly through use of long-read RNA sequencing (ISO-seq) of additional O. faveolata fragments exposed to a range of abiotic and biotic treatments, and publicly available short-read RNA-seq data. With anthropogenic influences heavily affecting O. faveolata, as well as its increasing incorporation into reef restoration activities, this updated genome resource can be used for population genomics and other ‘omics analyses to aid in the conservation of this species.

     
  5. Abstract

    The capability to generate densely sampled single nucleotide polymorphism (SNP) data is essential in diverse subdisciplines of biology, including crop breeding, pathology, forensics, forestry, ecology, evolution and conservation. However, the wet‐laboratory expertise and bioinformatics training required to conduct genome‐scale variant discovery remain limiting factors for investigators with limited resources.

    Here we present ISSRseq, a PCR‐based method for reduced representation of genomic variation using simple sequence repeats as priming sites to sequence inter simple sequence repeat (ISSR) regions. Briefly, ISSR regions are amplified with single primers, pooled, used to construct sequencing libraries with a commercially available kit, and sequenced on the Illumina platform. We also present a flexible bioinformatic pipeline that assembles ISSR loci, calls and hard filters variants, outputs data matrices in common formats, and conducts population analyses using R.

    Using three angiosperm species as case studies, we demonstrate that ISSRseq is highly repeatable, necessitates only simple wet‐laboratory skills and commonplace instrumentation, is flexible in terms of the number of single primers used, and can generate genomic‐scale variant discovery on par with existing RRS methods which require more complex wet‐laboratory procedures.

    ISSRseq represents a straightforward approach to SNP genotyping in any organism, and we predict that this method will be particularly useful for those studying population genomics and phylogeography of non‐model organisms. Furthermore, the ease of ISSRseq relative to other RRS methods should prove useful to those lacking advanced expertise in wet‐laboratory methods or bioinformatics.

     