Background Large (>1 Mb), polymorphic inversions have substantial impacts on population structure and the maintenance of genotypes. These large inversions can be detected from single nucleotide polymorphism (SNP) data using unsupervised learning techniques like PCA. Constructing and analyzing a feature matrix from millions of SNPs requires a large amount of memory, which limits the size of the data sets that can be analyzed. Methods We propose using feature hashing to construct a feature matrix from a VCF file of SNPs in order to reduce memory usage. The matrix is constructed in a streaming fashion so that the entire VCF file is never loaded into memory at one time. Results When evaluated on Anopheles mosquito and Drosophila fly data sets, our approach reduced memory usage by 97% with minimal reductions in accuracy for inversion detection and localization tasks. Conclusion With these changes, inversions in larger data sets can be analyzed easily and efficiently on common laptop and desktop computers. Our method is publicly available through our open-source inversion analysis software, Asaph.
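The streaming, hashed construction described in the Methods above can be illustrated with a minimal sketch. It is not Asaph's actual implementation: the function names `hash_feature` and `hashed_matrix_from_vcf`, the `n_buckets` parameter, the `chrom:pos:allele` feature naming, and the use of a signed MD5-based hash are all assumptions made for illustration. The sketch reads a VCF one line at a time and adds each observed allele to a fixed-width row per sample, so memory depends on the number of samples and buckets rather than on the number of SNPs.

```python
import hashlib
import io


def hash_feature(key, n_buckets):
    """Map a feature name to (column, sign) with a stable hash (hypothetical helper)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "little")
    column = (value >> 1) % n_buckets   # which hashed column receives the feature
    sign = 1 if value & 1 else -1       # signed hashing: collisions tend to cancel
    return column, sign


def hashed_matrix_from_vcf(lines, n_buckets=1024):
    """Stream VCF text lines into a samples x n_buckets matrix of hashed allele counts."""
    samples, matrix = [], None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                    # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]        # sample names follow the 9 fixed VCF columns
            matrix = [[0] * n_buckets for _ in samples]
            continue
        if matrix is None:
            raise ValueError("no #CHROM header line before the first record")
        chrom, pos = fields[0], fields[1]
        for row, call in zip(matrix, fields[9:]):
            genotype = call.split(":")[0].replace("|", "/")
            for allele in genotype.split("/"):
                if allele == ".":
                    continue            # missing allele call contributes nothing
                column, sign = hash_feature(f"{chrom}:{pos}:{allele}", n_buckets)
                row[column] += sign
    return samples, matrix


# Tiny made-up VCF used only to exercise the sketch.
example_vcf = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
    "2L\t100\t.\tA\tG\t.\t.\t.\tGT\t0/0\t0/1",
    "2L\t200\t.\tC\tT\t.\t.\t.\tGT\t1/1\t./.",
])
samples, matrix = hashed_matrix_from_vcf(io.StringIO(example_vcf), n_buckets=8)
```

Any iterable of text lines works as input, so an open file handle can be passed in place of the `io.StringIO` object. The resulting matrix could then be handed to PCA or another unsupervised method. Choosing a smaller `n_buckets` saves memory at the cost of more hash collisions.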
Segmenting and Genotyping Large, Polymorphic Inversions
Large, polymorphic inversions can contribute to population structure and enable mutually exclusive adaptations to survive in the same population. Current methods for detecting inversions from single-nucleotide polymorphisms (SNPs) called from population genomics data require an experienced human user to prepare the data and interpret the results. Ideally, these methods would be completely automated and robust enough to be used by inexperienced users. Towards this goal, automated approaches for segmenting inversions and inferring sample genotypes are introduced and evaluated on chromosomes from flies, mosquitoes, and prairie sunflowers.
- Award ID(s): 1947257
- PAR ID: 10463483
- Date Published:
- Journal Name: 2023 IEEE International Conference on Electro Information Technology (eIT)
- Page Range / eLocation ID: 153 to 162
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Across many species where inversions have been implicated in local adaptation, genomes often evolve to contain multiple, large inversions that arise early in divergence. Why this occurs has yet to be resolved. To address this gap, we built forward-time simulations in which inversions have flexible characteristics and can invade a metapopulation undergoing spatially divergent selection for a highly polygenic trait. In our simulations, inversions typically arose early in divergence, captured standing genetic variation upon mutation, and then accumulated many small-effect loci over time. Under special conditions, inversions could also arise late in adaptation and capture locally adapted alleles. Polygenic inversions behaved similarly to a single supergene of large effect and were detectable by genome scans. Our results show that characteristics of adaptive inversions found in empirical studies (e.g. multiple large, old inversions that are FST outliers, sometimes overlapping with other inversions) are consistent with a highly polygenic architecture, and inversions do not need to contain any large-effect genes to play an important role in local adaptation. By combining a population and quantitative genetic framework, our results give a deeper understanding of the specific conditions needed for inversions to be involved in adaptation when the genetic architecture is polygenic. This article is part of the theme issue ‘Genomic architecture of supergenes: causes and evolutionary consequences’.
-
Abstract Acoustic source inversions estimate the mass flow rate of volcanic explosions or yield of chemical explosions and provide insight into potential source directionality. However, the limitations of applying these methods to complex sources and their ability to resolve a stable solution have not been investigated in detail. We perform synthetic infrasound waveform inversions that use 3‐D Green’s functions for a variety of idealized and realistic deployment scenarios using both a flat plane and Yasur volcano, Vanuatu as examples. We investigate the ability of various scenarios to retrieve the input source functions and relative amplitudes for monopole and multipole (monopole and dipole) inversions. Infrasound waveform inversions appear to be a robust method to quantify mass flow rates from simple sources (monopole) using deployments of infrasound sensors placed around a source, but care should be taken when analyzing and interpreting results from more complex acoustic sources (multipole) that have significant directional components. In the examples we consider, the solution is stable for monopole inversions when the signal‐to‐noise ratio is greater than five and the dipole component is small. For most scenarios investigated, the vertical dipole component of the multipole explosion source is poorly constrained and can impact the ability to recover the other source term components. Because multipole inversions are ill‐posed for many deployments, a low residual does not necessarily mean the proper source vector has been recovered. Synthetic studies can help investigate the limitations and place bounds on information that may be missing using monopole and multipole inversions for potentially directional sources.
-
Abstract Large structural variants in the genome, such as inversions, may play an important role in producing population structure and local adaptation to the environment through suppression of recombination. However, relatively few studies have linked inversions to phenotypic traits that are sexually selected and may play a role in reproductive isolation. Here, we found that geographic differences in the sexually selected plumage of a warbler, the common yellowthroat (Geothlypis trichas), are largely due to differences in the Z (sex) chromosome (males are ZZ), which contains at least one putative inversion spanning 40% (31/77 Mb) of its length. The inversions on the Z chromosome vary dramatically east and west of the Appalachian Mountains, which provides evidence of cryptic population structure within the range of the most widespread eastern subspecies (G. t. trichas). In an eastern (New York) and a western (Wisconsin) population of this subspecies, females prefer different male ornaments; larger black facial masks are preferred in Wisconsin and larger yellow breasts are preferred in New York. The putative inversion also contains genes related to vision, which could influence mating preferences. Thus, structural variants on the Z chromosome are associated with geographic differences in male ornaments and female choice, which may provide a mechanism for maintaining different patterns of sexual selection in spite of gene flow between populations of the same subspecies.
-
The morphometrics of fish otoliths have been commonly used to investigate population structures and the environmental impacts on ontogeny. These studies can require hundreds if not thousands of otoliths to be collected and processed. Processing these otoliths takes up valuable time, money, and resources that can be saved by automation. These structures also contain relevant information in three dimensions that is lost with 2D morphometric methods from photographic analysis. In this study, the otoliths of three populations of Coho Salmon (Oncorhynchus kisutch) were examined with manual 2D, automated 2D, and automated 3D otolith measurement methods. The automated 3D method was able to detect an 8% difference in average otolith density, while 2D methods could not. Due to the loss of information in the z-axis, and the longer processing time, 2D methods can take up to 100 times longer to reach the same statistical power as automated 3D methods. Automated 3D methods are faster, can answer a wider range of questions, and allow fisheries scientists to automate rather monotonous tasks.

