-
Abstract

Summary: Admixture is a fundamental process that has shaped levels and patterns of genetic variation in human populations. RFMIX version 2 (RFMIX2) utilizes a robust modeling approach to identify the genetic ancestries in admixed populations. However, this software does not have a built-in method to visually summarize the results of analyses. Here, we introduce the AncestryGrapher toolkit, which converts the numerical output of RFMIX2 into graphical representations of global and local ancestry (i.e. the per-individual ancestry components and the genetic ancestry along chromosomes, respectively).

Results: To demonstrate the utility of our methods, we applied the AncestryGrapher toolkit to visualize the global and local ancestry of individuals in the North African Mozabite Berber population from the Human Genome Diversity Panel. Our results showed that the Mozabite Berbers derived their ancestry from the Middle East, Europe, and sub-Saharan Africa (global ancestry). We also found that the population origin of ancestry varied considerably along chromosomes (local ancestry). For example, we observed variance in local ancestry in the genomic region on Chromosome 2 containing the regulatory sequence in the MCM6 gene associated with lactase persistence, a human trait tied to the cultural development of adult milk consumption. Overall, the AncestryGrapher toolkit facilitates the exploration, interpretation, and reporting of ancestry patterns in human populations.

Availability and implementation: The AncestryGrapher toolkit is free and open source at https://github.com/alisi1989/RFmix2-Pipeline-to-plot.
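The relationship between local and global ancestry described above can be illustrated with a minimal sketch. It assumes local ancestry has already been reduced to a list of (start, end, ancestry label) segments. That layout, the `Segment` type, and the `global_ancestry` function are hypothetical and are not the RFMIX2 output format or the AncestryGrapher implementation. The sketch computes each ancestry's share of the total segment length, which is one simple way to summarize local calls as per-individual proportions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical layout: (start_bp, end_bp, ancestry_label) for one haplotype.
Segment = Tuple[int, int, str]


def global_ancestry(segments: Iterable[Segment]) -> Dict[str, float]:
    """Summarize local-ancestry segments as length-weighted proportions."""
    lengths: Dict[str, float] = defaultdict(float)
    for start, end, label in segments:
        if end <= start:
            raise ValueError(f"segment has non-positive length: {(start, end, label)}")
        lengths[label] += end - start
    total = sum(lengths.values())
    if total == 0:
        raise ValueError("no segments supplied")
    # Each ancestry's proportion is its covered length over the total length.
    return {label: length / total for label, length in lengths.items()}


# Toy input for illustration only; these coordinates are not real data.
toy_segments = [
    (0, 50, "Europe"),
    (50, 75, "Middle East"),
    (75, 100, "sub-Saharan Africa"),
]
proportions = global_ancestry(toy_segments)
```

Weighting by physical length is a simplification; an analysis could instead weight by genetic distance or by the number of markers in each segment.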
-
Statistical methods that measure the extent of haplotype homozygosity on chromosomes have been highly informative for identifying episodes of recent selection. For example, the integrated haplotype score (iHS) and the extended haplotype homozygosity (EHH) statistics detect long-range haplotype structure around derived and ancestral alleles indicative of classic and soft selective sweeps, respectively. However, to our knowledge, there are currently no publicly available methods that classify ancestral and derived alleles in genomic datasets for the purpose of quantifying the extent of haplotype homozygosity. Here, we introduce the Polaris package, which polarizes chromosomal variants into ancestral and derived alleles and creates corresponding genetic maps for analysis by selscan and HaploSweep, two versatile haplotype-based programs that perform scans for selection. With the input files generated by Polaris, selscan and/or HaploSweep can produce the appropriate sign (either positive or negative) for outlier iHS statistics, enabling users to distinguish between selection on derived or ancestral alleles. In addition, Polaris can convert the numerical output of these analyses into graphical representations of selective sweeps, increasing the functionality of our software. To demonstrate the utility of our approach, we applied the Polaris package to Chromosome 2 in the European Finnish, Middle Eastern Bedouin, and East African Maasai populations. More specifically, we examined the regulatory sequence in intron 13 of the MCM6 gene associated with lactase persistence (i.e. the ability to digest the lactose sugar present in fresh milk), a region of intense interest to human evolutionary geneticists. Our analyses showed that derived alleles (at known enhancers for lactase expression) sit on an extended haplotype background in the Finnish, Bedouin, and Maasai consistent with a classic selective sweep model as determined by iHS and EHH statistics. 
Importantly, we were able to immediately identify this target allele under selection based on the information generated by our software. We also explored outlier statistics across Chromosome 2 in two distinct datasets from these populations: (i) one containing polarized alleles generated with Polaris and (ii) the other containing unpolarized alleles in the original phased VCF file. Here, we found an excess of outlier statistics on Chromosome 2 in the unpolarized datasets, raising the possibility that a subset of these “hits” of selection may be unreliable. Overall, Polaris is a versatile package that enables users to efficiently explore, interpret, and report signals of recent selection in genomic datasets.
Free, publicly-accessible full text available April 12, 2026
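The idea of polarizing variants into ancestral and derived alleles can be sketched as follows. This is an illustration under stated assumptions, not the Polaris implementation: the function name `polarize_site` is hypothetical, sites are assumed to be biallelic, haplotypes are assumed to be coded 0 for the reference allele and 1 for the alternate allele, and the ancestral allele is assumed to come from some external annotation. After recoding, 0 always means ancestral and 1 always means derived, which is what lets the sign of a statistic such as iHS be interpreted consistently.

```python
from typing import List, Optional, Tuple


def polarize_site(
    ref: str,
    alt: str,
    ancestral: Optional[str],
    haplotypes: List[int],
) -> Optional[Tuple[str, str, List[int]]]:
    """Recode one biallelic site so that 0 = ancestral and 1 = derived.

    Returns (ancestral_allele, derived_allele, recoded_haplotypes), or None
    when the ancestral state is unknown or matches neither allele.
    """
    if ancestral is None:
        return None
    anc = ancestral.upper()
    if anc == ref.upper():
        # Reference is already ancestral: keep the coding unchanged.
        return ref, alt, list(haplotypes)
    if anc == alt.upper():
        # Alternate is ancestral: swap the alleles and flip every haplotype.
        return alt, ref, [1 - h for h in haplotypes]
    # Ancestral state matches neither allele: the site cannot be polarized.
    return None


# Toy site for illustration only: REF=A, ALT=G, ancestral allele is G.
flipped = polarize_site("A", "G", "G", [0, 1, 1, 0])
```

Sites that return None would typically be dropped before a scan, since an unpolarized site gives no basis for deciding which allele is derived.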
