Abstract Objectives: Most research in human dental age estimation has focused on point estimates of age, and most research on dental development theories has focused on morphology or eruption. Correlations between developing teeth using ordinal staging have received less attention. The effect of demographic variables on these correlations is unknown. I tested the effect of reference sample demographic variables on the residual correlation matrix using the lens of cooperative genetic interaction (CGI). Materials and Methods: The sample consisted of Moorrees et al., Journal of Dental Research, 1963, 42, 1490–1502, scores of left mandibular permanent teeth from panoramic radiographs of 880 London children 3–22.99 years of age stratified by year of age, sex, and Bangladeshi or European ancestry. A multivariate cumulative probit model was fit to each sex/ancestry group (n = 220), each sex or ancestry (n = 440), and all individuals (n = 880). Residual correlation matrices from nine reference sample configurations were compared using Bartlett's tests of between‐sample difference matrices against the identity matrix, hierarchical cluster analysis, and dendrogram cophenetic correlations. Results: Bartlett's test results were inconclusive. Cluster analysis showed clustering by tooth class, position within class, and developmental timing. Clustering patterns and dendrogram correlations showed similarity by sex but not ancestry. Discussion: Expectations of CGI were supported for developmental staging. This supports using CGI as a model for explaining patterns of variation within the dentition. Sex was found to produce consistent patterns of dental correlations, whereas ancestry did not. Clustering by timing of development supports phenotypic plasticity in the dentition and suggests shared environment over genetic ancestry to explain population differences.
AncestryGrapher toolkit: Python command-line pipelines to visualize global- and local- ancestry inferences from the RFMIX version 2 software
Abstract Summary: Admixture is a fundamental process that has shaped levels and patterns of genetic variation in human populations. RFMIX version 2 (RFMIX2) utilizes a robust modeling approach to identify the genetic ancestries in admixed populations. However, this software does not have a built-in method to visually summarize the results of analyses. Here, we introduce the AncestryGrapher toolkit, which converts the numerical output of RFMIX2 into graphical representations of global and local ancestry (i.e. the per-individual ancestry components and the genetic ancestry along chromosomes, respectively). Results: To demonstrate the utility of our methods, we applied the AncestryGrapher toolkit to visualize the global and local ancestry of individuals in the North African Mozabite Berber population from the Human Genome Diversity Panel. Our results showed that the Mozabite Berbers derived their ancestry from the Middle East, Europe, and sub-Saharan Africa (global ancestry). We also found that the population origin of ancestry varied considerably along chromosomes (local ancestry). For example, we observed variance in local ancestry in the genomic region on Chromosome 2 containing the regulatory sequence in the MCM6 gene associated with lactase persistence, a human trait tied to the cultural development of adult milk consumption. Overall, the AncestryGrapher toolkit facilitates the exploration, interpretation, and reporting of ancestry patterns in human populations. Availability and implementation: The AncestryGrapher toolkit is free and open source on https://github.com/alisi1989/RFmix2-Pipeline-to-plot.
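As a rough illustration of what "converting numerical output into a graphical representation of global ancestry" can mean, the sketch below turns per-individual ancestry proportions into a text-based stacked bar. It is not the AncestryGrapher code. The table layout (a `#sample` header row followed by one tab-separated row per individual), the population labels, the sample names, and the function names `parse_global_ancestry` and `text_bar` are all assumptions made for this example.

```python
def parse_global_ancestry(text):
    """Parse a tab-separated table of per-individual ancestry proportions.

    Assumed layout (hypothetical): optional '#' comment lines, one header
    line starting with '#sample', then one row per individual.
    """
    populations, rows = [], {}
    for line in text.strip().splitlines():
        fields = line.strip().split("\t")
        if line.startswith("#sample"):
            populations = fields[1:]          # reference population labels
        elif line.startswith("#") or not line.strip():
            continue                          # skip comments and blank lines
        else:
            rows[fields[0]] = [float(x) for x in fields[1:]]
    return populations, rows


def text_bar(proportions, symbols, width=40):
    """Render one individual's proportions as a stacked bar of characters."""
    counts = [round(p * width) for p in proportions]
    return "".join(sym * n for sym, n in zip(symbols, counts))


# Hypothetical input; the values are made up for illustration only.
EXAMPLE = (
    "#hypothetical global ancestry table\n"
    "#sample\tAFR\tEUR\tMDE\n"
    "ind1\t0.25\t0.25\t0.50\n"
    "ind2\t0.10\t0.30\t0.60\n"
)

populations, rows = parse_global_ancestry(EXAMPLE)
for sample, props in rows.items():
    # One symbol per population: A = AFR, E = EUR, M = MDE
    print(sample, text_bar(props, "AEM", width=20))
```

A real plot would replace `text_bar` with a stacked bar chart, one bar per individual and one color per reference population.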
- Award ID(s): 2221920
- PAR ID: 10553569
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Bioinformatics
- Volume: 40
- Issue: 11
- ISSN: 1367-4811
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Aim: Natural selection typically results in the homogenization of reproductive traits, reducing natural variation within populations; thus, highly polymorphic species present unresolved questions regarding the mechanisms that shape and maintain gene flow given a diversity of phenotypes. We used an integrative framework to characterize phenotypic diversity and assess how evolutionary history and population genetics affect the highly polymorphic nature of a California endemic lily. Location: California, United States. Taxon: Butterfly mariposa lily, Calochortus venustus (Liliaceae). Methods: We summarized phenotypic diversity at both metapopulation and subpopulation scales to explore spatial phenotypic distributions. We sampled 174 individuals across the species range representing multiple samples for each population and each phenotype. We used restriction‐site‐associated DNA sequencing (RAD‐Seq) to detect population clusters, gene flow between phenotypes and between populations, infer haplotype networks, and reconstruct ancestral range evolution to infer historical migration and range expansion. Results: Polymorphic floral traits within the species such as petal pigmentation and distal spots are geographically structured, and inferred evolutionary history is consistent with a ring species pattern involving a complex of populations having experienced sequential change in genetic and phenotypic variation from the founding population. Populations remain interconnected yet have differentiated from each other along a bifurcating south‐to‐north range expansion, consequently indicating parallel evolution towards the white morphotype in the northern range. Thus, our phylogeographical analyses reveal morphological convergence with population genetic cohesion irrespective of phenotypic diversity.
Main conclusions: Phenotypic variation in the highly polymorphic Calochortus venustus is not due to genetic differentiation between phenotypes; rather there is genetic cohesion within six geographically defined populations, some of which maintain a high level of within‐population phenotypic diversity. Our results demonstrate that analyses of polymorphic taxa greatly benefit from disentangling phenotype from genotype at various spatial scales. We discuss results in light of ring species concepts and the need to determine the adaptive significance of the patterns we report.
-
Abstract The rapid improvements in genomic sequencing technology have led to the proliferation of locally collected genomic datasets. Given the sensitivity of genomic data, it is crucial to conduct collaborative studies while preserving the privacy of the individuals. However, before starting any collaborative research effort, the quality of the data needs to be assessed. One of the essential steps of the quality control process is population stratification: identifying the presence of genetic difference in individuals due to subpopulations. One of the common methods used to group genomes of individuals based on ancestry is principal component analysis (PCA). In this article, we propose a privacy-preserving framework which utilizes PCA to assign individuals to populations across multiple collaborators as part of the population stratification step. In our proposed client-server-based scheme, we initially let the server train a global PCA model on a publicly available genomic dataset which contains individuals from multiple populations. The global PCA model is later used to reduce the dimensionality of the local data by each collaborator (client). After adding noise to achieve local differential privacy (LDP), the collaborators send metadata (in the form of their local PCA outputs) about their research datasets to the server, which then aligns the local PCA results to identify the genetic differences among collaborators’ datasets. Our results on real genomic data show that the proposed framework can perform population stratification analysis with high accuracy while preserving the privacy of the research participants.
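The client-side step described in this abstract (project local data with a server-trained PCA model, then add noise before sending) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scheme: the abstract does not name the noise mechanism, so the Laplace mechanism with clipping is assumed here, and the function names, the clipping bound, and the toy numbers are all hypothetical.

```python
import random


def project(genotypes, components, mean):
    """Project one individual's genotype vector onto the global principal
    components (assumed to be supplied by the server along with the mean)."""
    centered = [g - m for g, m in zip(genotypes, mean)]
    return [sum(c * w for c, w in zip(centered, comp)) for comp in components]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(scores, bound, epsilon, rng):
    """Clip each score to [-bound, bound] and add Laplace noise.

    Assumption: with k clipped coordinates, any two inputs differ by at most
    2 * bound * k in L1 norm, so this scale gives epsilon-LDP for the vector.
    """
    clipped = [max(-bound, min(bound, s)) for s in scores]
    scale = 2.0 * bound * len(clipped) / epsilon
    return [s + laplace_noise(scale, rng) for s in clipped]


# Toy numbers for illustration only (three variants, two components).
rng = random.Random(0)
mean = [1.0, 1.0, 1.0]
components = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
scores = project([2.0, 0.0, 1.0], components, mean)
noisy = privatize(scores, bound=3.0, epsilon=1.0, rng=rng)
print(scores, noisy)
```

Smaller values of `epsilon` give stronger privacy but noisier scores, which is the accuracy and privacy trade-off the abstract refers to.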
-
Abstract Objectives: Since 2010, genome‐wide data from hundreds of ancient Native Americans have contributed to the understanding of Americas' prehistory. However, these samples have never been studied as a single dataset, and distinct relationships among themselves and with present‐day populations may have never come to light. Here, we reassess genomic diversity and population structure of 223 ancient Native Americans published between 2010 and 2019. Materials and Methods: The genomic data from ancient Americas was merged with a worldwide reference panel of 278 present‐day genomes from the Simons Genome Diversity Project and then analyzed through ADMIXTURE, D‐statistics, PCA, t‐SNE, and UMAP. Results: We find largely similar population structures in ancient and present‐day Americas. However, the population structure of contemporary Native Americans, traced here to at least 10,000 years before present, is noticeably less diverse than their ancient counterparts, a possible outcome of the European contact. Additionally, in the past there were greater levels of population structure in North than in South America, except for ancient Brazil, which harbors comparatively high degrees of structure. Moreover, we find a component of genetic ancestry in the ancient dataset that is closely related to that of present‐day Oceanic populations but does not correspond to the previously reported Australasian signal. Lastly, we report an expansion of the Ancient Beringian ancestry, previously reported for only one sample. Discussion: Overall, our findings support a complex scenario for the settlement of the Americas, accommodating the occurrence of founder effects and the emergence of ancestral mixing events at the regional level.
-
Abstract Non‐random mating among individuals can lead to spatial clustering of genetically similar individuals and population stratification. This deviation from panmixia is commonly observed in natural populations. Consequently, individuals can have parentage in single populations or involving hybridization between differentiated populations. Accounting for this mixture and structure is important when mapping the genetics of traits and learning about the formative evolutionary processes that shape genetic variation among individuals and populations. Stratified genetic relatedness among individuals is commonly quantified using estimates of ancestry that are derived from a statistical model. Development of these models for polyploid and mixed‐ploidy individuals and populations has lagged behind those for diploids. Here, we extend and test a hierarchical Bayesian model, called entropy, which can use low‐depth sequence data to estimate genotype and ancestry parameters in autopolyploid and mixed‐ploidy individuals (including sex chromosomes and autosomes within individuals). Our analysis of simulated data illustrated the trade‐off between sequencing depth and genome coverage and found lower error associated with low‐depth sequencing across a larger fraction of the genome than with high‐depth sequencing across a smaller fraction of the genome. The model has high accuracy and sensitivity as verified with simulated data and through analysis of admixture among populations of diploid and tetraploid Arabidopsis arenosa.
