Title: Integrating Linguistics, Social Structure, and Geography to Model Genetic Diversity within India
Abstract: India represents an intricate tapestry of population substructure shaped by geography, language, culture, and social stratification. Although geography closely correlates with genetic structure in other parts of the world, the strict endogamy imposed by the Indian caste system and the large number of spoken languages add further levels of complexity to understanding Indian population structure. To date, no study has attempted to model and evaluate how these factors have interacted to shape the patterns of genetic diversity within India. We merged all publicly available data from the Indian subcontinent into a data set of 891 individuals from 90 well-defined groups. Bringing together geography, genetics, and demographic factors, we developed Correlation Optimization of Genetics and Geodemographics to build a model that explains the observed population genetic substructure. We show that shared language along with social structure have been the most powerful forces in creating paths of gene flow in the subcontinent. Furthermore, we discover the ethnic groups that best capture the diverse genetic substructure using a ridge leverage score statistic. Integrating data from India with a data set of an additional 1,323 individuals from 50 Eurasian populations, we find that Indo-European and Dravidian speakers of India show shared genetic drift with Europeans, whereas the Tibeto-Burman speaking tribal groups have maximum shared genetic drift with East Asians.
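The abstract names a ridge leverage score statistic without defining it. As a rough illustration only, the sketch below uses the standard definition of a ridge leverage score for row x_i of a data matrix X, namely x_i^T (X^T X + lam*I)^(-1) x_i, where a higher score marks a row that is harder to reconstruct from the others. The toy matrix, the value of `lam`, and the function names are assumptions made for this example; the paper's actual inputs, regularization, and implementation may differ.

```python
def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs], copied so the inputs are not modified.
    aug = [[float(v) for v in matrix[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - rest) / aug[i][i]
    return z


def ridge_leverage_scores(rows, lam):
    """Return x_i^T (X^T X + lam*I)^(-1) x_i for every row x_i of X."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    d = len(rows[0])
    # Regularized Gram matrix G = X^T X + lam * I (d x d).
    gram = [
        [sum(r[a] * r[b] for r in rows) + (lam if a == b else 0.0)
         for b in range(d)]
        for a in range(d)
    ]
    scores = []
    for x in rows:
        z = solve(gram, x)  # z = G^(-1) x
        scores.append(sum(xi * zi for xi, zi in zip(x, z)))
    return scores


# Hypothetical toy data: three "groups" described by two features each.
toy_rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_scores = ridge_leverage_scores(toy_rows, lam=1.0)
print(toy_scores)
```

The elimination routine is written out only to keep the sketch dependency-free; with real genotype matrices a linear algebra library would normally be used instead. Rows with the largest scores would be the candidates for "most informative" groups under this definition.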
Award ID(s):
2006929 1661756
PAR ID:
10248338
Editor(s):
Heyer, Evelyne
Journal Name:
Molecular Biology and Evolution
Volume:
38
Issue:
5
ISSN:
1537-1719
Page Range / eLocation ID:
1809 to 1819
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The Chihuahuan Desert includes many endemic angiosperm species, some having very restricted geographic ranges. One of these species is Oreocarya crassipes (I. M. Johnst.) Hasenstab & M. G. Simpson, an endangered distylous gypsophile from the Trans-Pecos region in southern Brewster County, Texas, USA. The species is known from 10 populations, and this small number of populations, human development in the area, a distylous breeding system, and edaphic requirements threaten the long-term viability of the species. Using both hundreds of single nucleotide polymorphisms identified via tunable genotyping-by-sequencing (tGBS) and 10 microsatellite loci, patterns of genetic diversity, demography, selection, and migration were examined for 192 individuals from four populations of O. crassipes. From the sampled individuals, two populations (clusters) were identified via multiple methodologies and with both types of data. With SNP data, population substructure was further resolved among one of these populations to identify two distinct groups of individuals. Multiple individuals recognized as having mixed ancestry, along with Fst values and AMOVA results, provide evidence of genetic exchange among populations, which is less common for gypsophiles than non-gypsophiles, and the rate of migration among populations has been increasing recently. The Fst values for O. crassipes are more similar to those of other rare species than to other gypsophiles. Additionally, while distyly specifically does not necessarily impact the population genetics of the species, allogamy, which is facilitated by distyly, seems to have played a role in the genetic structure of O. crassipes. 
  2. Paleogenomic data are commonly used to test archaeological questions, but paleogenomics best informs ancient population histories when applied using a biocultural approach that contextualizes genomic analyses within a socio-historic framework. In this study, we focus on hunter-gatherer populations from southern Patagonia and Tierra del Fuego (Fuego-Patagonia) who have inhabited the region for more than 10,500 years. These groups practiced subsistence strategies that relied on either marine or terrestrial resources, or a mix of both. Some prior studies suggest that Marine and Terrestrial groups descended from the same ancestral group, while others indicate they had distinct ancestries. Here, we examined genome-wide data from 40 newly sequenced and 30 previously reported ancient Fuegian-Patagonians spanning 6,800 years, as well as sociocultural evidence from archaeological, ethnohistorical, and linguistic sources. Previous analyses of the newly sequenced individuals suggested that Marine and Terrestrial groups had distinct ancestries that diverged ~14,000 years ago. To further elucidate the genetic relationships among Terrestrial, Marine, and Mixed Economy groups, we examined population substructure using PCA, ADMIXTURE, tests of genetic distances, and f-statistics. We found that some Marine and Mixed Economy individuals from the Middle Holocene shared ancestry with Late Holocene Terrestrial groups, while Terrestrial and Marine groups from the Late Holocene showed distinct ancestries and limited admixture until Historic times. We contextualized these paleogenomic results with evidence from sociocultural sources, adding further nuance and justification to our conclusions. This study highlights the complexities of local population histories and demonstrates the importance of including sociocultural data in paleogenomic studies.
  3. Cretaceous dinosaurs were first reported from the Indian subcontinent in the late 1800s, and titanosaur sauropod and abelisauroid theropod remains are now known from central, western, and southern parts of India and from central western Pakistan. Although dinosaur remains are abundant, associated or articulated specimens are extremely rare, and so are complex skeletal elements such as cranial bones and presacral vertebrae. The historical pattern of sampling and collecting has limited the inferences about patterns of diversity, phylogenetic affinity, and paleobiogeographic relationships of Indian dinosaurs. Here we report on three titanosaur vertebrae representing regions of the skeleton that are complex and otherwise poorly represented in the Indian record, including two anterior dorsal vertebrae pertaining to a single individual from Rahioli, in Gujarat State (western India), and an anterior caudal neural arch from Bara Simla, in Madhya Pradesh State (central India). Phylogenetic analysis places the two individuals within Titanosauria, but further resolution of their affinities is precluded by their incompleteness and that of titanosaur vertebral columns in general, lack of coding of character data for titanosaur presacral and anterior caudal vertebrae, and relatively coarse understanding of the evolutionary relationships of titanosaurs. Comparisons with contemporaneous and spatially proximal titanosaurs from Indo-Pakistan, Madagascar, and South America provide insights into their affinities. The dorsal vertebrae share close affinity with Isisaurus from India and Mendozasaurus from Argentina. Few local comparisons are available for the anterior caudal vertebra, which shares characteristics with Tengrisaurus from the Early Cretaceous of Russia. 
  4. The ‘Out of India’ hypothesis is often invoked to explain patterns of distribution among Southeast Asian taxa. According to this hypothesis, Southeast Asian taxa originated in Gondwana, diverged from their Gondwanan relatives when the Indian subcontinent rifted from Gondwana in the Late Jurassic, and colonized Southeast Asia when it collided with Eurasia in the early Cenozoic. A growing body of evidence suggests these events were far more complex than previously understood, however. The first quantitative reconstruction of the biogeography of Asian forest scorpions (Scorpionidae Latreille, 1802: Heterometrinae Simon, 1879) is presented here. Divergence time estimation, ancestral range estimation, and diversification analyses are used to determine the origins, dispersal and diversification patterns of these scorpions, providing a timeline for their biogeographical history that can be summarized into four major events. (1) Heterometrinae diverged from other Scorpionidae on the African continent after the Indian subcontinent became separated in the Cretaceous. (2) Environmental stresses during the Cretaceous–Tertiary (KT) mass extinction caused range contraction, restricting one clade of Heterometrinae to refugia in southern India (the Western Ghats) and Sri Lanka (the Central Highlands). (3) Heterometrinae dispersed to Southeast Asia three times during India’s collision with Eurasia, the first dispersal event occurring as the Indian subcontinent brushed up against the western side of Sumatra, and the other two events occurring as India moved closer to Eurasia. (4) Indian Heterometrinae, confined to southern India and Sri Lanka during the KT mass extinction, recolonized the Deccan Plateau and northern India, diversifying into new, more arid habitats after environmental conditions stabilized. These hypotheses, which are congruent with the geological literature and biogeographical analyses of other taxa from South and Southeast Asia, contribute to an improved understanding of the dispersal and diversification patterns of taxa in this biodiverse and geologically complex region.
  5. Despite an increased focus on multiscale relationships and interdisciplinary integration, few macroecological studies consider the contribution of genetic-based processes to landscape-scale patterns. We test the hypothesis that tree genetics, climate, and geography jointly drive continental-scale patterns of community structure, using genome-wide SNP data from a broadly distributed foundation tree species (Populus fremontii S. Watson) and two dependent communities (leaf-modifying arthropods and fungal endophytes) spanning southwestern North America. Four key findings emerged: (1) Tree genetic structure was a significant predictor for both communities; however, the strength of influence was both scale- and community-dependent. (2) Tree genetics was the primary driver for endophytes, explaining 17% of variation in continental-scale community structure, whereas (3) climate was the strongest predictor of arthropod structure (24%). (4) Power to detect tree genotype—community phenotype associations changed with scale of genetic organization, increasing from individuals to populations to ecotypes, emphasizing the need to consider nonstationarity (i.e., changes in the effects of factors on ecological processes across scales) when inferring macrosystem properties. Our findings highlight the role of foundation tree species as drivers of macroscale community structure and provide macrosystems ecology with a theoretical framework for linking fine- and intermediate-scale genetic processes to landscape-scale patterns. Management of the genetic diversity harbored within foundation species is a critical consideration for conserving and sustaining regional biodiversity. 