

Title: Comparing genome‐based estimates of relatedness for use in pedigree‐based conservation management
Abstract

Researchers have long debated which estimator of relatedness best captures the degree of relationship between two individuals. In the genomics era, this debate continues, with relatedness estimates being sensitive to the methods used to generate markers, marker quality, and levels of diversity in sampled individuals. Here, we compare six commonly used genome‐based relatedness estimators (kinship genetic distance [KGD], Wang maximum likelihood [TrioML], Queller and Goodnight [Rxy], Kinship INference for Genome‐wide association studies [KING‐robust], pairwise relatedness [RAB], and allele‐sharing coancestry [AS]) across five species bred in captivity, including three birds and two mammals, with varying degrees of reliable pedigree data, using reduced‐representation and whole genome resequencing data. Genome‐based relatedness estimates varied widely across estimators, sequencing methods, and species, yet the most consistent results for known first‐order relationships were found using Rxy, RAB, and AS. However, AS was found to be less consistently correlated with known pedigree relatedness than either Rxy or RAB. Our combined results indicate there is not a single genome‐based estimator that is ideal across different species and data types. To determine the most appropriate genome‐based relatedness estimator for each new data set, we recommend assessing the relative: (1) correlation of candidate estimators with known relationships in the pedigree and (2) precision of candidate estimators with known first‐order relationships. These recommendations are broadly applicable to conservation breeding programmes, particularly where genome‐based estimates of relatedness can complement and complete poorly pedigreed populations. Given a growing interest in the application of wild pedigrees, our results are also applicable to in situ wildlife management.
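The two recommended checks can be illustrated with a minimal sketch. This is not the authors' code: the function names, the use of Pearson correlation for check (1), the use of the sample standard deviation as the precision measure for check (2), and all numbers below are assumptions made for illustration. It also assumes every estimator has already been put on the same relatedness scale, so that a first‐order pair is expected near 0.5.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def assess_estimator(pedigree_r, estimates, first_order_r=0.5):
    """Summarise one candidate estimator against pedigree relatedness.

    pedigree_r    -- pedigree-based relatedness for each pair of individuals
    estimates     -- genome-based estimates for the same pairs, same order
    first_order_r -- pedigree value that marks a first-order pair (assumed 0.5)
    """
    # Check (1): how well the estimator tracks the known pedigree values.
    correlation = pearson(pedigree_r, estimates)
    # Check (2): spread of the estimates among known first-order pairs.
    first_order = [e for p, e in zip(pedigree_r, estimates)
                   if abs(p - first_order_r) < 1e-9]
    return {
        "correlation": correlation,
        "first_order_mean": mean(first_order),
        "first_order_sd": stdev(first_order),
    }


# Made-up pairwise values, for illustration only (not data from the study).
pedigree_r = [0.5, 0.5, 0.5, 0.25, 0.25, 0.0, 0.0, 0.0]
candidate_a = [0.48, 0.52, 0.50, 0.27, 0.22, 0.02, -0.01, 0.03]
candidate_b = [0.35, 0.60, 0.45, 0.30, 0.15, 0.10, -0.05, 0.05]

result_a = assess_estimator(pedigree_r, candidate_a)
result_b = assess_estimator(pedigree_r, candidate_b)

for name, result in (("candidate_a", result_a), ("candidate_b", result_b)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

With these toy values, candidate_a would be preferred on both criteria: it correlates more strongly with the pedigree and is less variable among first‐order pairs. On real data the two criteria can disagree, as the abstract reports for AS, so both need to be examined.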

 
Award ID(s):
1826801
NSF-PAR ID:
10371086
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Molecular Ecology Resources
Volume:
22
Issue:
7
ISSN:
1755-098X
Page Range / eLocation ID:
p. 2546-2558
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Background

    Estimation of genetic relatedness, or kinship, is used occasionally for recreational purposes and in forensic applications. While numerous methods were developed to estimate kinship, they suffer from high computational requirements and often make an untenable assumption of homogeneous population ancestry of the samples. Moreover, genetic privacy is generally overlooked in the usage of kinship estimation methods. There can be ethical concerns about finding unknown familial relationships in third-party databases. Similar ethical concerns may arise when estimating and reporting sensitive population-level statistics, such as inbreeding coefficients, because of the risk of marginalization and stigmatization.

    Results

    Here, we present SIGFRIED, which makes use of existing reference panels with a projection-based approach that simplifies kinship estimation in the admixed populations. We use simulated and real datasets to demonstrate the accuracy and efficiency of kinship estimation. We present a secure federated kinship estimation framework and implement a secure kinship estimator using homomorphic encryption-based primitives for computing relatedness between samples in two different sites while genotype data are kept confidential. Source code and documentation for our methods can be found at https://doi.org/10.5281/zenodo.7053352.

    Conclusions

    Analysis of relatedness is fundamentally important for identifying relatives, for association studies, and for estimating population-level inbreeding. As awareness of individual and group genomic privacy grows, privacy-preserving methods for the estimation of relatedness are needed. The presented methods alleviate the ethical and privacy concerns in the analysis of relatedness in admixed, historically isolated and underrepresented populations.

    Short Abstract

    Genetic relatedness is a central quantity used for finding relatives in databases, correcting biases in genome wide association studies and for estimating population-level statistics. Methods for estimating genetic relatedness have high computational requirements, and occasionally do not consider individuals from admixed ancestries. Furthermore, the ethical concerns around using genetic data and calculating relatedness are not considered. We present a projection-based approach that can efficiently and accurately estimate kinship. We implement our method using encryption-based techniques that provide provable security guarantees to protect genetic data while kinship statistics are computed among multiple sites.

     
  2. Abstract

    Kinship plays a fundamental role in the evolution of social systems and is considered a key driver of group living. To understand the role of kinship in the formation and maintenance of social bonds, accurate measures of genetic relatedness are critical. Genotype‐by‐sequencing technologies are rapidly advancing the accuracy and precision of genetic relatedness estimates for wild populations. The ability to assign kinship from genetic data varies depending on a species’ or population's mating system and pattern of dispersal, and empirical data from longitudinal studies are crucial to validate these methods. We use data from a long‐term behavioural study of a polygynandrous, bisexually philopatric marine mammal to measure accuracy and precision of parentage and genetic relatedness estimation against a known partial pedigree. We show that with moderate but obtainable sample sizes of approximately 4,235 SNPs and 272 individuals, highly accurate parentage assignments and genetic relatedness coefficients can be obtained. Additionally, we subsample our data to quantify how data availability affects relatedness estimation and kinship assignment. Lastly, we conduct a social network analysis to investigate the extent to which accuracy and precision of relatedness estimation improve statistical power to detect an effect of relatedness on social structure. Our results provide practical guidance for minimum sample sizes and sequencing depth for future studies, as well as thresholds for post hoc interpretation of previous analyses.

     
  3. Abstract

    Information on genetic relationships among individuals is essential to many studies of the behaviour and ecology of wild organisms. Parentage and relatedness assays based on large numbers of single nucleotide polymorphism (SNP) loci hold substantial advantages over the microsatellite markers traditionally used for these purposes. We present a double‐digest restriction site‐associated DNA sequencing (ddRAD‐seq) analysis pipeline that simultaneously achieves the SNP discovery and genotyping steps and is optimized to return a statistically powerful set of SNP markers (typically 150–600 after stringent filtering) from large numbers of individuals (up to 240 per run). We explore the trade‐offs inherent in this approach through a set of experiments in a species with a complex social system, the variegated fairy‐wren (Malurus lamberti), and further validate it in a phylogenetically broad set of other bird species. Through direct comparisons with a parallel data set from a robust panel of highly variable microsatellite markers, we show that this ddRAD‐seq approach results in substantially improved power to discriminate among potential relatives and considerably more precise estimates of relatedness coefficients. The pipeline is designed to be universally applicable to all bird species (and with minor modifications to many other taxa), to be cost‐ and time‐efficient, and to be replicable across independent runs such that genotype data from different study periods can be combined and analysed as field samples are accumulated.

     
  4. Abstract

    For species of management concern, accurate estimates of inbreeding and associated consequences on reproduction are crucial for predicting their future viability. However, few studies have partitioned this aspect of genetic viability with respect to reproduction in a group-living social mammal. We investigated the contributions of foundation stock lineages, putative fitness consequences of inbreeding, and genetic diversity of the breeding versus nonreproductive segment of the Yellowstone National Park gray wolf population. Our dataset spans 25 years and seven generations since reintroduction, encompassing 152 nuclear families and 329 litters. We found more than 87% of the pedigree foundation genomes persisted and report influxes of allelic diversity from two translocated wolves from a divergent source in Montana. As expected for group-living species, mean kinship significantly increased over time but with minimal loss of observed heterozygosity. Strikingly, the reproductive portion of the population carried significantly lower genome-wide inbreeding coefficients and autozygosity, and showed more rapid decay of linkage disequilibrium, relative to the nonbreeding population. Breeding wolves had significantly longer lifespans and lower inbreeding coefficients than nonbreeding wolves. Our model revealed that the number of litters was significantly negatively associated with heterozygosity (R = −0.11). Our findings highlight genetic contributions to fitness, and the importance of the reproductively active individuals in a population to counteract loss of genetic variation in a wild, free-ranging social carnivore. It is crucial for managers to mitigate factors that significantly reduce effective population size and genetic connectivity, which supports the dispersion of genetic variation that aids in rapid evolutionary responses to environmental challenges.

     