Title: Properties and unbiased estimation of F- and D-statistics in samples containing related and inbred individuals
Abstract The Patterson F- and D-statistics are commonly used measures for quantifying population relationships and for testing hypotheses about demographic history. These statistics make use of allele frequency information across populations to infer different aspects of population history, such as population structure and introgression events. Inclusion of related or inbred individuals can bias such statistics, which often leads to the filtering of such individuals. Here, we derive statistical properties of the F- and D-statistics, including their biases due to the inclusion of related or inbred individuals, their variances, and their corresponding mean squared errors. Moreover, for those statistics that are biased, we develop unbiased estimators and evaluate the variances of these new quantities. Comparisons of the new unbiased statistics to the originals demonstrate that our newly derived statistics often have lower error across a wide population parameter space. Furthermore, we apply these unbiased estimators to several global human populations that include related individuals to highlight their application on an empirical dataset. Finally, we implement these unbiased estimators in the open-source software package funbiased for easy application by the scientific community.
Award ID(s):
1949268 2001063 1925825
PAR ID:
10361215
Publisher / Repository:
Oxford University Press
Journal Name:
Genetics
Volume:
220
Issue:
1
ISSN:
1943-2631
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Rokas, A (Ed.)
    Abstract The gray short-tailed opossum (Monodelphis domestica) is an established laboratory-bred marsupial model for biomedical research. It is a critical species for comparative genomics research, providing the pivotal phylogenetic outgroup for studies of derived vs ancestral states of genomic/epigenomic characteristics for eutherian mammal lineages. To characterize the current genetic profile of this laboratory marsupial, we examined 79 individuals from eight established laboratory strains. Double digest restriction site-associated DNA sequencing and whole-genome resequencing experiments were performed to investigate the genetic architecture in these strains. A total of 66,640 high-quality single nucleotide polymorphisms (SNPs) were identified. We analyzed SNP density, average heterozygosity, nucleotide diversity, and the population differentiation parameter Fst within and between the eight strains. Principal component and population structure analyses clearly resolve the strains at the level of their ancestral founder populations, and the genetic architecture of these strains correctly reflects their breeding history. We confirmed the successful establishment of the first inbred laboratory opossum strain LSD (inbreeding coefficient F > 0.99) and a nearly inbred strain FD2M1 (0.98 < F < 0.99), each derived from a different ancestral background. These strains are suitable for various experimental protocols requiring controlled genetic backgrounds and for intercrosses and backcrosses that can generate offspring with informative SNPs for studying a variety of genetic and epigenetic processes. Together with recent advances in reproductive manipulation and CRISPR/Cas9 techniques for Monodelphis domestica, the existence of distinctive inbred strains will enable genome editing on different genetic backgrounds, greatly expanding the utility of this marsupial model for biomedical research.
  2. This paper presents finite‐sample efficiency bounds for the core econometric problem of estimation of linear regression coefficients. We show that the classical Gauss–Markov theorem can be restated omitting the unnatural restriction to linear estimators, without adding any extra conditions. Our results are lower bounds on the variances of unbiased estimators. These lower bounds correspond to the variances of the least squares estimator and the generalized least squares estimator, depending on the assumption on the error covariances. These results show that we can drop the label “linear estimator” from the pedagogy of the Gauss–Markov theorem. Instead of referring to these estimators as BLUE, they can legitimately be called BUE (best unbiased estimators).
  3. Species that went extinct prior to the genomic era are typically out of reach for modern phylogenetic studies. We refer to these as “Alexandrian” extinctions, after the lost library of the ancient world. This is particularly limiting for conservation studies, as genetic data for such taxa may be key to understanding extinction threats and risks and the causes of declines, and to informing management of related, extant populations. Fortunately, continual advances in biochemistry and DNA sequencing offer increasing ability to recover DNA from historical museum specimens, including fluid-preserved natural history collections. Here, we report on success in recovering nuclear and mitochondrial data from the apparently extinct subspecies Desmognathus fuscus carri Neill, 1951, a plethodontid salamander from spring runs in central Florida. The two specimens are 50 years old and were likely preserved in unbuffered formalin, but application of a recently derived extraction procedure yielded usable DNA and partially successful Anchored Hybrid Enrichment sequencing. These data suggest that the populations of D. f. carri from peninsular Florida are conspecific with the D. auriculatus A lineage as suggested by previous authors, but likely represented an ecogeographically distinct genetic segment that has now been lost. Genetic data from this Alexandrian extinction thus confirm the geographic extent of population declines and extirpations as well as their ecological context, suggesting a possibly disproportionate loss from sandy-bottom clearwater streams compared to blackwater swamps. Success of these methods bodes well for large-scale application to fluid-preserved natural history specimens from relevant historical populations, but the possibility of significant DNA damage and related sequencing errors is an additional hurdle to overcome.
  4. Bun, Mark (Ed.)
    Given a differentially private unbiased estimate q̃ = q(D) + ν of a statistic q(D), we wish to obtain unbiased estimates of functions of q(D), such as 1/q(D), solely through post-processing of q̃, with no further access to the confidential dataset D. To this end, we adapt the deconvolution method used for unbiased estimation in the statistical literature, deriving unbiased estimators for a broad family of twice-differentiable functions - those that are tempered distributions - when the privacy-preserving noise ν is drawn from the Laplace distribution (Dwork et al., 2006). We further extend this technique to functions other than tempered distributions, deriving approximately optimal estimators that are unbiased for values in a user-specified interval (possibly extending to ±∞). We use these results to derive an unbiased estimator for private means when the size n of the dataset is not publicly known. In a numerical application, we find that a mechanism that uses our estimator to return an unbiased sample size and mean outperforms a mechanism that instead uses the previously known unbiased privacy mechanism for such means (Kamath et al., 2023). We also apply our estimators to develop unbiased transformation mechanisms for per-record differential privacy, a privacy concept in which the privacy guarantee is a public function of a record’s value (Seeman et al., 2024). Our mechanisms provide stronger privacy guarantees than those in prior work (Finley et al., 2024) by using Laplace, rather than Gaussian, noise. Finally, using a different approach, we go beyond Laplace noise by deriving unbiased estimators for polynomials under the weak condition that the noise distribution has sufficiently many moments.
  5. ABSTRACT Many insects inhabiting temperate climates are faced with changing environmental conditions throughout the year. Depending on the species, these environmental fluctuations can be experienced within a single generation or across multiple generations. Strategies for dealing with these seasonal changes vary across populations. Drosophila mojavensis is a cactophilic Drosophila species endemic to the Sonoran Desert. The Sonoran Desert regularly reaches temperatures of 50°C in the summer months. As individuals of this population are rare to collect in the summer months, we simulated the cycling temperatures experienced by D. mojavensis in the Sonoran Desert from April to July (four generations) in a temperature- and light-controlled chamber, to understand the physiological and life history changes that allow this population to withstand these conditions. In contrast to our hypothesis of a summer aestivation, we found that D. mojavensis continue to reproduce during the summer months, albeit with lower viability, but the adult survivorship of the population is highly reduced during this period. As expected, stress resistance increased during the summer months in both the adult and the larval stages. This study examines several strategies for withstanding the Sonoran Desert summer conditions which may be informative in the study of other desert endemic species. 