GppFst: genomic posterior predictive simulations of FST and dXY for identifying outlier loci from population genomic data
- Award ID(s): 1655571
- PAR ID: 10054918
- Date Published:
- Journal Name: Bioinformatics
- Volume: 33
- Issue: 9
- ISSN: 1367-4803
- Page Range / eLocation ID: 1414-1415
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In the past decade, advances in genome sequencing have allowed researchers to uncover the history of hybridization in diverse groups of species, including our own. Although the field has made impressive progress in documenting the extent of natural hybridization, both historical and recent, there are still many unanswered questions about its genetic and evolutionary consequences. Recent work has suggested that the outcomes of hybridization in the genome may be in part predictable, but many open questions about the nature of selection on hybrids and the biological variables that shape such selection have hampered progress in this area. We synthesize what is known about the mechanisms that drive changes in ancestry in the genome after hybridization, highlight major unresolved questions, and discuss their implications for the predictability of genome evolution after hybridization.
- Abstract. Motivation: Database fingerprinting has been widely used to discourage unauthorized redistribution of data by providing a means to identify the source of data leakages. However, no existing fingerprinting scheme aims to provide liability guarantees when sharing genomic databases. We fill this gap by devising a vanilla fingerprinting scheme specifically for genomic databases. Moreover, malicious recipients of a genomic database may compromise the embedded fingerprint (that is, distort the steganographic marks that make up the fingerprint bit-string) by launching correlation attacks, which leverage the intrinsic correlations among genomic data (e.g. Mendel’s law and linkage disequilibrium). We therefore augment the vanilla scheme with mitigation techniques that make the fingerprinting of genomic databases robust against correlation attacks. Results: Via experiments using a real-world genomic database, we first show that correlation attacks against fingerprinting schemes for genomic databases are very powerful. In particular, the correlation attacks can distort more than half of the fingerprint bits while causing only a small utility loss (e.g. in database accuracy and in the consistency of SNP–phenotype associations measured via P-values). Next, we experimentally show that the correlation attacks can be effectively mitigated by our proposed mitigation techniques. We validate that the attacker can hardly compromise a large portion of the fingerprint bits even if it pays a higher cost in terms of degradation of the database utility. For example, with around 24% loss in accuracy and 20% loss in the consistency of SNP–phenotype associations, the attacker can distort only about 30% of the fingerprint bits, which is insufficient for it to avoid being accused. We also show that the proposed mitigation techniques preserve the utility of the shared genomic databases, e.g. the mitigation techniques lead to only around 3% loss in accuracy. Availability and implementation: https://github.com/xiutianxi/robust-genomic-fp-github.

