Low level contamination confounds population genomic analysis

Ward, Audrey K; Scopel, Eduardo F C; Shuman, Brent; Momany, Michelle; Bensasson, Douda

doi:10.1101/2025.01.17.633387

This content will become publicly available on January 22, 2026

Abstract: Genome sequence contamination has a variety of causes and can originate from within or between species. Previous research focused primarily on cross-species contamination or on prokaryotes. This paper visualizes B-allele frequency to test for intra-species contamination, and measures its effects on phylogenetic and admixture analysis in two fungal species. Using a standard base calling pipeline, we found that contaminated genomes superficially appeared to produce good quality genome data. Yet as little as 5-10% genome contamination was enough to change phylogenetic tree topologies and make contaminated strains appear as hybrids between lineages (genetically admixed). We recommend the use of B-allele frequency plots to screen genome resequencing data for intra-species contamination.

Award ID(s): 1946046

PAR ID: 10629996

Author(s) / Creator(s): Ward, Audrey K; Scopel, Eduardo F C; Shuman, Brent; Momany, Michelle; Bensasson, Douda

Publisher / Repository: bioRxiv

Date Published: 2025-01-22

Format(s): Medium: X

Institution: bioRxiv

Sponsoring Org: National Science Foundation

Posted Content:
https://doi.org/10.1101/2025.01.17.633387
