NSF PAR Search | NSF Public Access Repository

HPV-EM: an accurate HPV detection and genotyping EM algorithm

https://doi.org/10.1038/s41598-020-71300-7

Inkman, Matthew J.; Jayachandran, Kay; Ellis, Thomas M.; Ruiz, Fiona; McLellan, Michael D.; Miller, Christopher A.; Wu, Yufeng; Ojesina, Akinyemi I.; Schwarz, Julie K.; Zhang, Jin (August 2020, Scientific Reports)

Abstract. Accurate HPV genotyping is crucial in facilitating epidemiology studies, vaccine trials, and HPV-related cancer research. Contemporary HPV genotyping assays only detect < 25% of all known HPV genotypes and are not accurate for low-risk or mixed HPV genotypes. Current genomic HPV genotyping algorithms use a simple read-alignment and filtering strategy that has difficulty handling repeats and homology sequences. Therefore, we have developed an optimized expectation–maximization algorithm, designated HPV-EM, to address the ambiguities caused by repetitive sequencing reads. HPV-EM achieved 97–100% accuracy when benchmarked using cell line data and TCGA cervical cancer data. We also validated HPV-EM using DNA tiling data on an institutional cervical cancer cohort (96.5% accuracy). Using HPV-EM, we demonstrated HPV genotypic differences in recurrence and patient outcomes in cervical and head and neck cancers.
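The abstract does not spell out how HPV-EM is implemented, so the following is only a generic sketch of how expectation–maximization can resolve reads that align equally well to several reference genotypes. The function name `em_genotype_abundance`, the `compat` matrix layout, and the toy data are illustrative assumptions, not part of HPV-EM.

```python
def em_genotype_abundance(compat, n_iter=200, tol=1e-9):
    """Estimate genotype proportions from ambiguous read alignments.

    compat[i][g] is an assumed per-read likelihood of read i under
    genotype g (0.0 when the read does not align to that genotype).
    This is a textbook mixture-model EM, not the published HPV-EM code.
    """
    n_types = len(compat[0])
    pi = [1.0 / n_types] * n_types  # start from uniform proportions
    for _ in range(n_iter):
        counts = [0.0] * n_types
        for row in compat:
            # E-step: split each read across genotypes by posterior weight.
            weights = [p * c for p, c in zip(pi, row)]
            total = sum(weights)
            if total == 0.0:
                continue  # read is compatible with no genotype; ignore it
            for g, w in enumerate(weights):
                counts[g] += w / total
        assigned = sum(counts)
        if assigned == 0.0:
            return pi  # nothing to learn from; keep current estimate
        # M-step: new proportions are the normalized expected read counts.
        new_pi = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(pi, new_pi))
        pi = new_pi
        if delta < tol:
            break
    return pi


# Toy data: 6 reads unique to genotype 0, 4 reads shared by genotypes 0 and 1.
toy_reads = [[1.0, 0.0]] * 6 + [[1.0, 1.0]] * 4
toy_pi = em_genotype_abundance(toy_reads)
```

In this toy case no read supports genotype 1 on its own, so the shared reads are progressively reassigned to genotype 0 and its estimated proportion approaches 1. A simple filter that counts every aligned read would instead credit genotype 1 with four reads.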
Joint inference of ancestry and genotypes of parents from children

https://doi.org/10.1016/j.isci.2022.104768

Zhang, Yiming; Wu, Yufeng (August 2022, iScience)

Full Text Available
Inferring the ancestry of parents and grandparents from genetic data

https://doi.org/10.1371/journal.pcbi.1008065

Pei, Jingwen; Zhang, Yiming; Nielsen, Rasmus; Wu, Yufeng (August 2020, PLOS Computational Biology)

Full Text Available
Inference of population admixture network from local gene genealogies: a coalescent-based maximum likelihood approach

https://doi.org/10.1093/bioinformatics/btaa465

Wu, Yufeng (July 2020, Bioinformatics)

Abstract. Motivation: Population admixture is an important subject in population genetics. Inferring population demographic history with admixture under the so-called admixture network model from population genetic data is an established problem in genetics. Existing admixture network inference approaches work with single genetic polymorphisms. While these methods are usually very fast, they do not fully utilize the information [e.g. linkage disequilibrium (LD)] contained in population genetic data. Results: In this article, we develop a new admixture network inference method called GTmix. Different from existing methods, GTmix works with local gene genealogies that can be inferred from population haplotypes. Local gene genealogies represent the evolutionary history of sampled haplotypes and contain the LD information. GTmix performs coalescent-based maximum likelihood inference of admixture networks with inferred local genealogies based on the well-known multispecies coalescent (MSC) model. GTmix utilizes various techniques to speed up the likelihood computation on the MSC model and the optimal network search. Our simulations show that GTmix can infer more accurate admixture networks with much smaller data than existing methods, even when these existing methods are given much larger data. GTmix is reasonably efficient and can analyze population genetic datasets of current interests. Availability and implementation: The program GTmix is available for download at: https://github.com/yufengwudcs/GTmix. Supplementary information: Supplementary data are available at Bioinformatics online.
Full Text Available
Detecting circular RNA from high-throughput sequence data with de Bruijn graph

https://doi.org/10.1186/s12864-019-6154-7

Li, Xin; Wu, Yufeng (March 2020, BMC Genomics)

Full Text Available
Accurate and efficient cell lineage tree inference from noisy single cell data: the maximum likelihood perfect phylogeny approach

https://doi.org/10.1093/bioinformatics/btz676

Wu, Yufeng (August 2019, Bioinformatics)
Schwartz, Russell (Ed.)
Abstract. Motivation: Cells in an organism share a common evolutionary history, called cell lineage tree. Cell lineage tree can be inferred from single cell genotypes at genomic variation sites. Cell lineage tree inference from noisy single cell data is a challenging computational problem. Most existing methods for cell lineage tree inference assume uniform uncertainty in genotypes. A key missing aspect is that real single cell data usually has non-uniform uncertainty in individual genotypes. Moreover, existing methods are often sampling based and can be very slow for large data. Results: In this article, we propose a new method called ScisTree, which infers cell lineage tree and calls genotypes from noisy single cell genotype data. Different from most existing approaches, ScisTree works with genotype probabilities of individual genotypes (which can be computed by existing single cell genotype callers). ScisTree assumes the infinite sites model. Given uncertain genotypes with individualized probabilities, ScisTree implements a fast heuristic for inferring cell lineage tree and calling the genotypes that allow the so-called perfect phylogeny and maximize the likelihood of the genotypes. Through simulation, we show that ScisTree performs well on the accuracy of inferred trees, and is much more efficient than existing methods. The efficiency of ScisTree enables new applications including imputation of the so-called doublets. Availability and implementation: The program ScisTree is available for download at: https://github.com/yufengwudcs/ScisTree. Supplementary information: Supplementary data are available at Bioinformatics online.
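To make the two ingredients named in the abstract concrete, here is a minimal sketch of (a) checking whether a binary genotype matrix admits a rooted perfect phylogeny and (b) scoring called genotypes against per-entry genotype probabilities. It is not ScisTree's search heuristic. The function names, the convention that `prob_zero[c][s]` is the probability that cell `c` has genotype 0 at site `s`, and the assumption of an all-zero (unmutated) root are illustrative choices made here.

```python
import math
from itertools import combinations


def admits_perfect_phylogeny(genotypes):
    """Return True if a binary cell-by-site matrix fits a rooted perfect phylogeny.

    Assumes the root carries genotype 0 at every site. Under that
    assumption a conflict exists exactly when some pair of sites shows
    all three of the gametes (0, 1), (1, 0) and (1, 1).
    """
    n_sites = len(genotypes[0])
    conflict = {(0, 1), (1, 0), (1, 1)}
    for a, b in combinations(range(n_sites), 2):
        observed = {(row[a], row[b]) for row in genotypes}
        if conflict <= observed:
            return False
    return True


def log_likelihood(genotypes, prob_zero):
    """Log-probability of called genotypes given individualized probabilities.

    prob_zero[c][s] is assumed to be P(genotype = 0) for cell c at site s
    and must lie strictly between 0 and 1. Entries are treated as independent.
    """
    total = 0.0
    for g_row, p_row in zip(genotypes, prob_zero):
        for g, p0 in zip(g_row, p_row):
            total += math.log(p0 if g == 0 else 1.0 - p0)
    return total


# Toy matrices (rows are cells, columns are sites).
compatible = [[1, 0], [1, 1], [0, 0]]
conflicting = [[1, 0], [0, 1], [1, 1]]
```

A method in this spirit would search over genotype matrices that pass the first check and prefer the one with the highest score from the second. The abstract states only that ScisTree does this with a fast heuristic, so the search itself is left out here.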
Full Text Available
CLADES: A classification-based machine learning method for species delimitation from population genetic data

https://doi.org/10.1111/1755-0998.12887

Pei, Jingwen; Chu, Chong; Li, Xin; Lu, Bin; Wu, Yufeng (September 2018, Molecular Ecology Resources)
