NSF PAR Search | NSF Public Access Repository

tMHG-Finder: Tree-Guided Maximal Homologous Group Finder for Bacterial Genomes

https://doi.org/10.1007/978-3-031-94928-9_6

Yin, Yongze; Kille, Bryce; Ogilvie, Huw A; Treangen, Todd J; Nakhleh, Luay (September 2025, Springer Nature Switzerland)

Free, publicly-accessible full text available September 1, 2026
“Correcting” Gene Trees to be More Like Species Trees Frequently Increases Topological Error

https://doi.org/10.1093/gbe/evad094

Yan, Zhi; Ogilvie, Huw A; Nakhleh, Luay (June 2023, Genome Biology and Evolution)
Holland, Barbara (Ed.)
The evolutionary histories of individual loci in a genome can be estimated independently, but this approach is error-prone because only a limited amount of sequence data is available for each gene. This has led to the development of a diverse array of gene tree error correction methods that reduce the distance to the species tree. We investigate the performance of two representatives of these methods: TRACTION and TreeFix. We found that gene tree error correction frequently increases the level of error in gene tree topologies by “correcting” them to be closer to the species tree, even when the true gene and species trees are discordant. We confirm that full Bayesian inference of the gene trees under the multispecies coalescent model is more accurate than independent inference. Future gene tree correction approaches and methods should incorporate an adequately realistic model of evolution instead of relying on oversimplified heuristics.
Full Text Available
Comparing inference under the multispecies coalescent with and without recombination

https://doi.org/10.1016/j.ympev.2023.107724

Yan, Zhi; Ogilvie, Huw A.; Nakhleh, Luay (April 2023, Molecular Phylogenetics and Evolution)

Full Text Available
Annotation-free delineation of prokaryotic homology groups

https://doi.org/10.1371/journal.pcbi.1010216

Yin, Yongze; Ogilvie, Huw A.; Nakhleh, Luay (June 2022, PLOS Computational Biology)
Kolodny, Rachel (Ed.)
Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences (MHGs), where each MHG is a maximal set of maximum-length sequences that are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and its overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs, while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers cause incorrect inferences as to the overall relationship between bacterial taxa.
Full Text Available
Variational inference using approximate likelihood under the coalescent with recombination

https://doi.org/10.1101/gr.273631.120

Liu, Xinhao; Ogilvie, Huw A.; Nakhleh, Luay (November 2021, Genome Research)

Coalescent methods are proven and powerful tools for population genetics, phylogenetics, epidemiology, and other fields. A promising avenue for the analysis of large genomic alignments, which are increasingly common, is coalescent hidden Markov model (coalHMM) methods, but these methods have lacked general usability and flexibility. We introduce a novel method for automatically learning a coalHMM and inferring the posterior distributions of evolutionary parameters using black-box variational inference, with the transition rates between local genealogies derived empirically by simulation. This derivation enables our method to work directly with three or four taxa and through a divide-and-conquer approach with more taxa. Using a simulated data set resembling a human–chimp–gorilla scenario, we show that our method has accuracy comparable to or better than that of previous coalHMM methods. Both species divergence times and population sizes were accurately inferred. The method also infers local genealogies, and we report on their accuracy. Furthermore, we discuss a potential direction for scaling the method to larger data sets through a divide-and-conquer approach. This accuracy means our method is useful now, and by deriving transition rates by simulation, it is flexible enough to enable future implementations of various population models.
Full Text Available
