DEPP: Deep Learning Enables Extending Species Trees using Single Genes

Jiang, Yueyu; Balaban, Metin; Zhu, Qiyun; Mirarab, Siavash; Solis-Lemus, Claudia (ed.)

doi:10.1093/sysbio/syac031

Abstract: Placing new sequences onto reference phylogenies is increasingly used for analyzing environmental samples, especially microbiomes. Existing placement methods assume that query sequences have evolved under specific models directly on the reference phylogeny. For example, they assume single-gene data (e.g., 16S rRNA amplicons) have evolved under the GTR model on a gene tree. Placement, however, often has a more ambitious goal: extending a (genome-wide) species tree given data from individual genes without knowing the evolutionary model. Addressing this challenging problem requires new directions. Here, we introduce Deep-learning Enabled Phylogenetic Placement (DEPP), an algorithm that learns to extend species trees using single genes without prespecified models. In simulations and on real data, we show that DEPP can match the accuracy of model-based methods without any prior knowledge of the model. We also show that DEPP can update the multilocus microbial tree-of-life with single genes with high accuracy. We further demonstrate that DEPP can combine 16S and metagenomic data onto a single tree, enabling community structure analyses that take advantage of both sources of data. [Deep learning; gene tree discordance; metagenomics; microbiome analyses; neural networks; phylogenetic placement.]

Award ID(s): 1845967; 2120019

PAR ID: 10414551

Author(s) / Creator(s): Jiang, Yueyu; Balaban, Metin; Zhu, Qiyun; Mirarab, Siavash; Solis-Lemus, Claudia (ed.)

Publisher / Repository: Oxford University Press

Date Published: 2022-04-29

Journal Name: Systematic Biology

Volume: 72

Issue: 1

ISSN: 1063-5157

Page Range / eLocation ID: p. 17-34

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1093/sysbio/syac031
