Leveraging Scheme for Cross-Study Microbiome Machine Learning Prediction and Feature Evaluations

Song, Kuncheng; Zhou, Yi-Hui

doi:10.3390/bioengineering10020231

The microbiota has proved to be a critical factor in many diseases, and researchers have been using microbiome data for disease prediction. However, models trained on one independent microbiome study may not transfer easily to other independent studies because of the high variability in microbiome data. In this study, we developed a method for improving the generalizability and interpretability of machine learning models for predicting three different diseases (colorectal cancer, Crohn’s disease, and immunotherapy response) using nine independent microbiome datasets. Our method combines a smaller dataset with a larger dataset, and we found that including at least 25% of the target samples in the source data improved model performance. We identified random forest as our top model and employed feature selection to identify common and important taxa for disease prediction across the different studies. Our results suggest that this leveraging scheme is a promising approach for improving the accuracy and interpretability of machine learning models that predict disease from microbiome data.
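The leveraging step described in the abstract, moving a share of the target study's samples into the source training data, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `leverage_split`, the toy data, and the plain random (unstratified) draw are hypothetical, and the paper's actual sampling and evaluation procedure may differ.

```python
import random


def leverage_split(source, target, fraction=0.25, seed=0):
    """Move `fraction` of the target samples into the training pool.

    `source` and `target` are lists of (features, label) pairs.
    Returns (train, test): train holds all source samples plus the
    leveraged target samples; test holds the remaining target samples.
    This is a hypothetical helper, not the authors' implementation.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    rng = random.Random(seed)
    shuffled = list(target)
    rng.shuffle(shuffled)
    n_leveraged = round(fraction * len(shuffled))
    train = list(source) + shuffled[:n_leveraged]
    test = shuffled[n_leveraged:]
    return train, test


# Toy stand-ins for a larger source study and a smaller target study.
source = [([i, i + 1], i % 2) for i in range(40)]
target = [([i, i * 2], i % 2) for i in range(100, 120)]

train, test = leverage_split(source, target, fraction=0.25)
print(len(train), len(test))
```

The resulting `train` set could then be passed to any classifier (the abstract names random forest as the top model), with `test` serving as the held-out portion of the target study.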

Award ID(s): 2133504

PAR ID: 10530605

Author(s) / Creator(s): Song, Kuncheng; Zhou, Yi-Hui

Publisher / Repository: MDPI

Date Published: 2023-02-01

Journal Name: Bioengineering

Volume: 10

Issue: 2

ISSN: 2306-5354

Page Range / eLocation ID: 231

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.3390/bioengineering10020231
