Multidimensional scaling improves distance-based clustering for microbiome data

Chen, Guanhua (ORCID:0000000293142037); Wang, Xinyue; Sun, Qiang; Tang, Zheng-Zheng

doi:10.1093/bioinformatics/btaf042

This content will become publicly available on February 1, 2026

Abstract

Motivation: Clustering patients into subgroups based on their microbial compositions can greatly enhance our understanding of the role of microbes in human health and disease etiology. Distance-based clustering methods, such as partitioning around medoids (PAM), are popular due to their computational efficiency and absence of distributional assumptions. However, the performance of these methods can be suboptimal when true cluster memberships are driven by differences in the abundance of only a few microbes, a situation known as the sparse signal scenario.

Results: We demonstrate that classical multidimensional scaling (MDS), a widely used dimensionality reduction technique, effectively denoises microbiome data and enhances the clustering performance of distance-based methods. We propose a two-step procedure that first applies MDS to project high-dimensional microbiome data into a low-dimensional space, followed by distance-based clustering using the low-dimensional data. Our extensive simulations demonstrate that our procedure offers superior performance compared to directly conducting distance-based clustering under the sparse signal scenario. The advantage of our procedure is further showcased in several real data applications.

Availability and implementation: The R package MDSMClust is available at https://github.com/wxy929/MDS-project.
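The two-step procedure described in the abstract (classical MDS first, then distance-based clustering on the low-dimensional coordinates) can be illustrated with a minimal standard-library Python sketch. This is not the MDSMClust package or its API. The Bray-Curtis dissimilarity, the choice of two MDS dimensions, the toy count table, and all function names are assumptions made here for illustration. The eigenvectors are obtained by simple power iteration, and the clustering step uses a greedy BUILD followed by first-improvement swaps, a simplified variant of PAM.

```python
import math
import random


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    den = sum(a + b for a, b in zip(x, y))
    return sum(abs(a - b) for a, b in zip(x, y)) / den if den else 0.0


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pairwise(rows, dist):
    """Full symmetric distance matrix as a list of lists."""
    return [[dist(r, s) for s in rows] for r in rows]


def classical_mds(D, k, iters=300, seed=0):
    """Classical MDS: scaled top eigenvectors of the double-centered matrix."""
    n = len(D)
    A = [[-0.5 * d * d for d in row] for row in D]
    means = [sum(row) / n for row in A]
    grand = sum(means) / n
    B = [[A[i][j] - means[i] - means[j] + grand for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * k for _ in range(n)]
    for c in range(k):
        v = [rng.gauss(0, 1) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):  # power iteration for the dominant eigenvector
            w = [sum(b * x for b, x in zip(row, v)) for row in B]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        Bv = [sum(b * x for b, x in zip(row, v)) for row in B]
        lam = sum(x * y for x, y in zip(v, Bv))  # Rayleigh quotient
        if lam <= 1e-12:
            # Non-Euclidean dissimilarities can yield negative eigenvalues;
            # this sketch simply stops and leaves remaining axes at zero.
            break
        for i in range(n):
            coords[i][c] = math.sqrt(lam) * v[i]
        # Deflate so the next pass finds the next eigenpair.
        B = [[B[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


def pam(D, k):
    """Simplified PAM: greedy BUILD, then first-improvement SWAP."""
    n = len(D)

    def cost(meds):
        return sum(min(D[i][m] for m in meds) for i in range(n))

    medoids = []
    for _ in range(k):
        best = min((c for c in range(n) if c not in medoids),
                   key=lambda c: cost(medoids + [c]))
        medoids.append(best)
    current, improved = cost(medoids), True
    while improved:
        improved = False
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                trial_cost = cost(trial)
                if trial_cost < current - 1e-12:
                    medoids, current, improved = trial, trial_cost, True
    labels = [min(range(k), key=lambda m: D[i][medoids[m]]) for i in range(n)]
    return labels, medoids


# Hypothetical toy counts: 20 samples x 6 taxa; only taxon 0 differs by group.
rng = random.Random(1)
counts = [[rng.randint(290, 310) if i < 10 else rng.randint(1, 9)]
          + [rng.randint(45, 55) for _ in range(5)] for i in range(20)]

D = pairwise(counts, bray_curtis)            # distances on the raw data
coords = classical_mds(D, k=2)               # step 1: project to 2 dimensions
labels, medoids = pam(pairwise(coords, euclidean), k=2)  # step 2: cluster
print(labels)
```

Running `pam(D, k=2)` directly on the original distance matrix gives the baseline that the abstract compares against. On a toy table this cleanly separated, both approaches should recover the groups, so the sketch shows the mechanics only and does not reproduce the reported performance gain.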

Award ID(s): 2054346

PAR ID: 10651770

Author(s) / Creator(s): Chen, Guanhua; Wang, Xinyue; Sun, Qiang; Tang, Zheng-Zheng

Editor(s): Birol, Inanc

Publisher / Repository: Oxford

Date Published: 2025-02-01

Journal Name: Bioinformatics

Volume: 41

Issue: 2

ISSN: 1367-4811

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.1093/bioinformatics/btaf042
