Sparsification of large ultrametric matrices: insights into the microbial Tree of Life

Gorman, Evan; Lladser, Manuel E.

doi:10.1098/rspa.2022.0847


This content will become publicly available on September 1, 2024


Ultrametric matrices appear in many domains of mathematics and science; nevertheless, they can be large and dense, making them difficult to store and manipulate, unlike large but sparse matrices. In this manuscript, we exploit that ultrametric matrices can be represented as binary trees to sparsify them via an orthonormal base change based on Haar-like wavelets. We show that, with overwhelmingly high probability, only an asymptotically negligible fraction of the off-diagonal entries in random but large ultrametric matrices remain non-zero after the base change; and develop an algorithm to sparsify such matrices directly from their tree representation. We also identify the subclass of matrices diagonalized by the Haar-like wavelets and supply a sufficient condition to approximate the spectrum of ultrametric matrices outside this subclass. Our methods give computational access to a covariance matrix model of the microbiologists’ Tree of Life, which was previously inaccessible due to its size, and motivate introducing a new wavelet-based (beta-diversity) metric to compare microbial environments. Unlike the established metrics, the new metric may be used to identify internal nodes (i.e. splits) in the Tree that link microbial composition and environmental factors in a statistically significant manner.
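
The base change described in the abstract can be illustrated on a toy example. The sketch below is not the authors' algorithm: it assumes the standard unweighted Haar-like construction (one wavelet per internal node, scaled by the leaf counts of its two children, plus a constant vector), and the tree, its edge lengths, and the helper names `leaves`, `covariance`, and `haar_basis` are all hypothetical. It builds the dense matrix explicitly, whereas the paper works directly from the tree representation.

```python
import numpy as np

# Hypothetical toy tree (an assumption, not data from the paper).
# A leaf is an int index; an internal node is a tuple
# (left_child, right_child, left_edge_length, right_edge_length).
tree = (((0, 1, 1.0, 2.0), 2, 0.5, 3.0), (3, 4, 1.5, 1.5), 1.0, 2.0)
n = 5


def leaves(node):
    """Return the leaf indices below a node."""
    if isinstance(node, int):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def covariance(node, n):
    """C[i, j] = length of the root-to-leaf path shared by leaves i and j.

    Each edge contributes its length to every pair of leaves below it.
    """
    C = np.zeros((n, n))
    if isinstance(node, int):
        return C
    left, right, len_left, len_right = node
    for child, length in ((left, len_left), (right, len_right)):
        indicator = np.zeros(n)
        indicator[leaves(child)] = 1.0
        C += length * np.outer(indicator, indicator) + covariance(child, n)
    return C


def haar_basis(node, n):
    """Rows: the constant vector plus one Haar-like wavelet per internal node."""
    rows = [np.full(n, 1.0 / np.sqrt(n))]

    def visit(v):
        if isinstance(v, int):
            return
        L, R = leaves(v[0]), leaves(v[1])
        scale = np.sqrt(len(L) * len(R) / (len(L) + len(R)))
        w = np.zeros(n)
        w[L] = scale / len(L)    # positive on the left subtree
        w[R] = -scale / len(R)   # negative on the right subtree
        rows.append(w)
        visit(v[0])
        visit(v[1])

    visit(node)
    return np.array(rows)


C = covariance(tree, n)
Phi = haar_basis(tree, n)
S = Phi @ C @ Phi.T  # the same matrix expressed in the wavelet basis

off_diagonal = ~np.eye(n, dtype=bool)
print("non-zero off-diagonal entries before:", np.count_nonzero(np.abs(C[off_diagonal]) > 1e-12))
print("non-zero off-diagonal entries after: ", np.count_nonzero(np.abs(S[off_diagonal]) > 1e-12))
```

Because `Phi` is orthonormal, `S` has the same spectrum as `C`. In this construction, the entry of `S` for two wavelets can be non-zero only when one internal node is an ancestor of the other, which is one intuition for why most off-diagonal entries vanish in large trees.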

Award ID(s): 1836914

NSF-PAR ID: 10495519

Author(s) / Creator(s): Gorman, Evan; Lladser, Manuel E.

Publisher / Repository: The Royal Society Publishing

Date Published: 2023-09-01

Journal Name: Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences

Volume: 479

Issue: 2277

ISSN: 1364-5021

Subject(s) / Keyword(s): double principal coordinate analysis; Haar-like wavelets; sparsification; phylogenetic covariance matrix; ultrametric; UniFrac

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.1098/rspa.2022.0847
