

Title: Exact variance component tests for longitudinal microbiome studies
Abstract

In metagenomic studies, testing the association between microbiome composition and clinical outcomes translates to testing the nullity of variance components. Motivated by a lung human immunodeficiency virus (HIV) microbiome project, we study longitudinal microbiome data by using variance component models with more than two variance components. Current testing strategies apply only to models with exactly two variance components and require large sample sizes, so they are not applicable to longitudinal microbiome studies. In this paper, we propose exact tests (score test, likelihood ratio test, and restricted likelihood ratio test) to (a) test the association of the overall microbiome composition in a longitudinal design and (b) detect the association of one specific microbiome cluster while adjusting for the effects of related clusters. Our approach combines exact tests for a null hypothesis with a single variance component with a strategy for reducing multiple variance components to a single one. Simulation studies demonstrate that our method has a correct type I error rate and superior power compared to existing methods at small sample sizes and weak signals. Finally, we apply our method to a longitudinal pulmonary microbiome study of HIV-infected patients and reveal two interesting genera, Prevotella and Veillonella, associated with forced vital capacity. Our findings shed light on the impact of the lung microbiome on HIV complications. The method is implemented in the open-source, high-performance computing language Julia and is freely available at https://github.com/JingZhai63/VCmicrobiome.
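
The tests reduce a model with several variance components to a single-component null hypothesis H0: σ²g = 0. As a hedged illustration only (Python rather than the authors' Julia package; `score_stat` and the simulated kernel `K` are hypothetical, and the paper derives an exact null distribution rather than this naive statistic), a score statistic for one variance component can be sketched as:

```python
import numpy as np

def score_stat(y, X, K):
    """Naive score statistic for H0: sigma_g^2 = 0 in the model
    y = X*beta + g + e, with Var(g) = sigma_g^2 * K and Var(e) = sigma_e^2 * I.
    Illustrative sketch only, not the authors' exact test."""
    n, p = X.shape
    # Residual-forming projection under the null (fixed effects only)
    P = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    r = P @ y
    sigma2 = (r @ r) / (n - p)          # residual variance estimate under H0
    return (r @ K @ r) / (2 * sigma2**2)

# Toy usage with a simulated positive semidefinite similarity kernel
rng = np.random.default_rng(0)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + covariate
Z = rng.normal(size=(n, 5))
K = Z @ Z.T                              # PSD kernel (e.g., from a distance matrix)
y = rng.normal(size=n)                   # outcome simulated under the null
Q = score_stat(y, X, K)
```

In practice the kernel `K` would encode microbiome similarity (e.g., from UniFrac distances), and the statistic's exact null distribution, rather than a large-sample approximation, gives the test its small-sample validity.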

 
NSF-PAR ID: 10082999
Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
Journal Name: Genetic Epidemiology
Volume: 43
Issue: 3
ISSN: 0741-0395
Page Range / eLocation ID: p. 250-262
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

The tightest constraints on the tensor-to-scalar ratio r can only be obtained after removing a substantial fraction of the lensing B-mode sample variance. The planned Cosmic Microwave Background (CMB)-S4 experiment (cmb-s4.org) will remove the lensing B-mode signal internally by reconstructing the gravitational lenses from high-resolution observations. We document here a first lensing reconstruction pipeline able to achieve this optimally for arbitrary sky coverage. We make it part of a map-based framework to test CMB-S4 delensing performance and its constraining power on r, including inhomogeneous noise and two non-Gaussian Galactic polarized foreground models. The framework performs component separation of the high-resolution maps, followed by the construction of lensing B-mode templates, which are then included in a parametric small-aperture map cross-spectra-based likelihood for r. We find that the lensing reconstruction and framework achieve the expected performance, compatible with the target σ(r) ≃ 5 · 10⁻⁴ in the absence of a tensor signal, after an effective removal of 92%–93% of the lensing B-mode variance, depending on the simulation set. The code for the lensing reconstruction can also be used for cross-correlation studies with large-scale structures, lensing spectrum reconstruction, cluster lensing, or other CMB lensing-related purposes. As part of our tests, we also demonstrate the joint optimal reconstruction of the lensing potential with the lensing curl potential mode at second order in the density fluctuations.

     
  2. Abstract

The quantification of Hutchinson's n-dimensional hypervolume has enabled substantial progress in community ecology, species niche analysis and beyond. However, most existing methods do not support a partitioning of the different components of hypervolume. Such a partitioning is crucial to address the 'curse of dimensionality' in hypervolume measures and interpret the metrics on the original niche axes instead of principal components. Here, we propose the use of multivariate normal distributions for the comparison of niche hypervolumes and introduce this as the multivariate-normal hypervolume (MVNH) framework (R package available on https://github.com/lvmuyang/MVNH).

    The framework provides parametric measures of the size and dissimilarity of niche hypervolumes, each of which can be partitioned into biologically interpretable components. Specifically, the determinant of the covariance matrix (i.e. the generalized variance) of a MVNH is a measure of total niche size, which can be partitioned into univariate niche variance components and a correlation component (a measure of dimensionality, i.e. the effective number of independent niche axes standardized by the number of dimensions). The Bhattacharyya distance (BD; a function of the geometric mean of two probability distributions) between two MVNHs is a measure of niche dissimilarity. The BD partitions total dissimilarity into the components of Mahalanobis distance (standardized Euclidean distance with correlated variables) between hypervolume centroids and the determinant ratio which measures hypervolume size difference. The Mahalanobis distance and determinant ratio can be further partitioned into univariate divergences and a correlation component.
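
The quantities above are standard multivariate-normal measures, so a minimal sketch is straightforward (Python rather than the authors' R package; the function names `niche_size` and `bhattacharyya` are hypothetical, not the MVNH API):

```python
import numpy as np

def niche_size(S):
    """Total niche size: the generalized variance det(S) of the covariance matrix."""
    return np.linalg.det(S)

def bhattacharyya(mu1, S1, mu2, S2):
    """Bhattacharyya distance between N(mu1, S1) and N(mu2, S2), split into
    its Mahalanobis (centroid-separation) and determinant-ratio (size) terms."""
    S = (S1 + S2) / 2
    d = mu1 - mu2
    mahal = d @ np.linalg.solve(S, d) / 8
    det_ratio = 0.5 * np.log(np.linalg.det(S)
                             / np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
    return mahal + det_ratio, mahal, det_ratio

# Two identical niches have zero dissimilarity in both components
mu, S = np.zeros(2), np.eye(2)
total, mahal, det_ratio = bhattacharyya(mu, S, mu, S)
```

The additive split of the distance into `mahal` and `det_ratio` is what makes the framework's partitioning into biologically interpretable components possible.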

    We use empirical examples of community‐ and species‐level analysis to demonstrate the new insights provided by these metrics. We show that the newly proposed framework enables us to quantify the relative contributions of different hypervolume components and to connect these analyses to the ecological drivers of functional diversity and environmental niche variation.

    Our approach overcomes several operational and computational limitations of popular nonparametric methods and provides a partitioning framework that has wide implications for understanding functional diversity, niche evolution, niche shifts and expansion during biotic invasions, etc.

     
  3. Background

    Cognitive training may partially reverse cognitive deficits in people with HIV (PWH). Previous functional MRI (fMRI) studies demonstrate that working memory training (WMT) alters brain activity during working memory tasks, but its effects on resting brain network organization remain unknown.

    Purpose

    To test whether WMT affects PWH brain functional connectivity in resting‐state fMRI (rsfMRI).

    Study Type

    Prospective.

    Population

A total of 53 PWH (ages 50.7 ± 1.5 years, two women) and 53 HIV-seronegative controls (SN, ages 49.5 ± 1.6 years, six women).

    Field Strength/Sequence

Axial single-shot gradient-echo echo-planar imaging at 3.0 T was performed at baseline (TL1) and at 1 month (TL2) and 6 months (TL3) after WMT.

    Assessment

All participants had rsfMRI and clinical assessments (including neuropsychological tests) at TL1 before randomization to Cogmed WMT (adaptive training, n = 58: 28 PWH, 30 SN; nonadaptive training, n = 48: 25 PWH, 23 SN), 25 sessions over 5–8 weeks. All assessments were repeated at TL2 and TL3. Functional connectivity estimated by independent component analysis (ICA) and graph theory (GT) metrics (eigenvector centrality, etc.) at different link densities (LDs) were compared between the PWH and SN groups at TL1 and TL2.
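
Eigenvector centrality on a connectivity graph can be sketched with simple power iteration (a hedged illustration only; the study's actual GT pipeline and its link-density thresholding are not specified here, and the toy graph below is hypothetical):

```python
import numpy as np

def eigenvector_centrality(A, iters=200):
    """Eigenvector centrality of a symmetric, nonnegative adjacency matrix A.
    Power iteration on A + I: same eigenvectors as A, but the shift makes the
    leading eigenvalue strictly dominant even for bipartite graphs."""
    n = A.shape[0]
    M = A + np.eye(n)
    v = np.ones(n)
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)   # renormalize to unit length each step
    return v

# Toy 3-node path graph 0-1-2: the middle node is the most central
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
cent = eigenvector_centrality(A)
```

In a connectivity analysis, `A` would come from thresholding a correlation matrix so that a fixed fraction of the strongest edges (the link density, LD) is retained.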

    Statistical Tests

Two-way analyses of variance (ANOVA) on GT metrics and two-sample t-tests on FC or GT metrics were performed. Cognitive (e.g., memory) measures were correlated with eigenvector centrality (eCent) using Pearson's correlations. The significance level was set at P < 0.05 after false discovery rate correction.

    Results

The ventral default mode network (vDMN) eCent differed between the PWH and SN groups at TL1 but not at TL2 (P = 0.28). In PWH, changes in vDMN eCent correlated significantly with changes in memory ability (r = −0.62 at LD = 50%), and vDMN eCent before training correlated significantly with changes in memory performance (r = 0.53 at LD = 50%).

    Data Conclusion

    ICA and GT analyses showed that adaptive WMT normalized graph properties of the vDMN in PWH.

    Evidence Level

    1

    Technical Efficacy

    1

     
  4. Abstract

One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life is enormously computationally challenging, as all the current most accurate methods are either heuristics for NP-hard optimization problems or Bayesian MCMC methods that sample from tree space. One of the most promising approaches for improving scalability and accuracy in phylogeny estimation uses divide-and-conquer: a set of species is divided into overlapping subsets, trees are constructed on the subsets, and these are then merged using a "supertree method". Here, we present Exact-RFS-2, the first polynomial-time algorithm to find an optimal supertree of two trees under the Robinson-Foulds Supertree (RFS) criterion (a major approach in supertree estimation that is related to maximum likelihood supertrees), and we prove that finding the RFS of three input trees is NP-hard. Exact-RFS-2 is available in open source form on GitHub at https://github.com/yuxilin51/GreedyRFS.
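
The Robinson-Foulds distance underlying the RFS criterion counts the bipartitions present in one tree but not the other. With each tree's nontrivial clades encoded as frozensets of leaf labels (a hypothetical encoding for illustration, not part of Exact-RFS-2), the distance reduces to a symmetric difference:

```python
def rf_distance(clades1, clades2):
    """Robinson-Foulds distance: size of the symmetric difference of the
    two trees' nontrivial clade (bipartition) sets over the same leaf set."""
    return len(clades1 ^ clades2)

# Unrooted trees ((a,b),(c,(d,e))) and ((a,b),((c,d),e)) share only
# the bipartition ab|cde; each has one bipartition the other lacks.
t1 = {frozenset("ab"), frozenset("de")}
t2 = {frozenset("ab"), frozenset("cd")}
d = rf_distance(t1, t2)
```

The supertree problem is harder than this pairwise comparison because the input trees overlap on different leaf subsets, which is where the polynomial-time merging of Exact-RFS-2 comes in.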

     
  5. Open Research Badges

This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data are available at https://github.com/SNAnderson/maizeTE_variation and https://mcstitzer.github.io/maize_TEs.

     