
Title: Bayesian regression analysis of skewed tensor responses

Tensor regression analysis is finding a growing range of applications in clinical settings, including neuroimaging, genomics, and dental medicine. The motivation for this paper is a study of periodontal disease (PD) with an order‐3 tensor response: multiple biomarkers measured at prespecified tooth sites within each tooth, for each participant. Careful investigation reveals considerable skewness in the responses, in addition to response missingness. To mitigate the shortcomings of existing analysis tools, we propose a new Bayesian tensor response regression method that facilitates interpretation of covariate effects on both marginal and joint distributions of highly skewed tensor responses, and accommodates missing‐at‐random responses under a closure property of our tensor model. Furthermore, we present a prudent evaluation of the overall covariate effects while identifying their possible variations on only a sparse subset of the tensor components. Our method provides Markov chain Monte Carlo (MCMC) tools that are readily implementable. We illustrate substantial advantages of our proposal over existing methods via simulation studies and application to a real data set derived from a clinical study of PD. The R package BSTN, available on GitHub, implements our model.

Page Range / eLocation ID:
p. 1814-1825
Sponsoring Org:
National Science Foundation
More Like this
  1. To link a clinical outcome with compositional predictors in microbiome analysis, the linear log‐contrast model is a popular choice, and an inference procedure for assessing the significance of each covariate is also available. However, when there are multiple, potentially interrelated outcomes and information on the taxonomic hierarchy of bacteria, a multivariate analysis method that considers the group structure of compositional covariates, together with an accompanying group inference method, is still lacking. Motivated by a study for identifying the microbes in the gut microbiome of preterm infants that impact their later neurobehavioral outcomes, we formulate a constrained integrative multi‐view regression. The neurobehavioral scores form multivariate responses, the log‐transformed sub‐compositional microbiome data form multi‐view feature matrices, and a set of linear constraints on their corresponding sub‐coefficient matrices ensures the sub‐compositional nature. We assume that all the sub‐coefficient matrices may be of low rank to enable joint selection and inference of sub‐compositions/views. We propose a scaled composite nuclear norm penalization approach for model estimation and develop a hypothesis testing procedure through de‐biasing to assess the significance of different views. Simulation studies confirm the effectiveness of the proposed procedure. We apply the method to the preterm infant study, and the identified microbes are mostly consistent with existing studies and biological understanding.

  2. Abstract

    Over the past decade, there has been growing enthusiasm for using electronic medical records (EMRs) for biomedical research. Quantile regression estimates distributional associations, providing unique insights into the intricacies and heterogeneity of EMR data. However, the widespread nonignorable missing observations in EMRs often obscure the true associations and limit their potential for robust biomedical discoveries. We propose a novel method to estimate the covariate effects in the presence of nonignorable missing responses under quantile regression. This method imposes no parametric specifications on response distributions; instead, it uses the implicit distributions induced by the corresponding quantile regression models. We show that the proposed estimator is consistent and asymptotically normal. We also provide an efficient algorithm to obtain the proposed estimate and a randomly weighted bootstrap approach for statistical inference. Numerical studies, including an empirical analysis of real‐world EMR data, are used to assess the proposed method's finite‐sample performance compared with existing methods in the literature.

  3. As in many other clinical and economic studies, each subject in our motivating transplant study is at risk of recurrent events of non-fatal tissue rejection as well as the terminating event of death due to total graft rejection. For such studies, our model and associated Bayesian analysis aim for some practical advantages over competing methods. Our semiparametric latent-class-based joint model has a coherent interpretation of the covariate (including race and gender) effects on all functions and model quantities that are relevant for understanding the effects of covariates on future event trajectories. Our fully Bayesian method for estimation and prediction uses a complete specification of the prior process of the baseline functions. We also derive a practical and theoretically justifiable partial likelihood-based semiparametric Bayesian approach for the analysis when prior information about the baseline functions is lacking. Our model and method can accommodate fixed as well as time-varying covariates. Our Markov chain Monte Carlo tools for both Bayesian methods are implementable via publicly available software. Our Bayesian analysis of the transplant study and a simulation study demonstrate the practical advantages and improved performance of our approach.
  4. Summary

    Food chain efficiency (FCE), the proportion of primary production converted to production of the top trophic level, can influence several ecosystem services as well as the biodiversity and productivity of each trophic level. Aquatic FCE is affected by light and nutrient supply, largely via effects on primary producer stoichiometry that propagate to herbivores and then carnivores. Here, we test the hypothesis that the identity of the top carnivore mediates FCE responses to changes in light and nutrient supply.

    We conducted a large‐scale, 6‐week mesocosm experiment in which we manipulated light and nutrient (nitrogen and phosphorus) supply and the identity of the carnivore in a 2 × 2 × 2 factorial design. We quantified the response of FCE and the biomass and productivity of each trophic level (phytoplankton, zooplankton, and carnivore). We used an invertebrate, Chaoborus americanus, and a vertebrate, bluegill sunfish (Lepomis macrochirus), as the two carnivores in this study because of the large difference in phosphorus requirements between these taxa.

    We predicted that bluegill would be more likely to experience P‐limitation due to higher P requirements, and hence that FCE would be lower in the bluegill treatments than in the Chaoborus treatments. We also expected the interactive effect of light and nutrients to be stronger in the bluegill treatments. Within a carnivore treatment, we predicted highest FCE under low light and high nutrient supply, as these conditions would produce high‐quality (low C:nutrient) algal resources. In contrast, if food quantity had a stronger effect on carnivore production than food quality, carnivore production would increase proportionally with primary production, and thus FCE would be similar across light and nutrient treatments.

    Carnivore identity mediated the effects of light and nutrients on FCE, and as predicted, FCE was higher in food chains with Chaoborus than with bluegill. Also as predicted, FCE in Chaoborus treatments was higher under low light. However, FCE in bluegill treatments was higher at high light supply, opposite to our predictions. In addition, bluegill production increased proportionally with primary production, while Chaoborus production was not correlated with primary production, suggesting that bluegill responded more strongly to food quantity than to food quality. These carnivore taxa differ in traits other than body stoichiometry, for example, feeding selectivity, which may have contributed to the observed differences in FCE between carnivores.

    Comparison of our results with those from previous experiments showed that FCE responds similarly to light and nutrients in food chains with Chaoborus and larval fish (gizzard shad: Clupeidae), but very differently in food chains with bluegill. These findings warrant further investigation into the mechanisms related to carnivore identity (e.g., developmental stage, feeding selectivity) underlying these responses, and highlight the importance of considering both top‐down and bottom‐up effects when evaluating food chain responses to changing light and nutrient conditions.

  5. Abstract

    With advances in biomedical research, biomarkers are becoming increasingly important prognostic factors for predicting overall survival, while the measurement of biomarkers is often censored due to instruments' lower limits of detection. This leads to two types of censoring: random censoring in overall survival outcomes and fixed censoring in biomarker covariates, posing new challenges in statistical modeling and inference. Existing methods for analyzing such data focus primarily on linear regression ignoring censored responses or semiparametric accelerated failure time models with covariates under detection limits (DL). In this paper, we propose a quantile regression for survival data with covariates subject to DL. Compared with existing methods, the proposed approach provides a more versatile tool for modeling the distribution of survival outcomes by allowing covariate effects to vary across conditional quantiles of the survival time and requiring no parametric distribution assumptions for outcome data. To estimate the quantile process of regression coefficients, we develop a novel multiple imputation approach based on another quantile regression for covariates under DL, avoiding the stringent parametric restrictions on censored covariates often assumed in the literature. Under regularity conditions, we show that the estimation procedure yields uniformly consistent and asymptotically normal estimators. Simulation results demonstrate the satisfactory finite‐sample performance of the method. We also apply our method to the motivating data from a study of genetic and inflammatory markers of sepsis.
