This content will become publicly available on December 1, 2025

Title: Flexible Bayesian modeling for longitudinal binary and ordinal responses
Abstract: Longitudinal studies with binary or ordinal responses are widely encountered in various disciplines, where the primary focus is on the temporal evolution of the probability of each response category. Traditional approaches build from the generalized mixed effects modeling framework. Even amplified with nonparametric priors placed on the fixed or random effects, such models are restrictive due to the implied assumptions on the marginal expectation and covariance structure of the responses. We tackle the problem from a functional data analysis perspective, treating the observations for each subject as realizations from subject-specific stochastic processes at the measured times. We develop the methodology focusing initially on binary responses, for which we assume the stochastic processes have Binomial marginal distributions. Leveraging the logits representation, we model the discrete space processes through continuous space processes. We utilize a hierarchical framework to model the mean and covariance kernel of the continuous space processes nonparametrically and simultaneously through a Gaussian process prior and an Inverse-Wishart process prior, respectively. The prior structure results in flexible inference for the evolution and correlation of binary responses, while allowing for borrowing of strength across all subjects. The modeling approach can be naturally extended to ordinal responses. Here, the continuation-ratio logits factorization of the multinomial distribution is key for efficient modeling and inference, including a practical way of dealing with unbalanced longitudinal data. The methodology is illustrated with synthetic data examples and an analysis of college students’ mental health status data.
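The continuation-ratio logits factorization mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sigmoid`, `continuation_ratio_probs`, and `binary_splits` are hypothetical, categories are indexed from 0, and the logits are plain numbers rather than draws from the Gaussian process model described above. The sketch only shows the standard identity that C - 1 continuation-ratio logits determine C category probabilities, and that one ordinal observation is equivalent to a short sequence of binary outcomes.

```python
import math


def sigmoid(z):
    """Inverse logit: maps a real-valued logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def continuation_ratio_probs(logits):
    """Map C-1 continuation-ratio logits to C category probabilities.

    logits[j] is the logit of P(Y = j | Y >= j) for j = 0, ..., C-2.
    """
    probs = []
    remaining = 1.0  # running value of P(Y >= j)
    for z in logits:
        pi_j = sigmoid(z)               # P(Y = j | Y >= j)
        probs.append(remaining * pi_j)  # P(Y = j)
        remaining *= 1.0 - pi_j         # P(Y >= j + 1)
    probs.append(remaining)             # last category takes the remaining mass
    return probs


def binary_splits(category, n_categories):
    """Rewrite one ordinal outcome as sequential binary outcomes.

    Returns (j, outcome) pairs, where outcome is 1 if Y = j given Y >= j
    and 0 otherwise. The sequence stops once the category is reached.
    """
    splits = []
    for j in range(n_categories - 1):
        if category == j:
            splits.append((j, 1))
            break
        splits.append((j, 0))
    return splits
```

For example, with two logits equal to zero, `continuation_ratio_probs([0.0, 0.0])` gives `[0.5, 0.25, 0.25]`, and `binary_splits(1, 3)` gives `[(0, 0), (1, 1)]`: the observation is not in category 0, and it is in category 1 given that it is at least 1. Each binary split can then be modeled with its own logit, which is what lets the binary-response machinery carry over to ordinal responses.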
Award ID(s):
1950902
PAR ID:
10569771
Author(s) / Creator(s):
;
Publisher / Repository:
Springer
Date Published:
Journal Name:
Statistics and Computing
Volume:
34
Issue:
6
ISSN:
0960-3174
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we study several profile estimation methods for the generalized semiparametric varying-coefficient additive model for longitudinal data by utilizing the within-subject correlations. The model is flexible in allowing time-varying effects for some covariates and constant effects for others, and in having the option to choose different link functions, which can be used to analyze both discrete and continuous longitudinal responses. We investigated the profile generalized estimating equation (GEE) approaches and the profile quadratic inference function (QIF) approach. The profile estimations are assisted with the local linear smoothing technique to estimate the time-varying effects. Several approaches that incorporate the within-subject correlations are investigated, including the quasi-likelihood (QL), the minimum generalized variance (MGV), the quadratic inference function and the weighted least squares (WLS). The proposed estimation procedures can accommodate flexible sampling schemes. These methods provide a unified approach that works well for discrete longitudinal responses as well as for continuous longitudinal responses. Finite sample performances of these methods are examined through Monte Carlo simulations under various correlation structures for both discrete and continuous longitudinal responses. The simulation results show efficiency improvement over the working independence approach by utilizing the within-subject correlations as well as comparative performances of different approaches.
  2. To monitor the dynamic behavior of degrading systems over time, a flexible hierarchical discrete-time state-space model (SSM) is introduced that can mathematically characterize the stochastic evolution of the latent states (discrete, continuous, or hybrid) of degrading systems, dynamic measurements collected from condition monitoring sources (e.g., sensors with mixed-type outputs), and the failure process. This flexible SSM is inspired by Bayesian hierarchical modeling and recurrent neural networks without imposing prior knowledge regarding the stochastic structure of the system dynamics and its variables. The temporal behavior of degrading systems and the relationship between variables of the corresponding system dynamics are fully characterized by stochastic neural networks without having to define parametric relationships/distributions between deterministic and stochastic variables. A Bayesian filtering-based learning method is introduced to train the structure of the proposed framework with historical data. Also, the steps to utilize the proposed framework for inference and prediction of the latent states and sensor outputs are discussed. Numerical experiments are provided to demonstrate the application of the proposed framework for degradation system modeling and monitoring.
  3. The log‐Gaussian Cox process is a flexible and popular stochastic process for modeling point patterns exhibiting spatial and space‐time dependence. Model fitting requires approximation of stochastic integrals which is implemented through discretization over the domain of interest. With fine scale discretization, inference based on Markov chain Monte Carlo is computationally burdensome because of the cost of matrix decompositions and storage, such as the Cholesky, for high dimensional covariance matrices associated with latent Gaussian variables. This article addresses these computational bottlenecks by combining two recent developments: (i) a data augmentation strategy that has been proposed for space‐time Gaussian Cox processes that is based on exact Bayesian inference and does not require fine grid approximations for infinite dimensional integrals, and (ii) a recently developed family of sparsity‐inducing Gaussian processes, called nearest‐neighbor Gaussian processes, to avoid expensive matrix computations. Our inference is delivered within the fully model‐based Bayesian paradigm and does not sacrifice the richness of traditional log‐Gaussian Cox processes. We apply our method to crime event data in San Francisco and investigate the recovery of the intensity surface. 
  4. Many organisms leave evidence of their former occurrence, such as scat, abandoned burrows, middens, ancient eDNA or fossils, which indicate areas from which a species has since disappeared. However, combining this evidence with contemporary occurrences within a single modeling framework remains challenging. Traditional binary species‐distribution modeling reduces occurrence to two temporally coarse states (present/absent), and thus cannot leverage the information inherent in temporal sequences of evidence of past occurrence. In contrast, ordinal modeling can use the natural time‐varying order of states (e.g. never occupied versus previously occupied versus currently occupied) to provide greater insights into range shifts. We demonstrate the power of ordinal modeling for identifying the major influences of biogeographic and climatic variables on current and past occupancy of the American pika Ochotona princeps, a climate‐sensitive mammal. Sampling over five years across the species' southernmost, warm‐edge range limit, we tested the effects of these variables at 570 habitat patches where occurrence was classified either as binary or ordinal. The two analyses produced different top models and predictors: ordinal modeling highlighted chronic cold as the most‐important predictor of occurrence, whereas binary modeling indicated primacy of average summer‐long temperatures. Colder wintertime temperatures were associated in ordinal models with higher likelihood of occurrence, which we hypothesize reflects longer retention of insulative and meltwater‐provisioning snowpacks. Our binary results mirrored those of other past pika investigations employing binary analysis, wherein warmer temperatures decrease likelihood of occurrence. Because both ordinal‐ and binary‐analysis top models included climatic and biogeographic factors, results constitute important considerations for climate‐adaptation planning.
     Cross‐time evidence of species occurrence remains underutilized for assessing responses to climate change. Compared to multi‐state occupancy modeling, which presumes all states occur in the same time period, ordinal models enable use of historical evidence of species' occurrence to identify factors driving species' distributions more finely across time.
  5. Gaussian processes offer an attractive framework for predictive modeling from longitudinal data, i.e., irregularly sampled, sparse observations from a set of individuals over time. However, such methods have two key shortcomings: (i) they rely on ad hoc heuristics or expensive trial and error to choose effective kernels, and (ii) they fail to handle multilevel correlation structure in the data. We introduce Longitudinal deep kernel Gaussian process regression (L-DKGPR) to overcome these limitations by fully automating the discovery of complex multilevel correlation structure from longitudinal data. Specifically, L-DKGPR eliminates the need for ad hoc heuristics or trial and error using a novel adaptation of deep kernel learning that combines the expressive power of deep neural networks with the flexibility of non-parametric kernel methods. L-DKGPR effectively learns the multilevel correlation with a novel additive kernel that simultaneously accommodates both time-varying and time-invariant effects. We derive an efficient algorithm to train L-DKGPR using latent space inducing points and variational inference. Results of extensive experiments on several benchmark data sets demonstrate that L-DKGPR significantly outperforms the state-of-the-art longitudinal data analysis (LDA) methods.