skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Robust Bayesian growth curve modelling using conditional medians
Growth curve models have been widely used to analyse longitudinal data in social and behavioural sciences. Although growth curve models with normality assumptions are relatively easy to estimate, practical data are rarely normal. Failing to account for non‐normal data may lead to unreliable model estimation and misleading statistical inference. In this work, we propose a robust approach for growth curve modelling using conditional medians that are less sensitive to outlying observations. Bayesian methods are applied for model estimation and inference. Based on the existing work on Bayesian quantile regression using asymmetric Laplace distributions, we use asymmetric Laplace distributions to convert the problem of estimating a median growth curve model into a problem of obtaining the maximum likelihood estimator for a transformed model. Monte Carlo simulation studies have been conducted to evaluate the numerical performance of the proposed approach with data containing outliers or leverage observations. The results show that the proposed approach yields more accurate and efficient parameter estimates than traditional growth curve modelling. We illustrate the application of our robust approach using conditional medians based on a real data set from the Virginia Cognitive Aging Project.  more » « less
Award ID(s):
1951038
PAR ID:
10452749
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
British Journal of Mathematical and Statistical Psychology
Volume:
74
Issue:
2
ISSN:
0007-1102
Format(s):
Medium: X Size: p. 286-312
Size(s):
p. 286-312
Sponsoring Org:
National Science Foundation
More Like this
  1. Growth curve models (GCMs), with their ability to directly investigate within-subject change over time and between-subject differences in change for longitudinal data, are widely used in social and behavioral sciences. While GCMs are typically studied with the normal distribution assumption, empirical data often violate the normality assumption in applications. Failure to account for the deviation from normality in data distribution may lead to unreliable model estimation and misleading statistical inferences. A robust GCM based on conditional medians was recently proposed and outperformed traditional growth curve modeling when outliers are present resulting in nonnormality. However, this robust approach was shown to perform less satisfactorily when leverage observations existed. In this work, we propose a robust double medians growth curve modeling approach (DOME GCM) to thoroughly disentangle the influence of data contamination on model estimation and inferences, where two conditional medians are employed for the distributions of the within-subject measurement errors and of random effects, respectively. Model estimation and inferences are conducted in the Bayesian framework, and Laplace distributions are used to convert the optimization problem of median estimation into a problem of obtaining the maximum likelihood estimator for a transformed model. A Monte Carlo simulation study has been conducted to evaluate the numerical performance of the proposed approach, and showed that the proposed approach yields more accurate and efficient parameter estimates when data contain outliers or leverage observations. The application of the developed robust approach is illustrated using a real dataset from the Virginia Cognitive Aging Project to study the change of memory ability. 
    more » « less
  2. We combine weighting and Bayesian prediction in a unified approach to survey inference. The general principles of Bayesian analysis imply that models for survey outcomes should be conditional on all variables that affect the probability of inclusion. We incorporate all the variables that are used in the weighting adjustment under the framework of multilevel regression and poststratification, as a byproduct generating model-based weights after smoothing. We improve small area estimation by dealing with different complex issues caused by real-life applications to obtain robust inference at finer levels for subdomains of interest. We investigate deep interactions and introduce structured prior distributions for smoothing and stability of estimates. The computation is done via Stan and is implemented in the open-source R package rstanarm and available for public use. We evaluate the design-based properties of the Bayesian procedure. Simulation studies illustrate how the model-based prediction and weighting inference can outperform classical weighting. We apply the method to the New York Longitudinal Study of Wellbeing. The new approach generates smoothed weights and increases efficiency for robust finite population inference, especially for subsets of the population. 
    more » « less
  3. null (Ed.)
    Growth mixture modeling is a popular analytic tool for longitudinal data analysis. It detects latent groups based on the shapes of growth trajectories. Traditional growth mixture modeling assumes that outcome variables are normally distributed within each class. When data violate this normality assumption, however, it is well documented that the traditional growth mixture modeling mislead researchers in determining the number of latent classes as well as in estimating parameters. To address nonnormal data in growth mixture modeling, robust methods based on various nonnormal distributions have been developed. As a new robust approach, growth mixture modeling based on conditional medians has been proposed. In this article, we present the results of two simulation studies that evaluate the performance of the median-based growth mixture modeling in identifying the correct number of latent classes when data follow the normality assumption or have outliers. We also compared the performance of the median-based growth mixture modeling to the performance of traditional growth mixture modeling as well as robust growth mixture modeling based on t distributions. For identifying the number of latent classes in growth mixture modeling, the following three Bayesian model comparison criteria were considered: deviance information criterion, Watanabe-Akaike information criterion, and leave-one-out cross validation. For the median-based growth mixture modeling and t -based growth mixture modeling, our results showed that they maintained quite high model selection accuracy across all conditions in this study (ranged from 87 to 100%). In the traditional growth mixture modeling, however, the model selection accuracy was greatly influenced by the proportion of outliers. When sample size was 500 and the proportion of outliers was 0.05, the correct model was preferred in about 90% of the replications, but the percentage dropped to about 40% as the proportion of outliers increased to 0.15. 
    more » « less
  4. The actual failure times of individual components are usually unavailable in many applications. Instead, only aggregate failure-time data are collected by actual users, due to technical and/or economic reasons. When dealing with such data for reliability estimation, practitioners often face the challenges of selecting the underlying failure-time distributions and the corresponding statistical inference methods. So far, only the exponential, normal, gamma and inverse Gaussian distributions have been used in analyzing aggregate failure-time data, due to these distributions having closed-form expressions for such data. However, the limited choices of probability distributions cannot satisfy extensive needs in a variety of engineering applications. PHase-type (PH) distributions are robust and flexible in modeling failure-time data, as they can mimic a large collection of probability distributions of non-negative random variables arbitrarily closely by adjusting the model structures. In this article, PH distributions are utilized, for the first time, in reliability estimation based on aggregate failure-time data. A Maximum Likelihood Estimation (MLE) method and a Bayesian alternative are developed. For the MLE method, an Expectation-Maximization algorithm is developed for parameter estimation, and the corresponding Fisher information is used to construct the confidence intervals for the quantities of interest. For the Bayesian method, a procedure for performing point and interval estimation is also introduced. Numerical examples show that the proposed PH-based reliability estimation methods are quite flexible and alleviate the burden of selecting a probability distribution when the underlying failure-time distribution is general or even unknown. 
    more » « less
  5. Summary Conditional density estimation seeks to model the distribution of a response variable conditional on covariates. We propose a Bayesian partition model using logistic Gaussian processes to perform conditional density estimation. The partition takes the form of a Voronoi tessellation and is learned from the data using a reversible jump Markov chain Monte Carlo algorithm. The methodology models data in which the density changes sharply throughout the covariate space, and can be used to determine where important changes in the density occur. The Markov chain Monte Carlo algorithm involves a Laplace approximation on the latent variables of the logistic Gaussian process model which marginalizes the parameters in each partition element, allowing an efficient search of the approximate posterior distribution of the tessellation. The method is consistent when the density is piecewise constant in the covariate space or when the density is Lipschitz continuous with respect to the covariates. In simulation and application to wind turbine data, the model successfully estimates the partition structure and conditional distribution. 
    more » « less