Title: An Exploratory Statistical Cusp Catastrophe Model
The Cusp Catastrophe Model provides a promising approach for health and behavioral researchers to investigate both continuous and sudden (quantum) changes within one modeling framework. However, application of the model has been hindered by unresolved issues in fitting the statistical model to data. This paper reports our exploratory work in developing a new approach to statistical cusp catastrophe modeling. In this new approach, the Cusp Catastrophe Model is cast as a statistical nonlinear regression for parameter estimation. Algorithms based on the delay convention and the Maxwell convention are applied to obtain parameter estimates by maximum likelihood estimation. Through a series of simulation studies, we demonstrate that (a) parameter estimation of this statistical cusp model is unbiased, and (b) use of a bootstrapping procedure enables efficient statistical inference. To test the utility of this new method, we analyze survey data collected for an NIH-funded project providing HIV-prevention education to adolescents in the Bahamas. We found that the results can be explained more reasonably by our approach than by other existing methods. Additional research is needed to establish this new approach as the most reliable method for fitting the cusp catastrophe model. Further research should focus on additional theoretical analysis, extension of the model to categorical and count data, and additional applications to different data types.
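As a rough illustration of how the model can be cast as a nonlinear regression, the sketch below (Python; not the authors' code) treats the asymmetry and bifurcation controls as linear functions of two hypothetical predictors x1 and x2, selects the equilibrium of the cusp potential under the Maxwell convention, maximizes a Gaussian likelihood, and uses a percentile bootstrap for inference. The predictor names, linear forms, and error model are illustrative assumptions; the delay convention would additionally require each subject's previous state and is omitted here.

    import numpy as np
    from scipy.optimize import minimize

    def maxwell_equilibrium(alpha, beta):
        # Real roots of the equilibrium equation alpha + beta*y - y^3 = 0.
        roots = np.roots([1.0, 0.0, -beta, -alpha])
        real = roots[np.abs(roots.imag) < 1e-8].real
        # Maxwell convention: keep the root minimizing the cusp potential
        # V(y) = y^4/4 - beta*y^2/2 - alpha*y.
        V = real**4 / 4 - beta * real**2 / 2 - alpha * real
        return real[np.argmin(V)]

    def neg_log_lik(theta, x1, x2, y):
        a0, a1, b0, b1, log_sigma = theta
        alpha = a0 + a1 * x1          # asymmetry control (assumed linear in x1)
        beta = b0 + b1 * x2           # bifurcation control (assumed linear in x2)
        yhat = np.array([maxwell_equilibrium(a, b) for a, b in zip(alpha, beta)])
        sigma = np.exp(log_sigma)
        return 0.5 * np.sum(((y - yhat) / sigma) ** 2) + len(y) * np.log(sigma)

    def fit_cusp(x1, x2, y, theta0=None):
        theta0 = np.zeros(5) if theta0 is None else theta0
        return minimize(neg_log_lik, theta0, args=(x1, x2, y), method="Nelder-Mead")

    def bootstrap_ci(x1, x2, y, n_boot=200, seed=0):
        # Percentile bootstrap: refit the model on resampled rows of the data.
        rng = np.random.default_rng(seed)
        fits = []
        for _ in range(n_boot):
            idx = rng.integers(0, len(y), len(y))
            fits.append(fit_cusp(x1[idx], x2[idx], y[idx]).x)
        return np.percentile(fits, [2.5, 97.5], axis=0)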
Award ID(s):
1633212
PAR ID:
10039228
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2016 IEEE International Conference on Data Science and Advanced Analytics
Page Range / eLocation ID:
100 to 109
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent research applies soft computing techniques to fit software reliability growth models. However, runtime performance and the distribution of the distance from an optimal solution over multiple runs must be explicitly considered to justify the practical utility of these approaches, promote comparison, and support reproducible research. This paper presents a meta-optimization framework for designing stable and efficient multi-phase algorithms to fit software reliability growth models. The approach combines initial parameter estimation techniques from statistical algorithms, the global search properties of soft computing, and the rapid convergence of numerical methods. Designs that exhibit the best balance between runtime performance and accuracy are identified. The approach is illustrated with nonhomogeneous Poisson process and covariate software reliability growth models, including a cross-validation step on data sets not used to identify the designs. The results indicate that the nonhomogeneous Poisson process model considered is too simple to benefit from soft computing, since it incurs additional runtime with no gain in accuracy. However, a multi-phase design for the covariate software reliability growth model, consisting of the bat algorithm followed by a numerical method, achieves better performance and converges more consistently than a numerical method alone. The proposed approach supports fitting higher-dimensional covariate software reliability growth models suitable for implementation in a tool.
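     A minimal sketch of the multi-phase idea described above, using the Goel-Okumoto NHPP model as a stand-in and scipy's differential evolution in place of the bat algorithm (both substitutions are illustrative assumptions): a bounded global search is followed by rapid local refinement of the same log-likelihood.

        import numpy as np
        from scipy.optimize import differential_evolution, minimize

        def neg_log_lik(params, times, T):
            # Goel-Okumoto NHPP: intensity a*b*exp(-b*t), mean m(T) = a*(1 - exp(-b*T)).
            a, b = params
            if a <= 0 or b <= 0:
                return np.inf
            return -(np.sum(np.log(a * b) - b * times) - a * (1 - np.exp(-b * T)))

        def fit_two_phase(times, T):
            bounds = [(1e-6, 1e4), (1e-6, 10.0)]
            # Phase 1: global search over the bounded parameter space.
            coarse = differential_evolution(neg_log_lik, bounds, args=(times, T), seed=1)
            # Phase 2: numerical refinement starting from the phase-1 solution.
            return minimize(neg_log_lik, coarse.x, args=(times, T),
                            method="L-BFGS-B", bounds=bounds)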
  2. The graphon (W-graph), which includes the stochastic block model as a special case, has been widely used in modeling and analyzing network data. Estimation of the graphon function has attracted considerable recent research interest. Most existing works focus on inference in the latent space of the model, while adopting simple maximum likelihood or Bayesian estimates for the graphon or connectivity parameters given the identified latent variables. In this work, we propose a hierarchical model and develop a novel empirical Bayes estimate of the connectivity matrix of a stochastic block model to approximate the graphon function. Based on our hierarchical model, we further introduce a new model selection criterion for choosing the number of communities. Numerical results on extensive simulations and two well-annotated social networks demonstrate the superiority of our approach in terms of parameter estimation and model selection.
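     The following is a simplified, hypothetical sketch of the empirical-Bayes flavor of such an estimate (not the paper's hierarchical model): given community labels z, raw block edge proportions of a stochastic block model are shrunk toward a Beta prior whose hyperparameters are moment-matched across blocks.

        import numpy as np

        def eb_connectivity(A, z, K):
            # A: symmetric 0/1 adjacency matrix (zero diagonal); z: labels in {0,...,K-1}.
            counts, totals = np.zeros((K, K)), np.zeros((K, K))
            for k in range(K):
                for l in range(K):
                    block = A[np.ix_(z == k, z == l)]
                    n_k, n_l = np.sum(z == k), np.sum(z == l)
                    counts[k, l] = block.sum() / (2 if k == l else 1)
                    totals[k, l] = n_k * (n_k - 1) / 2 if k == l else n_k * n_l
            p_hat = counts / np.maximum(totals, 1)
            # Moment-match a Beta(a, b) prior to the spread of the raw block estimates.
            m, v = p_hat.mean(), max(p_hat.var(), 1e-6)
            common = max(m * (1 - m) / v - 1, 1e-3)
            a, b = m * common, (1 - m) * common
            # Posterior-mean (shrunken) connectivity estimates.
            return (counts + a) / (totals + a + b)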
  3. Inverse problems play a central role in data analysis across the sciences. Many techniques and algorithms provide parameter estimation, including the best-fitting model and the parameter statistics. Here, we concern ourselves with the robustness of parameter estimation under constraints, with a focus on the assimilation of noisy data with potential outliers, a situation all too familiar in Earth science, particularly in the analysis of remote-sensing data. We assume a linear, or linearized, forward model relating the model parameters to multiple data sets whose a priori unknown uncertainties remain to be characterized. This is relevant for global navigation satellite system and synthetic aperture radar data, which involve intricate processing for which uncertainty estimates are not available. The model is constrained by additional equalities and inequalities resulting from the physics of the problem, but the weights of the equalities are unknown. We formulate the problem from a Bayesian perspective with non-informative priors. The posterior distribution of the model parameters, weights, and outliers conditioned on the observations is then inferred via Gibbs sampling. We demonstrate the practical utility of the method on a set of challenging inverse problems with both synthetic and real space-geodetic data associated with earthquakes and nuclear explosions. We provide the associated computer codes and expect the approach to be of practical interest for a wide range of applications.
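     A stripped-down sketch of the Gibbs-sampling idea (assuming Gaussian noise, a Jeffreys-type prior on each data set's variance, and omitting the equality/inequality constraints and outlier indicators that the paper handles): the sampler alternates between the model parameters and the per-data-set noise variances.

        import numpy as np

        def gibbs_linear(G_list, d_list, n_iter=2000, seed=0):
            rng = np.random.default_rng(seed)
            p = G_list[0].shape[1]
            m, sig2 = np.zeros(p), np.ones(len(G_list))
            samples = []
            for _ in range(n_iter):
                # m | sigma^2, d : Gaussian from the precision-weighted normal equations.
                A = sum(G.T @ G / s2 for G, s2 in zip(G_list, sig2))
                b = sum(G.T @ d / s2 for G, d, s2 in zip(G_list, d_list, sig2))
                cov = np.linalg.inv(A)
                m = rng.multivariate_normal(cov @ b, cov)
                # sigma_j^2 | m, d_j : inverse-gamma, sampled via its reciprocal.
                for j, (G, d) in enumerate(zip(G_list, d_list)):
                    r = d - G @ m
                    sig2[j] = 1.0 / rng.gamma(len(d) / 2, 2.0 / (r @ r))
                samples.append((m.copy(), sig2.copy()))
            return samples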
  4. The actual failure times of individual components are usually unavailable in many applications. Instead, only aggregate failure-time data are collected by actual users, for technical and/or economic reasons. When dealing with such data for reliability estimation, practitioners often face the challenge of selecting the underlying failure-time distribution and the corresponding statistical inference method. So far, only the exponential, normal, gamma, and inverse Gaussian distributions have been used in analyzing aggregate failure-time data, because these distributions have closed-form expressions for such data. However, this limited choice of probability distributions cannot satisfy the extensive needs of a variety of engineering applications. Phase-type (PH) distributions are robust and flexible in modeling failure-time data, as they can mimic a large collection of probability distributions of non-negative random variables arbitrarily closely by adjusting the model structure. In this article, PH distributions are utilized, for the first time, in reliability estimation based on aggregate failure-time data. A Maximum Likelihood Estimation (MLE) method and a Bayesian alternative are developed. For the MLE method, an Expectation-Maximization algorithm is developed for parameter estimation, and the corresponding Fisher information is used to construct confidence intervals for the quantities of interest. For the Bayesian method, a procedure for point and interval estimation is also introduced. Numerical examples show that the proposed PH-based reliability estimation methods are quite flexible and alleviate the burden of selecting a probability distribution when the underlying failure-time distribution is general or even unknown.
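     As a small illustration of why the closed-form cases mentioned above are convenient, the sketch below fits a gamma model to aggregate records (n_i pooled failures with total time s_i), exploiting the fact that a sum of n_i i.i.d. Gamma(k, theta) failure times is Gamma(n_i*k, theta); the PH-based EM and Bayesian procedures of the article generalize this idea and are not reproduced here.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.stats import gamma

        def neg_log_lik(params, n, s):
            # n[i]: failures pooled in record i; s[i]: their aggregate failure time.
            k, theta = np.exp(params)          # optimize on the log scale
            return -np.sum(gamma.logpdf(s, a=n * k, scale=theta))

        def fit_aggregate_gamma(n, s):
            n, s = np.asarray(n, float), np.asarray(s, float)
            res = minimize(neg_log_lik, x0=[0.0, 0.0], args=(n, s), method="Nelder-Mead")
            return np.exp(res.x)               # (shape k, scale theta)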
  5. With advances in biomedical research, biomarkers are becoming increasingly important prognostic factors for predicting overall survival, while the measurement of biomarkers is often censored due to instruments' lower limits of detection. This leads to two types of censoring: random censoring in overall survival outcomes and fixed censoring in biomarker covariates, posing new challenges for statistical modeling and inference. Existing methods for analyzing such data focus primarily on linear regression that ignores censored responses or on semiparametric accelerated failure time models with covariates under detection limits (DL). In this paper, we propose a quantile regression for survival data with covariates subject to DL. Compared to existing methods, the proposed approach provides a more versatile tool for modeling the distribution of survival outcomes by allowing covariate effects to vary across conditional quantiles of the survival time and by requiring no parametric distributional assumptions for the outcome data. To estimate the quantile process of the regression coefficients, we develop a novel multiple imputation approach based on another quantile regression for the covariates under DL, avoiding the stringent parametric restrictions on censored covariates often assumed in the literature. Under regularity conditions, we show that the estimation procedure yields uniformly consistent and asymptotically normal estimators. Simulation results demonstrate the satisfactory finite-sample performance of the method. We also apply our method to the motivating data from a study of genetic and inflammatory markers of sepsis.
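     A simplified, hypothetical sketch of the multiple-imputation idea for a covariate under a detection limit (ignoring censoring in the survival outcome, which the paper's method accommodates, and using a crude parametric tail model in place of the paper's covariate quantile regression): below-DL biomarker values are drawn from a truncated lognormal fit, and quantile-regression coefficients are averaged across imputations.

        import numpy as np
        import statsmodels.api as sm
        from scipy.stats import lognorm

        def mi_quantreg(y, x_bio, x_other, dl, q=0.5, n_imp=20, seed=0):
            # Non-detects are assumed to be recorded as values below dl.
            y, x_bio = np.asarray(y, float), np.asarray(x_bio, float)
            rng = np.random.default_rng(seed)
            shape, loc, scale = lognorm.fit(x_bio[x_bio >= dl], floc=0)  # crude tail model
            p_dl = lognorm.cdf(dl, shape, loc, scale)
            below = x_bio < dl
            coefs = []
            for _ in range(n_imp):
                x_imp = x_bio.copy()
                # Draw below-DL values from the fitted distribution truncated at the DL.
                u = rng.uniform(0.0, p_dl, size=below.sum())
                x_imp[below] = lognorm.ppf(u, shape, loc, scale)
                X = sm.add_constant(np.column_stack([x_imp, np.asarray(x_other, float)]))
                coefs.append(sm.QuantReg(np.log(y), X).fit(q=q).params)
            return np.mean(coefs, axis=0)      # pooled point estimates across imputations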