Abstract This article proposes a new statistical model to infer interpretable population-level preferences from ordinal comparison data. Such data is ubiquitous, e.g., ranked choice votes, top-10 movie lists, and pairwise sports outcomes. Traditional statistical inference on ordinal comparison data results in an overall ranking of objects, e.g., from best to worst, with each object having a unique rank. However, the ranks of some objects may not be statistically distinguishable. This could happen due to insufficient data or to the true underlying object qualities being equal. Because uncertainty communication in estimates of overall rankings is notoriously difficult, we take a different approach and allow groups of objects to have equal ranks or be rank-clustered in our model. Existing models related to rank-clustering are limited by their inability to handle a variety of ordinal data types, to quantify uncertainty, or by the need to pre-specify the number and size of potential rank-clusters. We solve these limitations through our proposed Bayesian Rank-Clustered Bradley–Terry–Luce (BTL) model. We accommodate rank-clustering via parameter fusion by imposing a novel spike-and-slab prior on object-specific worth parameters in the BTL family of distributions for ordinal comparisons. We demonstrate rank-clustering on simulated and real datasets in surveys, elections, and sports analytics.
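As a rough illustration of the BTL family mentioned in the abstract, the sketch below computes pairwise and ranking probabilities from worth parameters using the standard Bradley–Terry–Luce (Plackett–Luce) form. The function names, object labels, and worth values are hypothetical, and this is not the authors' spike-and-slab model; it only shows that objects whose worths are fused to the same value become interchangeable in the likelihood.

```python
import math


def pairwise_prob(worth_i, worth_j):
    """Probability that object i is preferred to object j under BTL."""
    return worth_i / (worth_i + worth_j)


def ranking_log_likelihood(ranking, worth):
    """Log-likelihood of one ranking (best first) under the Plackett-Luce form.

    `ranking` may be a top-k list; objects in `worth` that are not ranked
    are treated as available but never chosen.
    """
    remaining = sum(worth.values())  # total worth of objects still available
    loglik = 0.0
    for obj in ranking:
        loglik += math.log(worth[obj] / remaining)
        remaining -= worth[obj]
    return loglik


# Hypothetical worths: B and C share one value, i.e. they form a rank-cluster.
worth = {"A": 3.0, "B": 1.0, "C": 1.0}

print(pairwise_prob(worth["B"], worth["C"]))          # tie within the cluster
print(ranking_log_likelihood(["A", "B", "C"], worth))
print(ranking_log_likelihood(["A", "C", "B"], worth))  # same value as the line above
```

Swapping B and C leaves the log-likelihood unchanged, which is the sense in which equal worths correspond to equal ranks.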
Predicting distributions of physical activity profiles in the National Health and Nutrition Examination Survey database using a partially linear Fréchet single index model
Summary Object-oriented data analysis is a fascinating and evolving field in modern statistical science, with the potential to make significant contributions to biomedical applications. This statistical framework facilitates the development of new methods to analyze complex data objects that capture more information than traditional clinical biomarkers. This paper applies the object-oriented framework to analyze physical activity levels, measured by accelerometers, as response objects in a regression model. Unlike traditional summary metrics, we utilize a recently proposed representation of physical activity data as a distributional object, providing a more nuanced and complete profile of individual energy expenditure across all ranges of monitoring intensity. A novel hybrid Fréchet regression model is proposed and applied to US population accelerometer data from National Health and Nutrition Examination Survey (NHANES) 2011 to 2014. The semi-parametric nature of the model allows for the inclusion of nonlinear effects for critical variables, such as age, which are biologically known to have subtle impacts on physical activity. Simultaneously, the inclusion of linear effects preserves interpretability for other variables, particularly categorical covariates such as ethnicity and sex. The results obtained are valuable from a public health perspective and could lead to new strategies for optimizing physical activity interventions in specific American subpopulations.
- Award ID(s):
- 2310943
- PAR ID:
- 10592739
- Publisher / Repository:
- Oxford University Press
- Date Published:
- Journal Name:
- Biostatistics
- Volume:
- 26
- Issue:
- 1
- ISSN:
- 1468-4357
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Background/Objective: Environmental exposures, such as heavy metals, can significantly affect physical activity, an important determinant of health. This study explores the effect of combined exposure to cadmium, lead, and mercury (metals) on physical activity, using data from the 2013–2014 National Health and Nutrition Examination Survey (NHANES). Methods: Physical activity was measured with ActiGraph GT3X+ devices worn continuously for 7 days, while blood samples were analyzed for metal content using inductively coupled plasma mass spectrometry. Descriptive statistics and multivariable linear regression were used to assess the impact of multi-metal exposure on physical activity. Additionally, Bayesian Kernel Machine Regression (BKMR) was applied to explore nonlinear and interactive effects of metal exposures on physical activity. Using a Gaussian process with a radial basis function kernel, BKMR estimates posterior distributions via Markov Chain Monte Carlo (MCMC) sampling, allowing for robust evaluation of individual and combined exposure-response relationships. Posterior Inclusion Probabilities (PIPs) were calculated to quantify the relative importance of each metal. Results: The linear regression analysis revealed positive associations between cadmium and lead exposure and physical activity. BKMR analysis, particularly the PIP, identified lead as the most influential metal in predicting physical activity, followed by cadmium and mercury. These PIP values provide a probabilistic measure of each metal’s importance, offering deeper insights into their relative contributions to the overall exposure effect. The study also uncovered complex relationships between metal exposures and physical activity. In univariate BKMR exposure-response analysis, lead and cadmium generally showed positive associations with physical activity, while mercury exhibited a slightly negative relationship. Bivariate exposure-response analysis further illustrated how the impact of one metal could be influenced by the presence and levels of another, confirming the trends observed in univariate analyses while also demonstrating the complex ways in which varying doses of two metals relate to increased or decreased physical activity. Additionally, the overall exposure effect analysis across different quantiles revealed that higher levels of combined metal exposures were associated with increased physical activity, though there was greater uncertainty at higher exposure levels, as the 95% credible intervals were wider. Conclusions: Overall, this study fills a critical gap by investigating the interactive and combined effects of multiple metals on physical activity. The findings underscore the necessity of using advanced methods such as BKMR to capture the complex dynamics of environmental exposures and their impact on human behavior and health outcomes.
-
This article extends recent work in magnetic manipulation of conductive, nonmagnetic objects using rotating magnetic dipole fields. Eddy-current-based manipulation provides a contact-free way to manipulate metallic objects. We are particularly motivated by the large amount of aluminum in space debris. We previously demonstrated dexterous manipulation of solid spheres with all object parameters known a priori. This work expands the previous model, which contained three discrete modes, to a continuous model that covers all possible relative positions of the manipulated spherical object with respect to the magnetic field source. We further leverage this new model to examine manipulation of spherical objects with unknown physical parameters by applying techniques from the online-optimization and adaptive-control literature. Our experimental results validate our new dynamics model, showing that we get improved performance compared to the previously proposed model, while also solving a simpler optimization problem for control. We further demonstrate the first physical magnetic manipulation of aluminum spheres, as previous controllers were only physically validated on copper spheres. We show that our adaptive control framework can quickly acquire useful object parameters when weakly initialized. Finally, we demonstrate that the spherical-object model can be used as an approximate model for adaptive control of nonspherical objects by performing magnetic manipulation of a variety of objects for which a spherical model is not an obvious approximation.
-
The Cusp Catastrophe Model provides a promising approach for health and behavioral researchers to investigate both continuous and quantum changes in one modeling framework. However, application of the model is hindered by unresolved issues around fitting a statistical model to the data. This paper reports our exploratory work in developing a new approach to statistical cusp catastrophe modeling. In this new approach, the Cusp Catastrophe Model is cast into a statistical nonlinear regression for parameter estimation. The algorithms of the delay convention and Maxwell convention are applied to obtain parameter estimates using maximum likelihood estimation. Through a series of simulation studies, we demonstrate that (a) parameter estimation of this statistical cusp model is unbiased, and (b) use of a bootstrapping procedure enables efficient statistical inference. To test the utility of this new method, we analyze survey data collected for an NIH-funded project providing HIV-prevention education to adolescents in the Bahamas. We found that the results can be more reasonably explained by our approach than by other existing methods. Additional research is needed to establish this new approach as the most reliable method for fitting the cusp catastrophe model. Further research should focus on additional theoretical analysis, extension of the model for analyzing categorical and count data, and additional applications in analyzing different data types.
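To make the two conventions named above concrete, here is a minimal sketch assuming the canonical cusp potential V(y) = y^4/4 - beta*y^2/2 - alpha*y, whose equilibria solve alpha + beta*y - y^3 = 0. The parametrization, the function names, and the nearest-equilibrium shortcut used for the delay convention are assumptions made for illustration; this is not the estimation procedure of the paper.

```python
def cusp_potential(y, alpha, beta):
    """Canonical cusp potential (an assumed parametrization)."""
    return 0.25 * y**4 - 0.5 * beta * y**2 - alpha * y


def cusp_gradient(y, alpha, beta):
    """Negative derivative of the potential; zero at equilibria."""
    return alpha + beta * y - y**3


def cusp_equilibria(alpha, beta, grid=2000):
    """Real roots of alpha + beta*y - y^3 = 0 via grid bracketing and bisection.

    Tangent (double) roots on the bifurcation set itself can be missed.
    """
    bound = 1.0 + max(abs(alpha), abs(beta))  # Cauchy bound on the roots
    step = 2.0 * bound / grid
    roots = []
    lo, f_lo = -bound, cusp_gradient(-bound, alpha, beta)
    for k in range(1, grid + 1):
        hi = -bound + k * step
        f_hi = cusp_gradient(hi, alpha, beta)
        if f_lo == 0.0:
            roots.append(lo)
        elif f_lo * f_hi < 0.0:
            a, b, f_a = lo, hi, f_lo
            for _ in range(60):  # fixed number of bisection steps
                mid = 0.5 * (a + b)
                f_mid = cusp_gradient(mid, alpha, beta)
                if f_a * f_mid <= 0.0:
                    b = mid
                else:
                    a, f_a = mid, f_mid
            roots.append(0.5 * (a + b))
        lo, f_lo = hi, f_hi
    return roots


def stable_equilibria(alpha, beta):
    """Equilibria that are local minima of the potential (V'' > 0)."""
    return [y for y in cusp_equilibria(alpha, beta) if 3.0 * y**2 - beta > 0.0]


def maxwell_prediction(alpha, beta):
    """Maxwell convention: the stable state with the lowest potential."""
    return min(stable_equilibria(alpha, beta),
               key=lambda y: cusp_potential(y, alpha, beta))


def delay_prediction(alpha, beta, previous_y):
    """Delay convention, simplified: stay at the stable state nearest the previous one."""
    return min(stable_equilibria(alpha, beta), key=lambda y: abs(y - previous_y))


# Inside the bimodal region the two conventions can disagree.
print(maxwell_prediction(0.1, 3.0))
print(delay_prediction(0.1, 3.0, previous_y=-1.7))
```

With alpha = 0.1 and beta = 3 there are two stable states; the Maxwell convention selects the deeper, positive one, while the delay convention keeps a system that started near -1.7 on the negative branch.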
-
We consider the best subset selection problem in linear regression, that is, finding a parsimonious subset of the regression variables that provides the best fit to the data according to some predefined criterion. We are primarily concerned with alternatives to cross-validation methods that do not require data partitioning and involve a range of information criteria extensively studied in the statistical literature. We show that the problem of interest can be modeled using fractional mixed-integer optimization, which can be tackled by leveraging recent advances in modern optimization solvers. The proposed algorithms involve solving a sequence of mixed-integer quadratic optimization problems (or their convexifications) and can be implemented with off-the-shelf solvers. We report encouraging results in our computational experiments, with respect to both the optimization and statistical performance. Summary of Contribution: This paper considers feature selection problems with information criteria. We show that by adopting a fractional optimization perspective (a well-known field in nonlinear optimization and operations research), it is possible to leverage recent advances in mixed-integer quadratic optimization technology to tackle traditional statistical problems long considered intractable. We present extensive computational experiments, with both synthetic and real data, illustrating that the new fractional optimization approach is orders of magnitude faster than existing approaches in the literature.
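For readers unfamiliar with the problem statement, the sketch below selects a subset by exhaustively scoring every candidate with the Bayesian information criterion (BIC). This brute-force enumeration is only a small-scale baseline and is not the fractional mixed-integer method described above; the function names and the toy data are hypothetical, and the BIC is written up to an additive constant for Gaussian errors.

```python
import itertools
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def residual_sum_of_squares(X, y, subset):
    """Least-squares RSS using an intercept plus the columns in `subset`."""
    cols = [[1.0] * len(y)] + [[row[j] for row in X] for j in subset]
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    coef = solve_linear_system(gram, rhs)  # normal equations
    fitted = [sum(b * c[i] for b, c in zip(coef, cols)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def best_subset_bic(X, y):
    """Return (bic, subset) minimizing n*log(RSS/n) + (k+1)*log(n)."""
    n, p = len(y), len(X[0])
    best = None
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            rss = residual_sum_of_squares(X, y, subset)
            bic = n * math.log(rss / n) + (k + 1) * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, subset)
    return best


# Toy data: a 2^3 factorial design; y depends on columns 0 and 2 only.
rows = list(itertools.product([-1.0, 1.0], repeat=3))
X = [list(r) for r in rows]
noise = [0.1 * r[0] * r[1] * r[2] for r in rows]  # orthogonal to every column
y = [1.0 + 2.0 * r[0] - 3.0 * r[2] + e for r, e in zip(rows, noise)]

print(best_subset_bic(X, y))
```

The enumeration visits all 2^p subsets, so it is practical only for a handful of variables, which is the intractability that motivates the optimization-based approach.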
