NSF PAR Search | NSF Public Access Repository

Estimation of Over-parameterized Models from an Auto-Modeling Perspective

https://doi.org/10.1080/01621459.2025.2455192

Jiang, Yiran; Liu, Chuanhai (January 2025, Journal of the American Statistical Association)

Free, publicly-accessible full text available January 31, 2026
The typicality principle and its implications for statistics and data science

Jiang, Yiran; Zhang, Zeyu; Martin, Ryan; Liu, Chuanhai (January 2025, arXiv.org)

A central focus of data science is the transformation of empirical evidence into knowledge. As such, the key insights and scientific attitudes of deep thinkers like Fisher, Popper, and Tukey are expected to inspire exciting new advances in machine learning and artificial intelligence in years to come. Along these lines, the present paper advances a novel typicality principle which states, roughly, that if the observed data is sufficiently "atypical" in a certain sense relative to a posited theory, then that theory is unwarranted. This emphasis on typicality brings familiar but often overlooked background notions like model-checking to the inferential foreground. One instantiation of the typicality principle is in the context of parameter estimation, where we propose a new typicality-based regularization strategy that leans heavily on goodness-of-fit testing. The effectiveness of this new regularization strategy is illustrated in three non-trivial examples where ordinary maximum likelihood estimation fails miserably. We also demonstrate how the typicality principle fits within a bigger picture of reliable and efficient uncertainty quantification.
Free, publicly-accessible full text available January 24, 2026
Towards Strong AI: Transformational Beliefs and Scientific Creativity

Eschker, Samuel J; Liu, Chuanhai (December 2024, arXiv.org)

Strong artificial intelligence (AI) is envisioned to possess general cognitive abilities and scientific creativity comparable to human intelligence, encompassing both knowledge acquisition and problem-solving. While remarkable progress has been made in weak AI, the realization of strong AI remains a topic of intense debate and critical examination. In this paper, we explore pivotal innovations in the history of astronomy and physics, focusing on the discovery of Neptune and the concept of scientific revolutions as perceived by philosophers of science. Building on these insights, we introduce a simple theoretical and statistical framework of weak beliefs, termed the Transformational Belief (TB) framework, designed as a foundation for modeling scientific creativity. Through selected illustrative examples in statistical science, we demonstrate the TB framework's potential as a promising foundation for understanding, analyzing, and even fostering creativity -- paving the way toward the development of strong AI. We conclude with reflections on future research directions and potential advancements.
Full Text Available
Finite Sample Valid Inference via Calibrated Bootstrap

Jiang, Yiran; Liu, Chuanhai; Zhang, Heping (August 2024, arXiv.org)

While widely used as a general method for uncertainty quantification, the bootstrap method encounters difficulties that raise concerns about its validity in practical applications. This paper introduces a new resampling-based method, termed calibrated bootstrap, designed to generate finite-sample valid parametric inference from a sample of size n. The central idea is to calibrate an m-out-of-n resampling scheme, where the calibration parameter m is determined against inferential pivotal quantities derived from the cumulative distribution functions of loss functions in parameter estimation. The method comprises two algorithms. The first, named resampling approximation (RA), employs a stochastic approximation algorithm to find the value of the calibration parameter m = m_α for a given α in a manner that ensures the resulting m-out-of-n bootstrapped 1−α confidence set is valid. The second algorithm, termed distributional resampling (DR), is developed to further select samples of bootstrapped estimates from the RA step when constructing 1−α confidence sets for a range of α values is of interest. The proposed method is illustrated and compared to existing methods using linear regression with and without L1 penalty, within the context of a high-dimensional setting and a real-world data application. The paper concludes with remarks on a few open problems worthy of consideration.
Full Text Available
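
Illustrative sketch (not from the paper): the calibrated bootstrap abstract above builds on m-out-of-n resampling, where each bootstrap resample has size m rather than the full sample size n. The Python snippet below shows only that basic building block, a percentile interval for a sample mean with m chosen by hand. It does not implement the RA or DR algorithms, and the function name m_out_of_n_interval, its parameters, and the choice of the mean as the estimator are assumptions made for illustration.

```python
import numpy as np


def m_out_of_n_interval(data, m, alpha=0.05, n_boot=2000, seed=0):
    """Percentile interval for the mean from m-out-of-n resampling.

    Hypothetical helper for illustration only: m is supplied by the caller,
    whereas the paper calibrates m for a given alpha.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.size
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    # Draw n_boot resamples of size m (with replacement) as index arrays.
    idx = rng.integers(0, n, size=(n_boot, m))
    # One bootstrap estimate (here, the mean) per resample.
    estimates = data[idx].mean(axis=1)
    # Equal-tailed 1 - alpha percentile interval of the bootstrap estimates.
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return lower, upper


if __name__ == "__main__":
    sample = np.random.default_rng(1).normal(size=200)
    print("m = n :", m_out_of_n_interval(sample, m=200))
    print("m < n :", m_out_of_n_interval(sample, m=50))
```

In this sketch a smaller m generally produces a wider interval, which suggests why m can serve as a tuning knob for coverage; how m is actually calibrated is specified in the paper, not here.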
