NSF PAR Search | NSF Public Access Repository

Multi-Dimensional Domain Generalization with Low-Rank Structures

https://doi.org/10.1080/01621459.2025.2471055

Li, Sai; Zhang, Linjun (April 2025, Journal of the American Statistical Association)

Conventional statistical and machine learning methods typically assume that the test data are identically distributed with the training data. However, this assumption does not always hold, especially in applications where the target population is not well represented in the training data. This is a notable issue in health-related studies, where specific ethnic populations may be underrepresented, posing a significant challenge for researchers aiming to make statistical inferences about these minority groups. In this work, we present a novel approach to addressing this challenge in linear regression models. We organize the model parameters for all the sub-populations into a tensor. By studying a structured tensor completion problem, we can achieve robust domain generalization, that is, learning about sub-populations with limited or no available data. Our method leverages the structure of group labels in a novel way and can produce more reliable and interpretable generalization results. We establish rigorous theoretical guarantees for the proposed method and demonstrate its minimax optimality. To validate the effectiveness of our approach, we conduct extensive numerical experiments and a real data study focused on diabetes prediction for multiple subgroups, comparing our results with those obtained using other existing methods. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
Free, publicly-accessible full text available April 11, 2026
Mitigating Heterogeneous Token Overfitting in LLM Knowledge Editing

Liu, Tianci; Li, Ruirui; Dong, Zihan; Liu, Hui; Tang, Xianfeng; Yin, Qingyu; Zhang, Linjun; Wang, Haoyu; Gao, Jing (July 2025, Proceedings of the International Conference on Machine Learning)

Free, publicly-accessible full text available July 13, 2026
A unified combination framework for dependent tests with applications to microbiome association studies

https://doi.org/10.1093/biomtc/ujaf001

Yu, Xiufan; Zhang, Linjun; Srinivasan, Arun; Xie, Min-ge; Xue, Lingzhou (January 2025, Biometrics)

We introduce a novel meta-analysis framework to combine dependent tests under a general setting, and use it to synthesize various microbiome association tests that are calculated from the same dataset. Our development builds upon the classical meta-analysis methods of aggregating P-values and also a more recent general method of combining confidence distributions, but generalizes them to handle dependent tests. The proposed framework ensures rigorous statistical guarantees, and we provide a comprehensive study comparing it with various existing dependent combination methods. Notably, we demonstrate that the widely used Cauchy combination method for dependent tests, referred to as the vanilla Cauchy combination in this article, can be viewed as a special case within our framework. Moreover, the proposed framework provides a way to address the problem that arises when the distributional assumptions underlying the vanilla Cauchy combination are violated. Our numerical results demonstrate that ignoring the dependence among the to-be-combined components may lead to severe size distortion. Compared to existing P-value combination methods, including the vanilla Cauchy combination method, the proposed combination framework is flexible: it can be adapted to handle the dependence accurately and to use the information efficiently, yielding tests with accurate size and enhanced power. We apply the framework to microbiome association studies, where we aggregate information from multiple existing tests using the same dataset. The combined tests harness the strengths of each individual test across a wide range of alternative spaces, enabling more efficient and meaningful discoveries of vital microbiome associations.
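For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of the standard (vanilla) Cauchy combination of P-values. It is not the framework proposed in the article. The function name `cauchy_combination` and the equal default weights are assumptions made for illustration, and the inputs are assumed to lie strictly between 0 and 1.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine P-values with the standard Cauchy combination statistic.

    Illustrative sketch only: the name and the equal default weights
    are assumptions, not taken from the article.
    """
    n = len(p_values)
    if weights is None:
        # Equal non-negative weights that sum to one.
        weights = [1.0 / n] * n
    # Map each P-value to a standard Cauchy variate and take the weighted sum.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values))
    # Upper-tail probability of a standard Cauchy distribution at `stat`.
    return 0.5 - math.atan(stat) / math.pi


# Example: combine three P-values computed from the same dataset.
combined = cauchy_combination([0.01, 0.20, 0.50])
```

With a single P-value the function returns that value unchanged, which is a quick sanity check on the transformation.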
Discover and Cure: Concept-aware Mitigation of Spurious Correlation

Wu, Shirley; Yuksekgonul, Mert; Zhang, Linjun; Zou, James (July 2023, 2023 International Conference on Machine Learning)

Full Text Available
FaiREE: Fair classification with finite-sample and distribution-free guarantee

Li, Puheng; Zou, James; Zhang, Linjun (April 2023, The International Conference on Learning Representations (ICLR) 2023)

Full Text Available
Freeze then Train: Towards Provable Representation Learning under Spurious Correlations and Feature Noise

Ye, Haotian; Zou, James; Zhang, Linjun (April 2023, Proceedings of The 26th International Conference on Artificial Intelligence and Statistics)

Full Text Available
Estimation and Inference for High-Dimensional Generalized Linear Models with Knowledge Transfer

https://doi.org/10.1080/01621459.2023.2184373

Li, Sai; Zhang, Linjun; Cai, T. Tony; Li, Hongzhe (April 2023, Journal of the American Statistical Association)

Full Text Available
Reinforcement Learning with Stepwise Fairness Constraints

Deng, Zhun; Sun, He; Wu, Steven; Zhang, Linjun; Parkes, David (April 2023, Proceedings of The 26th International Conference on Artificial Intelligence and Statistics)

Full Text Available
