De-biased two-sample U-statistics with application to conditional distribution testing

Chen, Yuchen; Lei, Jing

doi:10.1007/s10994-024-06719-4

Abstract: In some high-dimensional and semiparametric inference problems involving two populations, the parameter of interest can be characterized by two-sample U-statistics involving some nuisance parameters. In this work we first extend the framework of one-step estimation with cross-fitting to two-sample U-statistics, showing that using an orthogonalized influence function can effectively remove the first order bias, resulting in asymptotically normal estimates of the parameter of interest. As an example, we apply this method and theory to the problem of testing two-sample conditional distributions, also known as strong ignorability. When combined with a conformal-based rank-sum test, we discover that the nuisance parameters can be divided into two categories, where in one category the nuisance estimation accuracy does not affect the testing validity, whereas in the other the nuisance estimation accuracy must satisfy the usual requirement for the test to be valid. We believe these findings provide further insights into and enhance the conformal inference toolbox.

Award ID(s): 2310764

PAR ID: 10568373

Author(s) / Creator(s): Chen, Yuchen; Lei, Jing

Publisher / Repository: Springer Science + Business Media

Date Published: 2025-01-27

Journal Name: Machine Learning

Volume: 114

Issue: 2

ISSN: 0885-6125

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1007/s10994-024-06719-4
