

Award ID contains: 1939704


  1. Abstract

    The conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables, a prominent example being instrumental variable regression. We introduce a very general class of estimators called the variational method of moments (VMM), motivated by a variational minimax reformulation of optimally weighted generalized method of moments for finite sets of moments. VMM controls infinitely many moments characterized by flexible function classes such as neural nets and kernel methods, while provably maintaining statistical efficiency, unlike existing related minimax estimators. We also develop inference algorithms and demonstrate the empirical strengths of VMM estimation and inference in experiments. (An illustrative sketch of the underlying variational identity appears after this list.)

     
  2. We study the efficient off-policy evaluation of natural stochastic policies, which are defined in terms of deviations from the unknown behaviour policy. This is a departure from the literature on off-policy evaluation, which largely considers the evaluation of explicitly specified policies. Crucially, offline reinforcement learning with natural stochastic policies can help alleviate issues of weak overlap, lead to policies that build upon current practice, and improve policies' implementability in practice. Compared with the classic case of a prespecified evaluation policy, when evaluating natural stochastic policies the efficiency bound, which measures the best-achievable estimation error, is inflated since the evaluation policy itself is unknown. In this paper we derive the efficiency bounds of two major types of natural stochastic policies: tilting policies and modified treatment policies. We then propose efficient nonparametric estimators that attain the efficiency bounds under lax conditions and enjoy a partial double robustness property. (An illustrative importance-weighting sketch for tilting policies appears after this list.)
  3. In applications of offline reinforcement learning to observational data, such as in healthcare or education, a general concern is that observed actions might be affected by unobserved factors, inducing confounding and biasing estimates derived assuming a perfect Markov decision process (MDP) model. In “Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes,” A. Bennett and N. Kallus tackle this by considering off-policy evaluation in a partially observed MDP (POMDP). Specifically, they consider estimating the value of a given target policy in an unknown POMDP, given observations of trajectories generated by a different and unknown policy, which may depend on the unobserved states. They consider both when the target policy value can be identified from the observed data and, given identification, how best to estimate it. Both these problems are addressed by extending the framework of proximal causal inference to POMDP settings, using sequences of so-called bridge functions. This results in a novel framework for off-policy evaluation in POMDPs that they term proximal reinforcement learning, which they validate in various empirical settings. (A sketch of the static bridge-function equation that this framework builds on appears after this list.)
  4. Because the average treatment effect (ATE) measures the change in social welfare, even a positive ATE carries a risk of negative effects on, say, some 10% of the population. Assessing such risk is difficult, however, because any one individual treatment effect (ITE) is never observed, so the 10% worst-affected cannot be identified, whereas distributional treatment effects only compare the first deciles within each treatment group, which does not correspond to any 10% subpopulation. In this paper, we consider how to nonetheless assess this important risk measure, formalized as the conditional value at risk (CVaR) of the ITE distribution. We leverage the availability of pretreatment covariates and characterize the tightest possible upper and lower bounds on ITE-CVaR given by the covariate-conditional average treatment effect (CATE) function. We then proceed to study how to estimate these bounds efficiently from data and construct confidence intervals. This is challenging even in randomized experiments, as it requires understanding the distribution of the unknown CATE function, which can be very complex if we use rich covariates to best control for heterogeneity. We develop a debiasing method that overcomes this and prove it enjoys favorable statistical properties even when CATE and other nuisances are estimated by black-box machine learning or even inconsistently. Studying a hypothetical change to French job search counseling services, our bounds and inference demonstrate that a small social benefit entails a negative impact on a substantial subpopulation. This paper was accepted by J. George Shanthikumar, data science. Funding: This work was supported by the Division of Information and Intelligent Systems [Grant 1939704]. Supplemental Material: The data files and online appendices are available at https://doi.org/10.1287/mnsc.2023.4819. (A sketch of a plug-in CVaR computation appears after this list.)
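For the first abstract (VMM): a minimal sketch of the conditional moment restriction and of the finite-dimensional variational identity behind optimally weighted GMM that the abstract alludes to. The notation (outcome Y, endogenous regressor T, instrument Z, moment covariance Ω) is assumed here for illustration and is not taken from the paper; the actual VMM objective differs in detail.

```latex
% Conditional moment restriction for instrumental variable regression,
% with structural function g_theta (illustrative notation):
\[
  \mathbb{E}\bigl[\, Y - g_\theta(T) \;\big|\; Z \,\bigr] = 0 .
\]
% For a finite vector of unconditional moments
% g_n(\theta) = \mathbb{E}_n\!\left[ f(Z)\,\bigl(Y - g_\theta(T)\bigr) \right],
% the optimally weighted GMM criterion admits a variational (minimax-style)
% form, obtained by completing the square:
\[
  g_n(\theta)^\top \Omega^{-1} g_n(\theta)
  \;=\; \max_{h \in \mathbb{R}^m} \Bigl\{\, 2\, h^\top g_n(\theta) \;-\; h^\top \Omega\, h \,\Bigr\},
\]
% where \Omega is the covariance matrix of the moments. Replacing the vector h
% with a flexible function class (kernels, neural networks) is the kind of
% reformulation the abstract describes; see the paper for the exact objective.
```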
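For the second abstract: a hedged Python sketch of a naive inverse-propensity (importance-weighting) estimate of the value of a tilting policy, assuming binary actions and a multiplicative exponential tilt of an estimated behaviour policy. The tilt form and all names are assumptions for illustration; this plug-in is not the paper's efficient, partially doubly robust estimator.

```python
import numpy as np

def tilted_ips_value(y, a, p_b1, tilt0, tilt1):
    """Naive inverse-propensity estimate of the value of a tilting policy.

    Assumes binary actions and an evaluation policy defined as a
    multiplicative tilt of the (estimated) behaviour policy:
        pi_e(a | x)  proportional to  pi_b(a | x) * exp(tau(x, a)).
    Per-observation arrays: outcomes y, actions a in {0, 1}, estimated
    behaviour probability p_b1 = pi_b(1 | x), and tilts tilt0 = tau(x, 0),
    tilt1 = tau(x, 1). Illustrative only; NOT the paper's efficient,
    partially doubly robust estimator.
    """
    p_b = np.where(a == 1, p_b1, 1.0 - p_b1)                  # pi_b(A | X)
    z = (1.0 - p_b1) * np.exp(tilt0) + p_b1 * np.exp(tilt1)   # normalising constant
    p_e = p_b * np.exp(np.where(a == 1, tilt1, tilt0)) / z    # pi_e(A | X)
    return np.mean((p_e / p_b) * y)                           # importance-weighted value
```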
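For the third abstract: the single-decision bridge-function equation from static proximal causal inference, which the paper's sequences of bridge functions generalise to the POMDP setting. The notation (treatment A, outcome Y, proxies Z and W of the unobserved confounder) is mine, and identification also requires completeness-type conditions stated in that literature; see the paper for the sequential version.

```latex
% Static proximal causal inference: an outcome bridge function h solves
\[
  \mathbb{E}\bigl[\, Y \mid A, Z \,\bigr]
  \;=\; \mathbb{E}\bigl[\, h(W, A) \mid A, Z \,\bigr],
\]
% and, under suitable completeness conditions, the counterfactual mean under
% a fixed treatment a is identified as
\[
  \mathbb{E}\bigl[\, Y(a) \,\bigr] \;=\; \mathbb{E}\bigl[\, h(W, a) \,\bigr].
\]
```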
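For the fourth abstract: a short Python sketch of the empirical left-tail CVaR of estimated CATE values, i.e. the average effect over the worst-affected alpha-fraction, which is the kind of quantity the bounds are built from. This naive plug-in handles mass at the quantile crudely and ignores the debiasing the paper develops; it is illustrative only.

```python
import numpy as np

def left_tail_cvar(cate_estimates, alpha=0.10):
    """Empirical CVaR at level alpha of estimated CATE values.

    Averages the lowest (worst-affected) alpha-fraction of the estimated
    covariate-conditional treatment effects. Illustrative plug-in only:
    NOT the paper's debiased estimator, which remains valid when the CATE
    is fit by black-box machine learning.
    """
    tau = np.sort(np.asarray(cate_estimates, dtype=float))
    k = max(1, int(np.ceil(alpha * tau.size)))  # size of the worst alpha-fraction
    return tau[:k].mean()

# Example: CVaR at the 10% level for simulated CATE values
# rng = np.random.default_rng(0)
# print(left_tail_cvar(rng.normal(0.2, 1.0, size=10_000), alpha=0.10))
```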