NSF PAR Search | NSF Public Access Repository

Finite Population Regression Adjustment and Non-asymptotic Guarantees for Treatment Effect Estimation

Ghadiri, Mehrdad; Arbour, David; Mai, Tung; Musco, Cameron; Rao, Anup B (December 2023, Conference on Neural Information Processing Systems (NeurIPS) 2023)

Full Text Available
Optimal Sketching Bounds for Sparse Linear Regression

Mai, Tung; Munteanu, Alexander; Musco, Cameron; Rao, Anup B.; Schwiegelshohn, Chris; Woodruff, David P. (January 2023, International Conference on Artificial Intelligence and Statistics (AISTATS))

Full Text Available
Coresets for Classification - Simplified and Strengthened

Mai, Tung; Musco, Cameron; Rao, Anup (December 2021, Advances in neural information processing systems)

We give relative error coresets for training linear classifiers with a broad class of loss functions, including the logistic loss and hinge loss. Our construction achieves $$(1\pm \epsilon)$$ relative error with $$\tilde O(d \cdot \mu_y(X)^2/\epsilon^2)$$ points, where $$\mu_y(X)$$ is a natural complexity measure of the data matrix $$X \in \mathbb{R}^{n \times d}$$ and label vector $$y \in \{-1,1\}^n$$, introduced by Munteanu et al. (2018). Our result is based on subsampling data points with probabilities proportional to their $$\ell_1$$ Lewis weights. It significantly improves on existing theoretical bounds and performs well in practice, outperforming uniform subsampling along with other importance sampling methods. Our sampling distribution does not depend on the labels, so it can be used for active learning. It also does not depend on the specific loss function, so a single coreset can be used in multiple training scenarios.
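As a rough illustration of the sampling step described in this abstract, the sketch below approximates $$\ell_1$$ Lewis weights with the standard fixed-point iteration and then draws a weighted subsample. It is not the authors' code: the function names, the iteration count, and the sample size `m` are assumptions made for this example, and the paper's sample-size bound in terms of $$\mu_y(X)$$ is not implemented.

```python
import numpy as np

def l1_lewis_weights(X, n_iter=30):
    """Approximate l1 Lewis weights with the fixed-point iteration
    w_i <- sqrt(x_i^T (X^T W^{-1} X)^{-1} x_i), where W = diag(w).
    Assumes X has full column rank and no all-zero rows."""
    n, d = X.shape
    w = np.full(n, d / n)                      # uniform start
    for _ in range(n_iter):
        M = X.T @ (X / w[:, None])             # X^T W^{-1} X
        M_inv = np.linalg.inv(M)
        quad = np.einsum("ij,jk,ik->i", X, M_inv, X)  # x_i^T M^{-1} x_i
        w = np.sqrt(np.maximum(quad, 1e-30))
    return w

def sample_coreset(X, m, rng):
    """Draw m row indices with probability proportional to the Lewis
    weights. The labels are never used, matching the abstract's remark
    that the sampling distribution is label-independent."""
    w = l1_lewis_weights(X)
    p = w / w.sum()
    idx = rng.choice(X.shape[0], size=m, replace=True, p=p)
    scale = 1.0 / (m * p[idx])                 # importance weights for the sampled losses
    return idx, scale

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))              # synthetic data for illustration only
w = l1_lewis_weights(X)
idx, scale = sample_coreset(X, 50, rng)
```

A classifier would then be trained on the rows `X[idx]` (and the corresponding labels) with each loss term multiplied by `scale`, so that the weighted subsampled loss is an unbiased estimate of the full loss.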
Full Text Available
Sample Constrained Treatment Effect Estimation

Addanki, Raghavendra; Arbour, David; Mai, Tung; Musco, Cameron; Rao, Anup B. (January 2022, Conference on Neural Information Processing Systems (NeurIPS))

Full Text Available
Fundamental Tradeoffs in Distributionally Adversarial Training

Mehrabi, Mohammad; Javanmard, Adel; Rossi, Ryan A.; Rao, Anup; Mai, Tung (January 2021, International Conference on Machine Learning, PMLR)

Full Text Available
Machine Unlearning via Algorithmic Stability

Ullah, Enayat; Mai, Tung; Rao, Anup; Rossi, Ryan A.; Arora, Raman (January 2021, Proceedings of Machine Learning Research)

We study the problem of machine unlearning and identify a notion of algorithmic stability, Total Variation (TV) stability, which, we argue, is suitable for the goal of exact unlearning. For convex risk minimization problems, we design TV-stable algorithms based on noisy Stochastic Gradient Descent (SGD). Our key contribution is the design of corresponding efficient unlearning algorithms, which are based on constructing a near-maximal coupling of Markov chains for the noisy SGD procedure. To understand the trade-offs between accuracy and unlearning efficiency, we give upper and lower bounds on the excess empirical and population risk of TV-stable algorithms for convex risk minimization. Our techniques generalize to arbitrary non-convex functions, and our algorithms are differentially private as well.
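For orientation, the sketch below shows a generic noisy SGD loop of the kind this abstract builds on: mini-batch gradient steps with Gaussian noise added to each gradient. It is only an assumption-laden illustration. The function name, step size, noise scale `sigma`, and the toy quadratic loss are hypothetical; the noise is not calibrated for TV stability, and the paper's coupling-based unlearning procedure is not shown.

```python
import numpy as np

def noisy_sgd(grad_fn, data, w0, lr=0.1, sigma=0.1, epochs=5, batch_size=10, seed=0):
    """Mini-batch SGD with isotropic Gaussian noise added to each gradient.
    grad_fn(w, z) returns the gradient of the per-example loss at w."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    n = len(data)
    for _ in range(epochs):
        perm = rng.permutation(n)              # fresh shuffle every epoch
        for start in range(0, n, batch_size):
            batch = data[perm[start:start + batch_size]]
            g = np.mean([grad_fn(w, z) for z in batch], axis=0)
            noise = sigma * rng.standard_normal(w.shape)
            w = w - lr * (g + noise)
    return w

# Toy convex problem: minimize the average of 0.5 * ||w - z||^2,
# whose minimizer is the mean of the data.
data = np.random.default_rng(1).standard_normal((100, 2)) + 3.0
grad = lambda w, z: w - z
w_noisy = noisy_sgd(grad, data, w0=np.zeros(2))
w_clean = noisy_sgd(grad, data, w0=np.zeros(2), sigma=0.0)
```

Because all randomness comes from the seeded generator, rerunning with the same seed reproduces the same trajectory.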
Full Text Available
Graph Neural Networks with Heterophily

Zhu, Jiong; Rossi, Ryan A.; Rao, Anup B.; Mai, Tung; Lipka, Nedim; Ahmed, Nesreen K.; Koutra, Danai (February 2021, Proceedings of the AAAI Conference on Artificial Intelligence)
Full Text Available
