NSF PAR Search | NSF Public Access Repository

Better Private Distribution Testing by Leveraging Unverified Auxiliary Data

Aliakbarpour, Maryam; Burudgunte, Arnav; Canonne, Clement; Rubinfeld, Ronitt (June 2025, 38th Annual Conference on Learning Theory (COLT 2025))

We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks on sensitive data, but is able to leverage prior knowledge (public, but possibly erroneous or untrusted) about the data distribution. We design private algorithms in this augmented setting for three flagship distribution testing tasks, uniformity, identity, and closeness testing, whose sample complexity smoothly scales with the claimed quality of the auxiliary information. We complement our algorithms with information-theoretic lower bounds, showing that their sample complexity is optimal (up to logarithmic factors).
Keywords: distribution testing, identity testing, closeness testing, differential privacy, learning-augmented algorithms
Free, publicly-accessible full text available June 30, 2026
Better Private Distribution Testing by Leveraging Unverified Auxiliary Data

Aliakbarpour, Maryam; Burudgunte, Arnav; Canonne, Clement; Rubinfeld, Ronitt (June 2025, Theory and Practice of Differential Privacy (TPDP 2025))

We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks on sensitive data, but is able to leverage prior knowledge (public, but possibly erroneous or untrusted) about the data distribution. We design private algorithms in this augmented setting for three flagship distribution testing tasks, uniformity, identity, and closeness testing, whose sample complexity smoothly scales with the claimed quality of the auxiliary information. We complement our algorithms with information-theoretic lower bounds, showing that their sample complexity is optimal (up to logarithmic factors).
Free, publicly-accessible full text available June 2, 2026
Privacy in Metalearning and Multitask Learning: Modeling and Separations

Aliakbarpour, Maryam; Bairaktari, Konstantina; Smith, Adam; Swanberg, Marika; Ullman, Jonathan (May 2025, Proceedings of Machine Learning Research)

Free, publicly-accessible full text available May 1, 2026
Optimal Hypothesis Selection in (Almost) Linear Time

Aliakbarpour, Maryam; Bun, Mark; Smith, Adam (December 2024, NeurIPS 2024)

Free, publicly-accessible full text available December 9, 2025
Optimal Algorithms for Augmented Testing of Discrete Distributions

Aliakbarpour, Maryam; Indyk, Piotr; Rubinfeld, Ronitt; Silwal, Sandeep (December 2024, Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems (NeurIPS 2024))

We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution p, extensive research has established optimal bounds for uniformity testing, identity testing (goodness of fit), and closeness testing (equivalence or two-sample testing). We explore these problems in a setting where a predicted data distribution, possibly derived from historical data or predictive machine learning models, is available. We demonstrate that such a predictor can indeed reduce the number of samples required for all three property testing tasks. The reduction in sample complexity depends directly on the predictor’s quality, measured by its total variation distance from p. A key advantage of our algorithms is their adaptability to the precision of the prediction. Specifically, our algorithms can self-adjust their sample complexity based on the accuracy of the available prediction, operating without any prior knowledge of the estimation’s accuracy (i.e. they are consistent). Additionally, we never use more samples than the standard approaches require, even if the predictions provide no meaningful information (i.e. they are also robust). We provide lower bounds to indicate that the improvements in sample complexity achieved by our algorithms are information-theoretically optimal. Furthermore, experimental results show that the performance of our algorithms on real data significantly exceeds our worst-case guarantees for sample complexity, demonstrating the practicality of our approach.
Free, publicly-accessible full text available December 10, 2025
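
The abstract above measures a predictor's quality by its total variation distance from the true distribution p. As a minimal sketch of that quantity only (not the paper's testing algorithms), the function below computes the total variation distance between two distributions over the same finite domain. The function name `total_variation` and the list-of-probabilities representation are assumptions made for illustration.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Both arguments are sequences of probabilities over the same finite
    domain (an illustrative representation, not taken from the paper).
    The distance is half the L1 distance between the two vectors.
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same domain size")
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


# Example: a uniform distribution on two elements versus a point mass.
uniform = [0.5, 0.5]
point_mass = [1.0, 0.0]
print(total_variation(uniform, point_mass))
```

A distance of 0 means the prediction matches p exactly, and a distance of 1 means the two distributions have disjoint supports. The abstract states that the sample savings depend on this distance.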
Metalearning with Very Few Samples Per Task

Aliakbarpour, Maryam; Bairaktari, Konstantina; Brown, Gavin; Smith, Adam; Srebro, Nathan; Ullman, Jonathan (June 2024, Proceedings of Machine Learning Research)

Full Text Available
Metalearning with Very Few Samples Per Task

Aliakbarpour, Maryam; Bairaktari, Konstantina; Brown, Gavin; Smith, Adam; Srebro, Nathan; Ullman, Jonathan (June 2024, Conference on Learning Theory)

Metalearning with Very Few Samples Per Task

Aliakbarpour, Maryam; Bairaktari, Konstantina; Brown, Gavin; Smith, Adam; Srebro, Nati; Ullman, Jonathan (June 2024, Proceedings of Machine Learning Research)

Full Text Available
Hypothesis Selection with Memory Constraints

Aliakbarpour, Maryam; Bun, Mark; Smith, Adam (February 2024, Neural Information Processing Systems)

Full Text Available
Differentially Private Medians and Interior Points for Non-Pathological Data

Aliakbarpour, Maryam; Silver, Rose; Stenke, Thomas; Ullman, Jonathan (February 2024, Innovations in Theoretical Computer Science)
