NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Efficient and Private Marginal Reconstruction with Local Non-Negativity

Mullins, Brett; Fuentes, Miguel; Xiao, Yingtai; Kifer, Daniel; Musco, Cameron; Sheldon, Daniel (December 2025, 38th Conference on Neural Information Processing Systems (NeurIPS 2024).)

Differential privacy is the dominant standard for formal and quantifiable privacy and has been used in major deployments that impact millions of people. Many differentially private algorithms for query release and synthetic data contain steps that reconstruct answers to queries from answers to other queries that have been measured privately. Reconstruction is an important subproblem for such mecha- nisms to economize the privacy budget, minimize error on reconstructed answers, and allow for scalability to high-dimensional datasets. In this paper, we introduce a principled and efficient postprocessing method ReM (Residuals-to-Marginals) for reconstructing answers to marginal queries. Our method builds on recent work on efficient mechanisms for marginal query release, based on making measurements using a residual query basis that admits efficient pseudoinversion, which is an important primitive used in reconstruction. An extension GReM-LNN (Gaussian Residuals-to-Marginals with Local Non-negativity) reconstructs marginals under Gaussian noise satisfying consistency and non-negativity, which often reduces error on reconstructed answers. We demonstrate the utility of ReM and GReM-LNN by applying them to improve existing private query answering mechanisms.
more » « less
Free, publicly-accessible full text available December 15, 2026
Efficient and Private Marginal Reconstruction with Local Non-Negativity

Mullins, Brett; Fuentes, Miguel; Xiao, Yingtai; Kifer, Daniel; Musco, Cameron N; Sheldon, Daniel (December 2024, NeurIPS)

Free, publicly-accessible full text available December 13, 2025
PatchRefineNet: Improving Binary Segmentation by Incorporating Signals from Optimal Patch-wise Binarization

https://doi.org/10.1109/WACV57701.2024.00139

Nagendra, Savinay; Kifer, Daniel (January 2024, IEEE)

Full Text Available
An Optimal and Scalable Matrix Mechanism for Noisy Marginals under Convex Loss Functions

Xiao, Yingtai; He, Guanlin; Zhang, Danfeng; Kifer, Daniel (December 2023, NeuRIPS)

Full Text Available
Answering Private Linear Queries Adaptively Using the Common Mechanism

https://doi.org/10.14778/3594512.3594519

Xiao, Yingtai; Wang, Guanhong; Zhang, Danfeng; Kifer, Daniel (April 2023, Proceedings of the VLDB Endowment)

When analyzing confidential data through a privacy filter, a data scientist often needs to decide which queries will best support their intended analysis. For example, an analyst may wish to study noisy two-way marginals in a dataset produced by a mechanism M 1 . But, if the data are relatively sparse, the analyst may choose to examine noisy one-way marginals, produced by a mechanism M 2 , instead. Since the choice of whether to use M 1 or M 2 is data-dependent, a typical differentially private workflow is to first split the privacy loss budget ρ into two parts: ρ 1 and ρ 2 , then use the first part ρ 1 to determine which mechanism to use, and the remainder ρ 2 to obtain noisy answers from the chosen mechanism. In a sense, the first step seems wasteful because it takes away part of the privacy loss budget that could have been used to make the query answers more accurate. In this paper, we consider the question of whether the choice between M 1 and M 2 can be performed without wasting any privacy loss budget. For linear queries, we propose a method for decomposing M 1 and M 2 into three parts: (1) a mechanism M * that captures their shared information, (2) a mechanism M′1 that captures information that is specific to M 1 , (3) a mechanism M′2 that captures information that is specific to M 2 . Running M * and M′ 1 together is completely equivalent to running M 1 (both in terms of query answer accuracy and total privacy cost ρ ). Similarly, running M * and M′ 2 together is completely equivalent to running M 2 . Since M * will be used no matter what, the analyst can use its output to decide whether to subsequently run M ′ 1 (thus recreating the analysis supported by M 1 )or M′ 2 (recreating the analysis supported by M 2 ), without wasting privacy loss budget.
more » « less
Full Text Available
Free gap estimates from the exponential mechanism, sparse vector, noisy max and related algorithms

https://doi.org/10.1007/s00778-022-00728-2

Ding, Zeyu; Wang, Yuxin; Xiao, Yingtai; Wang, Guanhong; Zhang, Danfeng; Kifer, Daniel (February 2022, The VLDB Journal)

Full Text Available
DPGen: Automated Program Synthesis for Differential Privacy

https://doi.org/10.1145/3460120.3484781

Wang, Yuxin; Ding, Zeyu; Xiao, Yingtai; Kifer, Daniel; Zhang, Danfeng (November 2021, Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security)

Full Text Available
Optimizing fitness-for-use of differentially private linear queries

https://doi.org/10.14778/3467861.3467864

Xiao, Yingtai; Ding, Zeyu; Wang, Yuxin; Zhang, Danfeng; Kifer, Daniel (June 2021, Proceedings of the VLDB Endowment)
null (Ed.)
In practice, differentially private data releases are designed to support a variety of applications. A data release is fit for use if it meets target accuracy requirements for each application. In this paper, we consider the problem of answering linear queries under differential privacy subject to per-query accuracy constraints. Existing practical frameworks like the matrix mechanism do not provide such fine-grained control (they optimize total error, which allows some query answers to be more accurate than necessary, at the expense of other queries that become no longer useful). Thus, we design a fitness-for-use strategy that adds privacy-preserving Gaussian noise to query answers. The covariance structure of the noise is optimized to meet the fine-grained accuracy requirements while minimizing the cost to privacy.
more » « less
Full Text Available
Scaling up Differentially Private Deep Learning with Fast Per-Example Gradient Clipping

https://doi.org/10.2478/popets-2021-0008

Lee, Jaewoo; Kifer, Daniel (January 2021, Proceedings on Privacy Enhancing Technologies)
null (Ed.)
Abstract Recent work on Renyi Differential Privacy has shown the feasibility of applying differential privacy to deep learning tasks. Despite their promise, however, differentially private deep networks often lag far behind their non-private counterparts in accuracy, showing the need for more research in model architectures, optimizers, etc. One of the barriers to this expanded research is the training time — often orders of magnitude larger than training non-private networks. The reason for this slowdown is a crucial privacy-related step called “per-example gradient clipping” whose naive implementation undoes the benefits of batch training with GPUs. By analyzing the back-propagation equations we derive new methods for per-example gradient clipping that are compatible with auto-differeniation (e.g., in Py-Torch and TensorFlow) and provide better GPU utilization. Our implementation in PyTorch showed significant training speed-ups (by factors of 54x - 94x for training various models with batch sizes of 128). These techniques work for a variety of architectural choices including convolutional layers, recurrent networks, attention, residual blocks, etc.
more » « less
Full Text Available
DPGen: Automated Program Synthesis for Differential Privacy

Wang, Yuxin; Ding, Zeyu; Xiao, Yingtai; Kifer, Daniel; Zhang, Danfeng (January 2021, The ACM Conference on Computer and Communications Security (CCS))
null (Ed.)
Differential privacy has become a de facto standard for releasing data in a privacy-preserving way. Creating a differentially private algorithm is a process that often starts with a noise-free (nonprivate) algorithm. The designer then decides where to add noise, and how much of it to add. This can be a non-trivial process – if not done carefully, the algorithm might either violate differential privacy or have low utility. In this paper, we present DPGen, a program synthesizer that takes in non-private code (without any noise) and automatically synthesizes its differentially private version (with carefully calibrated noise). Under the hood, DPGen uses novel algorithms to automatically generate a sketch program with candidate locations for noise, and then optimize privacy proof and noise scales simultaneously on the sketch program. Moreover, DPGen can synthesize sophisticated mechanisms that adaptively process queries until a specified privacy budget is exhausted. When evaluated on standard benchmarks, DPGen is able to generate differentially private mechanisms that optimize simple utility functions within 120 seconds. It is also powerful enough to synthesize adaptive privacy mechanisms.
more » « less
Full Text Available

« Prev Next »

Search for: All records