Award ID contains: 2133169

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Data sets and statistics about groups of individuals are increasingly collected and released, feeding many optimization and learning algorithms. In many cases, the released data contain sensitive information whose privacy is strictly regulated. For example, in the U.S., census data are regulated under Title 13, which requires that no individual be identified from any data released by the Census Bureau. In Europe, data release is regulated according to the General Data Protection Regulation, which addresses the control and transfer of personal data. Differential privacy has emerged as the de-facto standard to protect data privacy. In a nutshell, differentially private algorithms protect an individual's data by injecting random noise into the output of a computation that involves such data. While this process ensures privacy, it also affects the quality of data analysis: when private data sets are used as inputs to complex machine learning or optimization tasks, they may produce results that are fundamentally different from those obtained on the original data and may even raise unintended bias and fairness concerns. In this talk, I will first focus on the challenge of releasing privacy-preserving data sets for complex data analysis tasks. I will introduce the notion of Constrained-based Differential Privacy (C-DP), which allows casting the data release problem as an optimization problem whose goal is to preserve the salient features of the original data. I will review several applications of C-DP in the context of very large hierarchical census data, data streams, energy systems, and the design of federated data-sharing protocols. Next, I will discuss how errors induced by differential privacy algorithms may propagate within a decision problem, causing biases and fairness issues. This is particularly important because privacy-preserving data is often used for critical decision processes, including the allocation of funds and benefits to states and jurisdictions, which ideally should be fair and unbiased. Finally, I will conclude with a roadmap to future work and some open questions. (A minimal sketch of the noise-injection mechanism appears after this list.)
  2. Fairness and robustness are two important notions of learning models. Fairness ensures that models do not disproportionately harm (or benefit) some groups over others, while robustness measures a model's resilience against small input perturbations. Although both properties are important, this paper illustrates a dichotomy between fairness and robustness and analyzes when striving for fairness decreases a model's robustness to adversarial samples. The reported analysis sheds light on the factors causing such contrasting behavior, identifying the distance to the decision boundary across groups as a key factor. Experiments on non-linear models and different architectures validate the theoretical findings. In addition to the theoretical analysis, the paper also proposes a simple, yet effective, solution to construct models achieving good tradeoffs between fairness and robustness. (A sketch of the per-group boundary-distance comparison appears after this list.)
    Free, publicly-accessible full text available August 1, 2025
  3. Free, publicly-accessible full text available May 29, 2025
  4. This paper analyzes the privacy of traditional Statistical Disclosure Control (SDC) systems under a differential privacy interpretation. SDC methods, such as cell suppression and swapping, promise to safeguard the confidentiality of data and are routinely adopted in data analyses with profound societal and economic impacts. Through a formal analysis and an empirical evaluation of demographic data from real households in the U.S., the paper shows that widely adopted SDC systems not only induce vastly larger privacy losses than classical differential privacy mechanisms, but may also come at the cost of larger accuracy and fairness losses. (A toy illustration of cell suppression appears after this list.)
    Free, publicly-accessible full text available March 25, 2025
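
The sketches below give minimal, self-contained illustrations of some of the mechanisms described in the abstracts above. They are hedged examples under stated assumptions, not code from the cited works.

Sketch for item 1: the noise-injection idea behind differential privacy, using the standard Laplace mechanism on a counting query. The dataset, function names, and parameter values are hypothetical.

    import numpy as np

    def laplace_mechanism(true_value, sensitivity, epsilon):
        """Return an epsilon-differentially private estimate of true_value.

        Noise is drawn from a Laplace distribution with scale sensitivity/epsilon,
        the standard calibration for a query with the given L1 sensitivity.
        """
        return true_value + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

    # Example: privately release the number of households in a census block.
    # A counting query changes by at most 1 when one record is added or removed,
    # so its L1 sensitivity is 1.
    true_count = 1523  # hypothetical true count
    private_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.1)
    print(f"true count: {true_count}, private release: {private_count:.1f}")

Smaller values of epsilon inject more noise; that added error is exactly what, as the abstract notes, can propagate into downstream decisions.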
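Sketch for item 2: comparing the average distance to the decision boundary across groups for a linear classifier, the quantity the paper's analysis points to as a key factor in the fairness-robustness tradeoff. The synthetic data, group labels, and margins are illustrative assumptions, not the paper's experiments.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Two hypothetical groups: group A lies farther from the true boundary than group B.
    X_a = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(200, 2))
    X_b = rng.normal(loc=[0.5, 0.5], scale=1.0, size=(200, 2))
    X = np.vstack([X_a, X_b, -X_a, -X_b])
    y = np.array([1] * 400 + [0] * 400)
    group = np.array(["A"] * 200 + ["B"] * 200 + ["A"] * 200 + ["B"] * 200)

    clf = LogisticRegression().fit(X, y)
    w, b = clf.coef_[0], clf.intercept_[0]

    # Geometric distance of each point to the learned hyperplane w.x + b = 0.
    dist = np.abs(X @ w + b) / np.linalg.norm(w)

    for g in ("A", "B"):
        print(f"group {g}: mean distance to boundary = {dist[group == g].mean():.2f}")

    # A smaller mean distance means smaller adversarial perturbations suffice to flip
    # predictions for that group, i.e., lower robustness for that group.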
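Sketch for item 4: a toy version of the cell-suppression SDC method named in the abstract, which withholds tabulation cells whose counts fall below a threshold before release. The table, threshold, and labels are hypothetical, and real systems also apply secondary suppression and swapping, which this sketch omits.

    import numpy as np

    # Hypothetical counts of households by income bracket (rows) and region (columns).
    table = np.array([
        [120,   3,  45],
        [  2,  88,  61],
        [ 57,   4, 210],
    ])

    THRESHOLD = 5  # cells with fewer than 5 respondents are treated as disclosive

    # Primary suppression: mask small cells before releasing the table.
    released = table.astype(object)
    released[table < THRESHOLD] = "D"  # "D" marks a suppressed cell
    print(released)

Per the abstract, such traditional protections can incur larger privacy losses than calibrated noise mechanisms while also distorting the released statistics.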