Creators/Authors contains: "Jha, Somesh"

  1. Free, publicly-accessible full text available August 9, 2024
  2. Proper communication is key to the adoption and implementation of differential privacy (DP). In this work, we designed explanatory illustrations of three DP models (Central DP, Local DP, Shuffler DP) to help laypeople conceptualize how random noise is added to protect individuals’ privacy and preserve group utility. Following a pilot survey and an interview, we conducted an online experiment (N = 300) exploring participants’ comprehension, privacy and utility perception, and data-sharing decisions across the three DP models. We obtained empirical evidence showing participants’ acceptance of the Shuffler DP model for data privacy protection. We discuss the implications of our findings.

     
    Free, publicly-accessible full text available September 1, 2024
  3. Free, publicly-accessible full text available July 1, 2024
  4. There is great demand for scalable, secure, and efficient privacy-preserving machine learning models that can be trained over distributed data. While deep learning models typically achieve the best results in a centralized non-secure setting, different models can excel when privacy and communication constraints are imposed. For example, tree-based approaches such as XGBoost have attracted much attention for their high performance and ease of use; in particular, they often achieve state-of-the-art results on tabular data. Consequently, several recent works have focused on translating Gradient Boosted Decision Tree (GBDT) models like XGBoost into federated settings, via cryptographic mechanisms such as Homomorphic Encryption (HE) and Secure Multi-Party Computation (MPC). However, these do not always provide formal privacy guarantees, or consider the full range of hyperparameters and implementation settings. In this work, we implement the GBDT model under Differential Privacy (DP). We propose a general framework that captures and extends existing approaches for differentially private decision trees. Our framework is tailored to the federated setting, and we show that with a careful choice of techniques it is possible to achieve very high utility while maintaining strong levels of privacy.
  5. Machine learning and logical reasoning have been the two foundational pillars of Artificial Intelligence (AI) since its inception, and yet, until recently the interactions between these two fields have been relatively limited. Despite their individual success and largely independent development, there are new problems on the horizon that seem solvable only via a combination of ideas from these two fields of AI. These problems can be broadly characterized as follows: how can learning be used to make logical reasoning and synthesis/verification engines more efficient and powerful, and in the reverse direction, how can we use reasoning to improve the accuracy, generalizability, and trustworthiness of learning. In this perspective paper, we address the above-mentioned questions with an emphasis on certain paradigmatic trends at the intersection of learning and reasoning. Our intent here is not to be a comprehensive survey of all the ways in which learning and reasoning have been combined in the past. Rather we focus on certain recent paradigms where corrective feedback loops between learning and reasoning seem to play a particularly important role. Specifically, we observe the following three trends: first, the use of learning techniques (especially, reinforcement learning) in sequencing, selecting, and initializing proof rules in solvers/provers; second, combinations of inductive learning and deductive reasoning in the context of program synthesis and verification; and third, the use of solver layers in providing corrective feedback to machine learning models in order to help improve their accuracy, generalizability, and robustness with respect to partial specifications or domain knowledge. We believe that these paradigms are likely to have significant and dramatic impact on AI and its applications for a long time to come.
  6. This paper investigates an adversary's ease of attack in generating adversarial examples for real-world scenarios. We address three key requirements for practical real-world attacks: 1) automatically constraining the size and shape of the attack so it can be applied with stickers, 2) transform-robustness, i.e., robustness of an attack to environmental physical variations such as viewpoint and lighting changes, and 3) supporting attacks in not only white-box, but also black-box hard-label scenarios, so that the adversary can attack proprietary models. In this work, we propose GRAPHITE, an efficient and general framework for generating attacks that satisfy the above three key requirements. GRAPHITE takes advantage of transform-robustness, a metric based on expectation over transforms (EoT), to automatically generate small masks and optimize with gradient-free optimization. GRAPHITE is also flexible as it can easily trade off transform-robustness, perturbation size, and query count in black-box settings. On a GTSRB model in a hard-label black-box setting, we are able to find attacks on all possible 1,806 victim-target class pairs with averages of 77.8% transform-robustness, perturbation size of 16.63% of the victim images, and 126K queries per pair. For digital-only attacks where achieving transform-robustness is not a requirement, GRAPHITE is able to find successful small-patch attacks with an average of only 566 queries for 92.2% of victim-target pairs. GRAPHITE is also able to find successful attacks using perturbations that modify small areas of the input image against PatchGuard, a recently proposed defense against patch-based attacks.
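The last abstract describes transform-robustness as a metric based on expectation over transforms (EoT). One plausible reading, sketched below, is a Monte Carlo estimate: the fraction of randomly sampled transforms under which a hard-label classifier still outputs the target class. This is not GRAPHITE's implementation. The names `transform_robustness`, `random_brightness`, and `toy_classify`, the flat-list image representation, and the brightness range are all hypothetical, chosen only to keep the example self-contained.

```python
import random
from typing import Callable, List, Sequence

# Toy stand-in for an image: a flat list of pixel intensities in [0, 1].
Image = List[float]
Transform = Callable[[Image, random.Random], Image]


def random_brightness(image: Image, rng: random.Random) -> Image:
    """Hypothetical transform: scale brightness by a random factor, clipped to [0, 1]."""
    factor = rng.uniform(0.5, 1.5)
    return [min(1.0, max(0.0, p * factor)) for p in image]


def toy_classify(image: Image) -> int:
    """Hypothetical hard-label model: returns only a class label (1 if bright, else 0)."""
    return 1 if sum(image) / len(image) > 0.4 else 0


def transform_robustness(
    image: Image,
    target: int,
    classify: Callable[[Image], int],
    transforms: Sequence[Transform],
    n_samples: int = 100,
    seed: int = 0,
) -> float:
    """Estimate the fraction of sampled transforms under which `classify`
    returns `target`. Each sample costs one hard-label query to the model."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        transform = rng.choice(transforms)  # pick one transform at random
        if classify(transform(image, rng)) == target:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    score = transform_robustness([0.6] * 4, 1, toy_classify, [random_brightness])
    print(f"estimated transform-robustness: {score:.2f}")
```

Because the estimate needs only predicted labels, it is compatible with the hard-label black-box setting the abstract mentions; the number of samples trades estimation accuracy against query count.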