NSF PAR Search | NSF Public Access Repository


The Federation Strikes Back: A Survey of Federated Learning Privacy Attacks, Defenses, Applications, and Policy Landscape

https://doi.org/10.1145/3724113

Zhao, Joshua; Bagchi, Saurabh; Avestimehr, Salman; Chan, Kevin; Chaterji, Somali; Dimitriadis, Dimitris; Li, Jiacheng; Li, Ninghui; Nourian, Arash; Roth, Holger (September 2025, ACM Computing Surveys)

Deep learning has shown incredible potential across a wide array of tasks, and this growth has been accompanied by an insatiable appetite for data. However, much of the data needed for deep learning is stored on personal devices, and recent privacy concerns have further highlighted the challenges of accessing such data. As a result, federated learning (FL) has emerged as an important privacy-preserving technology that enables collaborative training of machine learning models without the need to send the raw, potentially sensitive, data to a central server. However, the fundamental premise that sending model updates to a server is privacy-preserving only holds if the updates cannot be “reverse engineered” to infer information about the private training data. It has been shown under a wide variety of settings that this privacy premise does not hold. In this article we provide a comprehensive literature review of the different privacy attacks and defense methods in FL. We identify the current limitations of these attacks and highlight the settings in which the privacy of an FL client can be broken. We further dissect some of the successful industry applications of FL and draw lessons for future successful adoption. We survey the emerging landscape of privacy regulation for FL and conclude with future directions for taking FL toward the cherished goal of generating accurate models while preserving the privacy of the data from its participants.
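To make the "reverse engineering" concern concrete, the sketch below shows one simple, well-known leak rather than any specific attack from the survey. It assumes a single training sample passing through one fully connected layer with a bias term and a squared-error loss; the function names are hypothetical. Under those assumptions, each row of the weight gradient is the bias gradient times the input, so a server that sees the update can divide one by the other and recover the input.

```python
def linear_layer_gradients(weights, bias, x, target):
    """Gradients of L = 0.5 * sum((y - t)^2) for one sample through y = Wx + b."""
    outputs = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(weights, bias)
    ]
    grad_out = [y_i - t_i for y_i, t_i in zip(outputs, target)]  # dL/dy
    grad_w = [[g * x_j for x_j in x] for g in grad_out]          # dL/dW = dL/dy * x^T
    grad_b = list(grad_out)                                      # dL/db = dL/dy
    return grad_w, grad_b


def recover_input(grad_w, grad_b, eps=1e-12):
    """Recover x from the shared gradients: row i of dL/dW equals dL/db[i] * x."""
    for row, g in zip(grad_w, grad_b):
        if abs(g) > eps:
            return [value / g for value in row]
    return None  # every bias gradient vanished, so nothing can be recovered


# Illustrative values only (assumed, not taken from the article).
private_x = [0.5, -1.0, 2.0]
grad_w, grad_b = linear_layer_gradients(
    weights=[[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]],
    bias=[0.0, 0.1],
    x=private_x,
    target=[1.0, 0.0],
)
recovered_x = recover_input(grad_w, grad_b)
```

With larger batches or deeper networks the recovery is no longer this direct, which is one reason the limits of such attacks matter.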
Free, publicly-accessible full text available September 30, 2026
Improving Semi-Supervised Semantic Segmentation with Sliced-Wasserstein Feature Alignment and Uniformity

Lu, Chen-Yi; Derakhshandeh, Kasra; Chaterji, Somali (June 2025, IEEE)

Free, publicly-accessible full text available June 11, 2026
Agile3D: Adaptive Contention- and Content-Aware 3D Object Detection for Embedded GPUs

Wang, Pengcheng; Liu, Zhuoming; Bagchi, Shayok; Xu, Ran; Bagchi, Saurabh; Li, Yin; Chaterji, Somali (June 2025, ACM)

Free, publicly-accessible full text available June 1, 2026
RECON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories

https://doi.org/10.1007/978-3-031-73202-7_17

Lu, Chen-Yi; Agarwal, Shubham; Tanjim, Md Mehrab; Mahadik, Kanak; Rao, Anup; Mitra, Subrata; Saini, Shiv Kumar; Bagchi, Saurabh; Chaterji, Somali (November 2024, Springer Nature Switzerland)

Full Text Available
SensorBFT: Fault-Tolerant Target Localization Using Voronoi Diagrams and Approximate Agreement

https://doi.org/10.1109/ICDCS60910.2024.00026

Bandarupalli, Akhil; Bhat, Adithya; Chaterji, Somali; Reiter, Michael K; Kate, Aniket; Bagchi, Saurabh (July 2024, IEEE)

Full Text Available
FLAIR: Defense against Model Poisoning Attack in Federated Learning

Sharma, Atul; Chen, Wei; Zhao, Joshua; Qiu, Qiang; Bagchi, Saurabh; Chaterji, Somali (July 2023, ACM ASIA CCS)

Federated learning (multi-party, distributed learning in a decentralized environment) is vulnerable to model poisoning attacks, more so than centralized learning. This is because malicious clients can collude and send in carefully tailored model updates to make the global model inaccurate. This motivated the development of Byzantine-resilient federated learning algorithms, such as Krum, Bulyan, FABA, and FoolsGold. However, a recently developed untargeted model poisoning attack showed that all prior defenses can be bypassed. The attack uses the intuition that simply by changing the sign of the gradient updates that the optimizer is computing, for a set of malicious clients, a model can be diverted from the optima to increase the test error rate. In this work, we develop FLAIR, a defense against this directed deviation attack (DDA), a state-of-the-art model poisoning attack. FLAIR is based on our intuition that in federated learning, certain patterns of gradient flips are indicative of an attack. This intuition is remarkably stable across different learning algorithms, models, and datasets. FLAIR assigns reputation scores to the participating clients based on their behavior during the training phase and then takes a weighted contribution of the clients. We show that where the existing defense baselines of FABA [IJCAI ’19], FoolsGold [Usenix ’20], and FLTrust [NDSS ’21] fail when 20-30% of the clients are malicious, FLAIR provides Byzantine robustness up to a malicious client percentage of 45%. We also show that FLAIR provides robustness against even a white-box version of DDA.
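The general idea of scoring clients by gradient sign flips and then weighting their contributions can be sketched as follows. This is an illustrative simplification, not FLAIR's actual algorithm: the flip measure (sign disagreement with a reference direction), the median-based outlier rule, the `tolerance` and `step` parameters, and all function names are assumptions made for the example.

```python
from statistics import median


def flip_fraction(update, reference):
    """Fraction of coordinates whose sign disagrees with the reference direction."""
    flips = sum(1 for u, r in zip(update, reference) if u * r < 0)
    return flips / len(update)


def update_reputations(reputations, updates, reference, tolerance=0.25, step=0.5):
    """Lower the score of clients whose flip fraction is far from the median."""
    fractions = {cid: flip_fraction(u, reference) for cid, u in updates.items()}
    typical = median(fractions.values())
    new_reputations = {}
    for cid, fraction in fractions.items():
        if abs(fraction - typical) > tolerance:
            new_reputations[cid] = max(0.0, reputations[cid] - step)
        else:
            new_reputations[cid] = min(1.0, reputations[cid] + step)
    return new_reputations


def weighted_aggregate(updates, reputations):
    """Reputation-weighted average of the client updates."""
    total = sum(reputations[cid] for cid in updates)
    if total == 0:
        raise ValueError("all clients have zero reputation")
    dim = len(next(iter(updates.values())))
    return [
        sum(reputations[cid] * updates[cid][i] for cid in updates) / total
        for i in range(dim)
    ]


# Toy round: three benign clients and one sign-flipping client (assumed data).
reference = [1.0, 1.0, -1.0, 1.0]  # e.g. the previous global update direction
updates = {
    "a": [0.9, 1.1, -1.0, 0.8],
    "b": [1.0, 0.7, -0.9, 1.2],
    "c": [1.1, 1.0, -1.2, 0.9],
    "mallory": [-1.0, -1.0, 1.0, -1.0],
}
reputations = {cid: 1.0 for cid in updates}
for _ in range(2):  # two rounds with the same behavior
    reputations = update_reputations(reputations, updates, reference)
global_update = weighted_aggregate(updates, reputations)
```

In this toy setup the flipping client's weight decays to zero after two rounds, so the aggregate follows the benign clients.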
Full Text Available
Vega: Drone-based Multi-Altitude Target Detection for Autonomous Surveillance

Bandarupalli, Akhil; Jain, Sarthak; Melachuri, Akash; Pappas, Joseph; Chaterji, Somali (June 2023, 19th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT))

UAVs (unmanned aerial vehicles) or drones are promising instruments for video-based surveillance. Various applications of aerial surveillance use object detection programs to detect target objects. In such applications, three parameters influence a drone deployment strategy: the area covered by the drone, the latency of target (object) detection, and the quality of the detection output by the object detector. Previous works have focused on improving Pareto optimality along the area-latency frontier or the area-quality frontier, but not on the combined area-latency-quality frontier, which makes these solutions sub-optimal for drone-based surveillance. We explore a three-way tradeoff between area, latency, and quality in the context of autonomous aerial surveillance of targets in an area using drones with cameras and an object detection program. We propose Vega, a drone deployment framework that captures these tradeoffs to deploy drones efficiently. We make three contributions with Vega. First, we characterize the ability of the state-of-the-art mobile object detector, EfficientDet [CVPR '20], to detect objects from varying drone altitudes using confidence and IoU curves vs. drone altitude. Second, based on these characteristics of the detector, we propose a set of two algorithmic primitives for drone-based maneuvers, namely DroneZoom and DroneCycle. Using these two primitives, we obtain an improved Pareto frontier between our three target parameters (coverage area, detection latency, and detection quality) for a single drone system. Third, we scale out our findings to a swarm deployment using higher-order Voronoi tessellations, where we control the swarm's spatial density using the Voronoi order to further lower the detection latency while maintaining detection quality.
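The IoU (intersection over union) metric mentioned above is the standard overlap score between a predicted and a ground-truth bounding box. A minimal sketch follows, assuming axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` tuples; the box format and function name are choices made for this example, not taken from Vega.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes yield no intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes that share a 1x1 corner: IoU = 1 / (4 + 4 - 1).
overlap_score = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Plotting this score against altitude for a fixed target is one way to obtain an IoU-versus-altitude curve of the kind described.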
Full Text Available
How to learn collaboratively - Federated learning to peer-to-peer learning and what's at stake.

Sharma, Atul; Zhao, Joshua; Chen, Wei; Qiu, Qiang; Bagchi, Saurabh; Chaterji, Somali (June 2023, 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), “Disrupt 23: Disruptive Ideas and New Interdisciplinary Results” Track)

Standard ML relies on training using a centrally collected dataset, while collaborative learning techniques such as Federated Learning (FL) enable data to remain decentralized at client locations. In FL, a central server coordinates the training process, reducing computation and communication expenses for clients. However, this centralization can lead to server congestion and heightened risk of malicious activity or data privacy breaches. In contrast, Peer-to-Peer Learning (P2PL) is a fully decentralized system where nodes manage both local training and aggregation tasks. While P2PL promotes privacy by eliminating the need to trust a single node, it also results in increased computation and communication costs, along with potential difficulties in achieving consensus among nodes. To address the limitations of both FL and P2PL, we propose a hybrid approach called Hubs-and-Spokes Learning (HSL). In HSL, hubs function similarly to FL servers, maintaining consensus but exerting less control over spokes. This paper argues that HSL’s design allows for greater availability and privacy than FL, while reducing computation and communication costs compared to P2PL. Additionally, HSL maintains consensus and integrity in the learning process.
Full Text Available
KRATOS: Context-Aware Cell Type Classification and Interpretation using Joint Dimensionality Reduction and Clustering

https://doi.org/10.1145/3534678.3539455

Zhou, Zihan; Du, Zijia; Chaterji, Somali (October 2022, ACM-KDD)

A common workflow for single-cell RNA-sequencing (sc-RNA-seq) data analysis is to orchestrate a three-step pipeline. First, conduct a dimension reduction of the input cell profile matrix; second, cluster the cells in the latent space; and third, extract the "gene panels" that distinguish a certain cluster from others. This workflow has the primary drawback that the three steps are performed independently, neglecting the dependencies among the steps and among the marker genes or gene panels. In our system, KRATOS, we alter the three-step workflow to a two-step one, where we jointly optimize the first two steps and add the third (interpretability) step to form an integrated sc-RNA-seq analysis pipeline. We show that the more compact workflow of KRATOS extracts marker genes that can better discriminate the target cluster, distilling underlying mechanisms guiding cluster membership. In doing so, KRATOS is significantly better than the two SOTA baselines we compare against, specifically 5.62% superior to Global Counterfactual Explanation (GCE) [ICML-20], and 3.31% better than Adversarial Clustering Explanation (ACE) [ICML-21], measured by the AUROC of a kernel-SVM classifier. We open-source our code and datasets here: https://github.com/icanforce/single-cell-genomics-kratos.
Full Text Available
ORION and the Three Rights: Sizing, Bundling, and Prewarming for Serverless DAGs

Mahgoub, Ashraf; Barsallo, Edgardo; Shankar, Karthick; Minocha, Eshaan; Elnikety, Sameh; Bagchi, Saurabh; Chaterji, Somali (July 2022, USENIX OSDI proceedings)

Serverless applications represented as DAGs have been growing in popularity. For many of these applications, it would be useful to estimate the end-to-end (E2E) latency and to allocate resources to individual functions so as to meet probabilistic guarantees for the E2E latency. This goal has not been met until now due to three fundamental challenges. The first is the high variability and correlation in the execution time of individual functions, the second is the skew in execution times of the parallel invocations, and the third is the incidence of cold starts. In this paper, we introduce ORION to achieve this goal. We first analyze traces from a production FaaS infrastructure to identify three characteristics of serverless DAGs. We use these to motivate and design three features. The first is a performance model that accounts for runtime variabilities and dependencies among functions in a DAG. The second is a method for co-locating multiple parallel invocations within a single VM, thus mitigating content-based skew among these invocations. The third is a method for pre-warming VMs for subsequent functions in a DAG with the right look-ahead time. We integrate these three innovations and evaluate ORION on AWS Lambda with three serverless DAG applications. Our evaluation shows that compared to three competing approaches, ORION achieves up to 90% lower P95 latency without increasing dollar cost, or up to 53% lower dollar cost without increasing P95 latency.
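To see why skew among parallel invocations inflates tail latency, the sketch below estimates the P95 end-to-end latency of a simple chain of stages by Monte Carlo simulation. It is not ORION's performance model: the log-normal latency distribution, the independence between invocations, the chain-shaped DAG, and every name and parameter are assumptions made for illustration. Each stage finishes only when its slowest parallel invocation finishes.

```python
import math
import random


def simulate_e2e_latency(stages, rng):
    """One sampled E2E latency for a chain of stages.

    Each stage is (fan_out, mu, sigma): fan_out parallel invocations with
    log-normal latencies; the stage completes when the slowest one does.
    """
    total = 0.0
    for fan_out, mu, sigma in stages:
        total += max(rng.lognormvariate(mu, sigma) for _ in range(fan_out))
    return total


def percentile(samples, q):
    """Nearest-rank percentile for q in (0, 1]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def estimate_p95(stages, trials=2000, seed=0):
    """Monte Carlo estimate of the P95 E2E latency (seeded for repeatability)."""
    rng = random.Random(seed)
    samples = [simulate_e2e_latency(stages, rng) for _ in range(trials)]
    return percentile(samples, 0.95)


# Same per-invocation distribution, different fan-out (assumed parameters).
p95_single = estimate_p95([(1, 0.0, 0.5)])
p95_fanout = estimate_p95([(16, 0.0, 0.5)])
```

Comparing `p95_single` with `p95_fanout` shows how fan-out alone pushes the tail upward, since the stage latency is the maximum over all parallel invocations.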
Full Text Available
