skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Personalized Federated Learning with Attention-Based Client Selection
Personalized Federated Learning (PFL) relies on collective data knowledge to build customized models. However, non-IID data between clients poses significant challenges, as collaborating with clients who have diverse data distributions can harm local model performance, especially with limited training data. To address this issue, we propose FedACS, a new PFL algorithm with an Attention-based Client Selection mechanism. FedACS integrates an attention mechanism to enhance collaboration among clients with similar data distributions and mitigate the data scarcity issue. It prioritizes and allocates resources based on data similarity. We further establish the theoretical convergence behavior of FedACS. Experiments on CIFAR10 and FMNIST validate FedACS’s superiority, showcasing its potential to advance personalized federated learning. By tackling non-IID data challenges and data scarcity, FedACS offers promising advances in personalized federated learning.  more » « less
Award ID(s):
2228534 2223769 2154962 2144209 2006844 2143559 2313110 2033671
PAR ID:
10514280
Author(s) / Creator(s):
; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-4485-1
Page Range / eLocation ID:
6930 to 6934
Format(s):
Medium: X
Location:
Seoul, Korea, Republic of
Sponsoring Org:
National Science Foundation
More Like this
  1. Personalized Federated Learning (pFL) has emerged as a promising solution to tackle data heterogeneity across clients in FL. However, existing pFL methods either (1) introduce high computation and communication costs or (2) overfit to local data, which can be limited in scope and vulnerable to evolved test samples with natural distribution shifts. In this paper, we propose PERADA, a parameter-efficient pFL framework that reduces communication and computational costs and exhibits superior generalization performance, especially under test-time distribution shifts. PERADA reduces the costs by leveraging the power of pretrained models and only updates and communicates a small number of additional parameters from adapters. PERADA achieves high generalization by regularizing each client’s personalized adapter with a global adapter, while the global adapter uses knowledge distillation to aggregate generalized information from all clients. Theoretically, we provide generalization bounds of PERADA, and we prove its convergence to stationary points under non-convex settings. Empirically, PERADA demonstrates higher personalized performance (+4.85% on CheXpert) and enables better out-of-distribution generalization (+5.23% on CIFAR-10-C) on different datasets across natural and medical domains compared with baselines, while only updating 12.6% of parameters per model. Our code is available at https://github.com/NVlabs/PerAda. 
    more » « less
  2. Data heterogeneity across clients in federated learning (FL) settings is a widely acknowledged challenge. In response, personalized federated learning (PFL) emerged as a framework to curate local models for clients' tasks. In PFL, a common strategy is to develop local and global models jointly - the global model (for generalization) informs the local models, and the local models (for personalization) are aggregated to update the global model. A key observation is that if we can improve the generalization ability of local models, then we can improve the generalization of global models, which in turn builds better personalized models. In this work, we consider class imbalance, an overlooked type of data heterogeneity, in the classification setting. We propose FedNH, a novel method that improves the local models' performance for both personalization and generalization by combining the uniformity and semantics of class prototypes. FedNH initially distributes class prototypes uniformly in the latent space and smoothly infuses the class semantics into class prototypes. We show that imposing uniformity helps to combat prototype collapse while infusing class semantics improves local models. Extensive experiments were conducted on popular classification datasets under the cross-device setting. Our results demonstrate the effectiveness and stability of our method over recent works. 
    more » « less
  3. A central theme in federated learning (FL) is the fact that client data distributions are often not independent and identically distributed (IID), which has strong implications on the training process. While most existing FL algorithms focus on the conventional non-IID setting of class imbalance or missing classes across clients, in practice, the distribution differences could be more complex, e.g., changes in class conditional (domain) distributions. In this paper, we consider this complex case in FL wherein each client has access to only one domain distribution. For tasks such as domain generalization, most existing learning algorithms require access to data from multiple clients (i.e., from multiple domains) during training, which is prohibitive in FL. To address this challenge, we propose a federated domain translation method that generates pseudodata for each client which could be useful for multiple downstream learning tasks. We empirically demonstrate that our translation model is more resource-efficient (in terms of both communication and computation) and easier to train in an FL setting than standard domain translation methods. Furthermore, we demonstrate that the learned translation model enables use of state-of-the-art domain generalization methods in a federated setting, which enhances accuracy and robustness to increases in the synchronization period compared to existing methodology. 
    more » « less
  4. Federated learning (FL) is an efficient learning framework that assists distributed machine learning when data cannot be shared with a centralized server. Recent advancements in FL use predefined architecture-based learning for all clients. However, given that clients’ data are invisible to the server and data distributions are non-identical across clients, a predefined architecture discovered in a centralized setting may not be an optimal solution for all the clients in FL. Motivated by this challenge, we introduce SPIDER, an algorithmic frame- work that aims to Search PersonalIzed neural architecture for feDERated learning. SPIDER is designed based on two unique features: (1) alternately optimizing one architecture- homogeneous global model in a generic FL manner and architecture-heterogeneous local models that are connected to the global model by weight-sharing-based regularization, (2) achieving architecture-heterogeneous local models by a perturbation-based neural architecture search method. Experimental results demonstrate superior prediction performance compared with other state-of-the-art personalization methods. Code is available at https://github.com/ErumMushtaq/SPIDER.git. 
    more » « less
  5. Federated learning is promising for its capability to collaboratively train models with multiple clients without accessing their data, but vulnerable when clients’ data distributions diverge from each other. This divergence further leads to a dilemma: “Should we prioritize the learned model’s generic performance (for future use at the server) or its personalized performance (for each client)?” These two, seemingly competing goals have divided the community to focus on one or the other, yet in this paper we show that it is possible to approach both at the same time. Concretely, we propose a novel federated learning framework that explicitly decouples a model’s dual duties with two prediction tasks. On the one hand, we introduce a family of losses that are robust to non-identical class distributions, enabling clients to train a generic predictor with a consistent objective across them. On the other hand, we formulate the personalized predictor as a lightweight adaptive module that is learned to minimize each client’s empirical risk on top of the generic predictor. With this two-loss, two-predictor framework which we name Federated Robust Decoupling (FED-ROD), the learned model can simultaneously achieve state-of- the-art generic and personalized performance, essentially bridging the two tasks. 
    more » « less