Federated learning (FL) is a popular collaborative learning paradigm whereby agents with individual datasets can jointly train an ML model. While sharing more data improves model accuracy and leads to higher payoffs, it also raises costs associated with data acquisition or loss of privacy, causing agents to be strategic about their data contributions. This leads to undesirable behavior at a Nash equilibrium (NE), such as free-riding, resulting in sub-optimal fairness, data sharing, and welfare. To address this, we design MSHAP, a budget-balanced payment mechanism for FL that admits Nash equilibria under mild conditions and achieves reciprocal fairness: each agent's payoff equals her contribution to the collaboration, as measured by her Shapley share. Beyond fairness, we show that the NE under MSHAP has desirable guarantees in terms of accuracy, welfare, and total data collected. We validate our theoretical results through experiments, demonstrating that MSHAP outperforms baselines in terms of fairness and efficiency.
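The Shapley share referenced above is the standard Shapley value from cooperative game theory. As a minimal sketch (not the paper's implementation), assuming a hypothetical coalition value function `value_fn` that maps a set of agents to the value their pooled data creates, the shares can be computed by averaging marginal contributions over all agent orderings:

```python
from itertools import permutations

def shapley_shares(agents, value_fn):
    """Exact Shapley values: average each agent's marginal contribution
    over all orderings of the agents (exponential in n; fine for small n)."""
    shares = {a: 0.0 for a in agents}
    orderings = list(permutations(agents))
    for order in orderings:
        coalition = set()
        prev = value_fn(coalition)
        for a in order:
            coalition.add(a)
            cur = value_fn(coalition)
            shares[a] += cur - prev  # marginal contribution of agent a
            prev = cur
    return {a: s / len(orderings) for a, s in shares.items()}

# Toy example: coalition value is a concave function of pooled data size.
data = {"alice": 10, "bob": 30, "carol": 60}
value = lambda S: sum(data[a] for a in S) ** 0.5
print(shapley_shares(list(data), value))
```

Under reciprocal fairness, agent i's equilibrium payoff would match `shares[i]`; the budget-balanced payments that enforce this at an NE are specified in the paper itself.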
Incentives in federated learning: Equilibria, dynamics, and mechanisms for welfare maximization
Federated learning (FL) has emerged as a powerful scheme to facilitate the collaborative learning of models amongst a set of agents holding their own private data. Although the agents benefit from the global model trained on shared data, by participating in federated learning they may also incur costs (related to privacy and communication) due to data sharing. In this paper, we model a collaborative FL framework where every agent attempts to achieve an optimal trade-off between her learning payoff and data sharing cost. We show the existence of Nash equilibrium (NE) under mild assumptions on agents' payoffs and costs. Furthermore, we show that agents can discover the NE via best response dynamics. However, some of the NE may be bad in terms of overall welfare for the agents, implying little incentive for some fraction of the agents to participate in the learning. To remedy this, we design a budget-balanced mechanism involving payments to the agents that ensures that any p-mean welfare function of the agents' utilities is maximized at NE. In addition, we introduce an FL protocol FedBR-BG that incorporates our budget-balanced mechanism, utilizing best response dynamics. Our empirical validation on MNIST and CIFAR-10 substantiates our theoretical analysis. We show that FedBR-BG outperforms the basic best-response-based protocol without additional incentivization, the standard federated learning protocol FedAvg, as well as a recent baseline MWFed, in terms of achieving superior p-mean welfare.
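For reference, the p-mean welfare family and the best-response loop underlying FedBR-BG can be sketched as follows; the utility functions, contribution grid, and stopping rule here are illustrative assumptions rather than the paper's exact formulation:

```python
import numpy as np

def p_mean_welfare(utilities, p):
    """Generalized p-mean of (positive) agent utilities:
    W_p(u) = ((1/n) * sum_i u_i^p)^(1/p).
    p = 1 is the utilitarian average, p -> 0 the Nash (geometric-mean)
    welfare, and p -> -inf the egalitarian minimum."""
    u = np.asarray(utilities, dtype=float)
    if p == 0:
        return float(np.exp(np.mean(np.log(u))))
    return float(np.mean(u ** p) ** (1.0 / p))

def best_response_dynamics(utility_fns, x0, grid, max_rounds=100):
    """Round-robin best response: each agent re-optimizes her own data
    contribution over `grid`, holding the others fixed, until no one moves."""
    x = list(x0)
    for _ in range(max_rounds):
        moved = False
        for i, u_i in enumerate(utility_fns):
            best = max(grid, key=lambda xi: u_i(x[:i] + [xi] + x[i + 1:]))
            if best != x[i]:
                x[i], moved = best, True
        if not moved:
            break  # fixed point: a (pure) Nash equilibrium on the grid
    return x

# Two agents: concave payoff in total shared data, quadratic sharing cost.
u = [lambda x, i=i: sum(x) ** 0.5 - 0.1 * x[i] ** 2 for i in range(2)]
grid = [k / 10 for k in range(0, 31)]
x_ne = best_response_dynamics(u, [0.0, 0.0], grid)
print(x_ne, p_mean_welfare([f(x_ne) for f in u], p=1))
```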
- Award ID(s): 2229876
- PAR ID: 10662045
- Publisher / Repository: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
- Date Published:
- Page Range / eLocation ID: 17811-17831
- Format(s): Medium: X
- Location: New Orleans, LA
- Sponsoring Org: National Science Foundation
More Like this
-
Conventional machine learning (ML) and deep learning (DL) methods use large amounts of data to construct prediction models in a central fusion center for recognizing human activities. However, such model training incurs high communication costs and can lead to privacy infringement. To address the issues of high communication overhead and privacy leakage, we employed a widely popular distributed ML technique called Federated Learning (FL) that generates a global model for predicting human activities by combining participating agents' local knowledge. State-of-the-art FL models fail to maintain acceptable accuracy when there is a large number of unreliable agents who may inject false models, or resource-constrained agents that fail to perform an assigned computational task within a given time window. We developed an FL model for predicting human activities that monitors each agent's contribution towards model convergence and excludes unreliable and resource-constrained agents from training. We assign a score to each client when it joins the network, and the score is updated based on the agent's activities during training. We consider three mobile robots as FL clients that are heterogeneous in terms of resources such as processing capability, memory, bandwidth, battery life, and data volume; this heterogeneity lets us study the effects of a real-world FL setting in the presence of resource-constrained agents. We consider an agent unreliable if it repeatedly gives slow responses or injects incorrect models during training. By disregarding the unreliable and weak agents, we carry out the local training of the FL process on the selected agents. If a weak agent is nonetheless selected and starts showing straggler issues, we leverage an asynchronous FL mechanism that aggregates local models whenever the server receives a model update from an agent, eliminating long waits for updates from weak agents. Finally, we simulate how the agents' behavior can be tracked through a reward-punishment scheme, sketched below, and present the influence of unreliable and resource-constrained agents on the FL process. We found that FL performs slightly worse than centralized models when there are no unreliable or resource-constrained agents. However, as the number of malicious and straggler clients increases, our proposed model performs more effectively by identifying and avoiding those agents while recognizing human activities, compared to state-of-the-art FL and ML approaches.
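The reward-punishment scheme described above can be illustrated with a small sketch; the reward, penalty, and threshold values below are assumptions for illustration, not the values used in the work:

```python
def update_score(score, on_time, model_ok, reward=0.1, penalty=0.3):
    """Reward-punishment update for one training round: reward agents
    that return a valid model on time, penalize stragglers and agents
    whose update fails validation. Scores are clamped to [0, 1]."""
    score = score + reward if (on_time and model_ok) else score - penalty
    return max(0.0, min(1.0, score))

def select_clients(scores, threshold=0.5):
    """Keep only agents whose current score clears the trust threshold."""
    return [cid for cid, s in scores.items() if s >= threshold]

# Example round: robot "r2" straggled, "r3" sent an incorrect model.
scores = {"r1": 0.8, "r2": 0.6, "r3": 0.6}
reports = {"r1": (True, True), "r2": (False, True), "r3": (True, False)}
for cid, (on_time, ok) in reports.items():
    scores[cid] = update_score(scores[cid], on_time, ok)
print(select_clients(scores))  # ['r1'] once the weak agents fall below 0.5
```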
-
The rapid growth of data-driven technologies and the emergence of various data-sharing paradigms have underscored the need for efficient and stable data exchange protocols. In any such exchange, agents must carefully balance the benefit of acquiring valuable data against the cost of sharing their own. Ensuring stability in these exchanges is essential to prevent agents, or groups of agents, from departing and conducting local, and potentially more favorable, exchanges among themselves. To address this, we study a model in which agents participate in a data exchange. Each agent has an associated payoff for the data acquired from other agents and incurs a cost when sharing its own data. The net utility of an agent is defined as its payoff minus its cost. We adapt the classical notion of core stability from cooperative game theory to the setting of data exchange. A data exchange is said to be core-stable if no subset of agents has an incentive to deviate to a different exchange. We show that a core-stable data exchange is guaranteed to exist when agents have concave payoff functions and convex cost functions, a setting that is typical in domains such as PAC learning and random discovery models. We further show that relaxing either of these conditions can result in the nonexistence of core-stable data exchanges. We then prove that finding a core-stable data exchange is PPAD-hard, even when the set of potential blocking coalitions is restricted to groups of constant size. To the best of our knowledge, this is the first known PPAD-hardness result for core-like stability guarantees in data economics. Finally, we show that data exchange can be modeled as a balanced n-person game. This immediately yields a pivoting algorithm via Scarf's theorem from 1967 on the core. We demonstrate through empirical results that this pivoting algorithm performs well in practice.
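To make the model concrete, here is a minimal sketch of the net-utility definition and a brute-force test for blocking coalitions of bounded size. The payoff and cost functions are illustrative, and for simplicity deviating agents keep their original share levels, whereas the full model allows coalitions to re-optimize them:

```python
from itertools import combinations
import math

def net_utility(i, shares, payoff, cost):
    """Agent i's net utility in an exchange: payoff from the data
    acquired from the other participants, minus the cost of sharing
    her own data (shares[i])."""
    acquired = sum(x for j, x in shares.items() if j != i)
    return payoff(acquired) - cost(shares[i])

def has_blocking_coalition(shares, payoff, cost, max_size=2):
    """Brute-force test over coalitions up to `max_size`: does some
    group strictly improve every member by trading only among itself?
    Here deviators keep their original shares; this is a restricted
    (sufficient, not exhaustive) check of core stability."""
    base = {i: net_utility(i, shares, payoff, cost) for i in shares}
    for size in range(2, max_size + 1):
        for S in combinations(shares, size):
            local = {i: shares[i] for i in S}
            if all(net_utility(i, local, payoff, cost) > base[i] for i in S):
                return True
    return False

# Toy instance: concave payoff, convex cost -- the regime in which a
# core-stable exchange is guaranteed to exist.
shares = {1: 4.0, 2: 4.0, 3: 0.5}
payoff = lambda x: math.sqrt(x)
cost = lambda x: 0.05 * x ** 2
print(has_blocking_coalition(shares, payoff, cost))  # False here
```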
-
As a promising distributed machine learning paradigm, Federated Learning (FL) has attracted increasing attention as a way to deal with data silo problems without compromising user privacy. By adopting the classic one-to-multi training scheme (i.e., FedAvg), where the cloud server dispatches a single global model to multiple involved clients, conventional FL methods can achieve collaborative model training without data sharing. However, since one global model cannot always accommodate all the incompatible convergence directions of local models, existing FL approaches can suffer from inferior classification accuracy. To address this issue, we present an efficient FL framework named FedCross, which uses a novel multi-to-multi FL training scheme based on our proposed multi-model cross-aggregation approach. Unlike traditional FL methods, in each round of FL training, FedCross uses multiple middleware models to conduct weighted fusion individually. Since the middleware models used by FedCross quickly converge into the same flat valley of the loss landscape, the generated global model generalizes well. Experimental results on various well-known datasets show that, compared with state-of-the-art FL methods, FedCross significantly improves FL accuracy in both IID and non-IID scenarios without causing additional communication overhead.
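A simplified sketch of one cross-aggregation round, in the spirit described above: each middleware model is fused with information from its peers rather than collapsed into a single global average. The peer-average fusion rule and the `alpha` weight below are assumptions for illustration; FedCross's actual pairing and weighting scheme is defined in the paper.

```python
import numpy as np

def cross_aggregate(models, alpha=0.8):
    """One multi-to-multi round, sketched: every middleware model is
    individually fused with a weighted average of its peers, so the
    training keeps multiple models alive instead of one global model.
    `alpha` trades off a model's own parameters against the peer average."""
    models = [np.asarray(m, dtype=float) for m in models]
    fused = []
    for k, m in enumerate(models):
        peers = [p for j, p in enumerate(models) if j != k]
        peer_avg = np.mean(peers, axis=0)
        fused.append(alpha * m + (1 - alpha) * peer_avg)
    return fused

# Three middleware models (flattened parameter vectors).
ms = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
print(cross_aggregate(ms))
```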
-
Federated learning (FL) allows a set of agents to collaboratively train a model without sharing their potentially sensitive data. This makes FL suitable for privacy-preserving applications. At the same time, FL is susceptible to adversarial attacks due to its decentralized and unvetted data. One important line of attacks against FL is backdoor attacks. In a backdoor attack, an adversary tries to embed a backdoor functionality into the model during training that can later be activated to cause a desired misclassification. To prevent backdoor attacks, we propose a lightweight defense that requires minimal change to the FL protocol. At a high level, our defense is based on carefully adjusting the aggregation server's learning rate, per dimension and per round, based on the sign information of agents' updates. We first conjecture the steps necessary to carry out a successful backdoor attack in the FL setting, and then explicitly formulate the defense based on our conjecture. Through experiments, we provide empirical evidence that supports our conjecture, and we test our defense against backdoor attacks under different settings. We observe that either the backdoor is completely eliminated or its accuracy is significantly reduced. Overall, our experiments suggest that our defense significantly outperforms some of the recently proposed defenses in the literature, while having minimal influence on the accuracy of the trained models. In addition, we provide a convergence rate analysis for our proposed scheme.
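The per-dimension learning-rate adjustment described above can be sketched as follows; the agreement threshold `theta` and the combination with a plain averaging step are assumptions for illustration:

```python
import numpy as np

def robust_lr_aggregate(global_w, updates, lr=1.0, theta=4):
    """Sign-based aggregation, per dimension: where at least `theta`
    agents agree on an update's direction, apply the learning rate
    as-is; elsewhere, flip its sign so dimensions lacking honest
    agreement are pushed away from a suspected adversarial objective."""
    updates = np.stack(updates)                       # (n_agents, dim)
    sign_sum = np.abs(np.sign(updates).sum(axis=0))   # agreement per dim
    lr_vec = np.where(sign_sum >= theta, lr, -lr)     # per-dim learning rate
    return global_w + lr_vec * updates.mean(axis=0)

# Four honest agents push dim 0 together; one attacker pushes dim 1 alone.
ups = [np.array([0.1, 0.0])] * 4 + [np.array([0.0, 0.5])]
print(robust_lr_aggregate(np.zeros(2), ups))  # [0.08, -0.1]
```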