skip to main content

Title: Federated Bandit: A Gossiping Approach
In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of N agents, who can only communicate their local data with neighbors described by a connected graph G. Each agent makes a sequence of decisions on selecting an arm from M candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution with which agents will never share their local observations with a central entity, and will be allowed to only share a private copy of his/her own information with their neighbors. We first propose a decentralized bandit algorithm \textttGossip\_UCB, which is a coupling of variants of both the classical gossiping algorithm and the celebrated Upper Confidence Bound (UCB) bandit algorithm. We show that \textttGossip\_UCB successfully adapts local bandit learning into a global gossiping process for sharing information among connected agents, and achieves guaranteed regret at the order of O(\max\ \textttpoly (N,M) łog T, \textttpoly (N,M)łog_łambda_2^-1 N\ ) for all N agents, where more » łambda_2\in(0,1) is the second largest eigenvalue of the expected gossip matrix, which is a function of G. We then propose \textttFed\_UCB, a differentially private version of \textttGossip\_UCB, in which the agents preserve ε-differential privacy of their local data while achieving O(\max \\frac\textttpoly (N,M) ε łog^2.5 T, \textttpoly (N,M) (łog_łambda_2^-1 N + łog T) \ ) regret. « less
; ; ;
Award ID(s):
Publication Date:
Journal Name:
Proceedings of the ACM on Measurement and Analysis of Computing Systems
Page Range or eLocation-ID:
1 to 29
Sponsoring Org:
National Science Foundation
More Like this
  1. We investigate robust data aggregation in a multi-agent online learning setting. In reality, multiple online learning agents are often deployed to perform similar tasks and receive similar feedback. We study how agents can improve their collective performance by sharing information among each other. In this paper, we formulate the ε-multi-player multi-armed bandit problem, in which a set of M players that have similar reward distributions for each arm play concurrently. We develop an upper confidence bound-based algorithm that adaptively aggregates rewards collected by different players. To our best knowledge, we are the first to develop such a scheme in amore »multi-player bandit learning setting. We show that under the assumption that pairwise distances between the means of the player-dependent distributions for each arm are small, we improve the (collective) regret bound by nearly a factor of M , in comparison with a baseline algorithm in which the players learn individually using the UCB-1 algorithm (Auer et al., 2002). Our algorithm also exhibits a fallback guarantee, namely, if our task similarity assumption fails to hold, our algorithm still has a performance guarantee that cannot be worse than the baseline by a constant factor. Empirically, we validate our algorithm on synthetic data.« less
  2. Federated multi-armed bandits (FMAB) is a new bandit paradigm that parallels the federated learning (FL) framework in supervised learning. It is inspired by practical applications in cognitive radio and recommender systems, and enjoys features that are analogous to FL. This paper proposes a general framework of FMAB and then studies two specific federated bandit models. We first study the approximate model where the heterogeneous local models are random realizations of the global model from an unknown distribution. This model introduces a new uncertainty of client sampling, as the global model may not be reliably learned even if the finite localmore »models are perfectly known. Furthermore, this uncertainty cannot be quantified a priori without knowledge of the suboptimality gap. We solve the approximate model by proposing Federated Double UCB (Fed2-UCB), which constructs a novel “double UCB” principle accounting for uncertainties from both arm and client sampling. We show that gradually admitting new clients is critical in achieving an O(log(T)) regret while explicitly considering the communication loss. The exact model, where the global bandit model is the exact average of heterogeneous local models, is then studied as a special case. We show that, somewhat surprisingly, the order-optimal regret can be achieved independent of the number of clients with a careful choice of the update periodicity. Experiments using both synthetic and real-world datasets corroborate the theoretical analysis and demonstrate the effectiveness and efficiency of the proposed algorithms.« less
  3. We study the communication cost (or message complexity) of fundamental distributed symmetry breaking problems, namely, coloring and MIS. While significant progress has been made in understanding and improving the running time of such problems, much less is known about the message complexity of these problems. In fact, all known algorithms need at least Ω(m) communication for these problems, where m is the number of edges in the graph. We addressthe following question in this paper: can we solve problems such as coloring and MIS using sublinear, i.e., o(m) communication, and if sounder what conditions? In a classical result, Awerbuch, Goldreich,more »Peleg, and Vainish [JACM 1990] showed that fundamental global problems such asbroadcast and spanning tree construction require at least o(m) messages in the KT-1 Congest model (i.e., Congest model in which nodes have initial knowledge of the neighbors' ID's) when algorithms are restricted to be comparison-based (i.e., algorithms inwhich node ID's can only be compared). Thirty five years after this result, King, Kutten, and Thorup [PODC 2015] showed that onecan solve the above problems using Õ(n) messages (n is the number of nodes in the graph) in Õ(n) rounds in the KT-1 Congest model if non-comparison-based algorithms are permitted. An important implication of this result is that one can use the synchronous nature of the KT-1 Congest model, using silence to convey information,and solve any graph problem using non-comparison-based algorithms with Õ(n) messages, but this takes an exponential number of rounds. In the asynchronous model, even this is not possible. In contrast, much less is known about the message complexity of local symmetry breaking problems such as coloring and MIS. Our paper fills this gap by presenting the following results. Lower bounds: In the KT-1 CONGEST model, we show that any comparison-based algorithm, even a randomized Monte Carlo algorithm with constant success probability, requires Ω(n 2) messages in the worst case to solve either (△ + 1)-coloring or MIS, regardless of the number of rounds. We also show that Ω(n) is a lower bound on the number ofmessages for any (△ + 1)-coloring or MIS algorithm, even non-comparison-based, and even with nodes having initial knowledge of up to a constant radius. Upper bounds: In the KT-1 CONGEST model, we present the following randomized non-comparison-based algorithms for coloring that, with high probability, use o(m) messages and run in polynomially many rounds.(a) A (△ + 1)-coloring algorithm that uses Õ(n1.5) messages, while running in Õ(D + √ n) rounds, where D is the graph diameter. Our result also implies an asynchronous algorithm for (△ + 1)-coloring with the same message bound but running in Õ(n) rounds. (b) For any constantε > 0, a (1+ε)△-coloring algorithm that uses Õ(n/ε 2 ) messages, while running in Õ(n) rounds. If we increase our input knowledge slightly to radius 2, i.e.,in the KT-2 CONGEST model, we obtain:(c) A randomized comparison-based MIS algorithm that uses Õ(n 1.5) messages. while running in Õ( √n) rounds. While our lower bound results can be viewed as counterparts to the classical result of Awerbuch, Goldreich, Peleg, and Vainish [JACM 90], but for local problems, our algorithms are the first-known algorithms for coloring and MIS that take o(m) messages and run in polynomially many rounds.« less
  4. This paper focuses on showing time-message trade-offs in distributed algorithms for fundamental problems such as leader election, broadcast, spanning tree (ST), minimum spanning tree (MST), minimum cut, and many graph verification problems. We consider the synchronous CONGEST distributed computing model and assume that each node has initial knowledge of itself and the identifiers of its neighbors - the so-called KT_1 model - a well-studied model that also naturally arises in many applications. Recently, it has been established that one can obtain (almost) singularly optimal algorithms, i.e., algorithms that have simultaneously optimal time and message complexity (up to polylogarithmic factors), formore »many fundamental problems in the standard KT_0 model (where nodes have only local knowledge of themselves and not their neighbors). The situation is less clear in the KT_1 model. In this paper, we present several new distributed algorithms in the KT_1 model that trade off between time and message complexity. Our distributed algorithms are based on a uniform and general approach which involves constructing a sparsified spanning subgraph of the original graph - called a danner - that trades off the number of edges with the diameter of the sparsifier. In particular, a key ingredient of our approach is a distributed randomized algorithm that, given a graph G and any delta in [0,1], with high probability constructs a danner that has diameter O~(D + n^{1-delta}) and O~(min{m,n^{1+delta}}) edges in O~(n^{1-delta}) rounds while using O~(min{m,n^{1+delta}}) messages, where n, m, and D are the number of nodes, edges, and the diameter of G, respectively. Using our danner construction, we present a family of distributed randomized algorithms for various fundamental problems that exhibit a trade-off between message and time complexity and that improve over previous results. Specifically, we show the following results (all hold with high probability) in the KT_1 model, which subsume and improve over prior bounds in the KT_1 model (King et al., PODC 2014 and Awerbuch et al., JACM 1990) and the KT_0 model (Kutten et al., JACM 2015, Pandurangan et al., STOC 2017 and Elkin, PODC 2017): 1) Leader Election, Broadcast, and ST. These problems can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,1]. 2) MST and Connectivity. These problems can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5]. In particular, for delta = 0.5 we obtain a distributed MST algorithm that runs in optimal O~(D+sqrt{n}) rounds and uses O~(min{m,n^{3/2}}) messages. We note that this improves over the singularly optimal algorithm in the KT_0 model that uses O~(D+sqrt{n}) rounds and O~(m) messages. 3) Minimum Cut. O(log n)-approximate minimum cut can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5]. 4) Graph Verification Problems such as Bipartiteness, Spanning Subgraph etc. These can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5].« less
  5. Matrix factorization (MF) approximates unobserved ratings in a rating matrix, whose rows correspond to users and columns correspond to items to be rated, and has been serving as a fundamental building block in recommendation systems. This paper comprehensively studies the problem of matrix factorization in different federated learning (FL) settings, where a set of parties want to cooperate in training but refuse to share data directly. We first propose a generic algorithmic framework for various settings of federated matrix factorization (FMF) and provide a theoretical convergence guarantee. We then systematically characterize privacy-leakage risks in data collection, training, and publishing stagesmore »for three different settings and introduce privacy notions to provide end-to-end privacy protections. The first one is vertical federated learning (VFL), where multiple parties have the ratings from the same set of users but on disjoint sets of items. The second one is horizontal federated learning (HFL), where parties have ratings from different sets of users but on the same set of items. The third setting is local federated learning (LFL), where the ratings of the users are only stored on their local devices. We introduce adapted versions of FMF with the privacy notions guaranteed in the three settings. In particular, a new private learning technique called embedding clipping is introduced and used in all the three settings to ensure differential privacy. For the LFL setting, we combine differential privacy with secure aggregation to protect the communication between user devices and the server with a strength similar to the local differential privacy model, but much better accuracy. We perform experiments to demonstrate the effectiveness of our approaches.« less