skip to main content

Title: Projection-Free Bandit Optimization with Privacy Guarantees
We design differentially private algorithms for the bandit convex optimization problem in the projection-free setting. This setting is important whenever the decision set has a complex geometry, and access to it is done efficiently only through a linear optimization oracle, hence Euclidean projections are unavailable (e.g. matroid polytope, submodular base polytope). This is the first differentially-private algorithm for projection-free bandit optimization, and in fact our bound matches the best known non-private projection-free algorithm (Garber-Kretzu, AISTATS '20) and the best known private algorithm, even for the weaker setting when projections are available (Smith-Thakurta, NeurIPS '13).
Authors:
; ;
Award ID(s):
1908510 1750333 1718342
Publication Date:
NSF-PAR ID:
10293359
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
35
Issue:
8
Page Range or eLocation-ID:
7322-7330
ISSN:
2374-3468
Sponsoring Org:
National Science Foundation
More Like this
  1. We design differentially private algorithms for the bandit convex optimization problem in the projection-free setting. This setting is important whenever the decision set has a complex geometry, and access to it is done efficiently only through a linear optimization oracle, hence Euclidean projections are unavailable (e.g. matroid polytope, submodular base polytope). This is the first differentially-private algorithm for projection-free bandit optimization, and in fact our bound matches the best known non-private projection-free algorithm and the best known private algorithm, even for the weaker setting when projections are available.
  2. We develop a novel and generic algorithm for the adversarial multi-armed bandit problem (or more generally the combinatorial semi-bandit problem). When instantiated differently, our algorithm achieves various new data-dependent regret bounds improving previous work. Examples include: 1) a regret bound depending on the variance of only the best arm; 2) a regret bound depending on the first-order path-length of only the best arm; 3) a regret bound depending on the sum of the first-order path-lengths of all arms as well as an important negative term, which together lead to faster convergence rates for some normal form games with partial feedback;more »4) a regret bound that simultaneously implies small regret when the best arm has small loss {\it and} logarithmic regret when there exists an arm whose expected loss is always smaller than those of other arms by a fixed gap (e.g. the classic i.i.d. setting). In some cases, such as the last two results, our algorithm is completely parameter-free. The main idea of our algorithm is to apply the optimism and adaptivity techniques to the well-known Online Mirror Descent framework with a special log-barrier regularizer. The challenges are to come up with appropriate optimistic predictions and correction terms in this framework. Some of our results also crucially rely on using a sophisticated increasing learning rate schedule.« less
  3. We consider a distributed empirical risk minimization (ERM) optimization problem with communication efficiency and privacy requirements, motivated by the federated learn- ing (FL) framework. We propose a distributed communication-efficient and local differentially private stochastic gradient descent (CLDP-SGD) algorithm and analyze its communication, privacy, and convergence trade-offs. Since each iteration of the CLDP- SGD aggregates the client-side local gradients, we develop (optimal) communication-efficient schemes for mean estimation for several lp spaces under local differential privacy (LDP). To overcome performance limitation of LDP, CLDP-SGD takes advantage of the inherent privacy amplification provided by client sub- sampling and data subsampling at each se-more »lected client (through SGD) as well as the recently developed shuffled model of privacy. For convex loss functions, we prove that the proposed CLDP-SGD algorithm matches the known lower bounds on the centralized private ERM while using a finite number of bits per iteration for each client, i.e., effectively get- ting communication efficiency for “free”. We also provide preliminary experimental results supporting the theory.« less
  4. Projection-free conditional gradient (CG) methods are the algorithms of choice for constrained optimization setups in which projections are often computationally prohibitive but linear optimization over the constraint set remains computationally feasible. Unlike in projection-based methods, globally accelerated convergence rates are in general unattainable for CG. However, a very recent work on Locally accelerated CG (LaCG) has demonstrated that local acceleration for CG is possible for many settings of interest. The main downside of LaCG is that it requires knowledge of the smoothness and strong convexity parameters of the objective function. We remove this limitation by introducing a novel, Parameter-Free Locallymore »accelerated CG (PF-LaCG) algorithm, for which we provide rigorous convergence guarantees. Our theoretical results are complemented by numerical experiments, which demonstrate local acceleration and showcase the practical improvements of PF-LaCG over non-accelerated algorithms, both in terms of iteration count and wall-clock time.« less
  5. In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of N agents, who can only communicate their local data with neighbors described by a connected graph G. Each agent makes a sequence of decisions on selecting an arm from M candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution withmore »which agents will never share their local observations with a central entity, and will be allowed to only share a private copy of his/her own information with their neighbors. We first propose a decentralized bandit algorithm \textttGossip\_UCB, which is a coupling of variants of both the classical gossiping algorithm and the celebrated Upper Confidence Bound (UCB) bandit algorithm. We show that \textttGossip\_UCB successfully adapts local bandit learning into a global gossiping process for sharing information among connected agents, and achieves guaranteed regret at the order of O(\max\ \textttpoly (N,M) łog T, \textttpoly (N,M)łog_łambda_2^-1 N\ ) for all N agents, where łambda_2\in(0,1) is the second largest eigenvalue of the expected gossip matrix, which is a function of G. We then propose \textttFed\_UCB, a differentially private version of \textttGossip\_UCB, in which the agents preserve ε-differential privacy of their local data while achieving O(\max \\frac\textttpoly (N,M) ε łog^2.5 T, \textttpoly (N,M) (łog_łambda_2^-1 N + łog T) \ ) regret.« less