Title: Controlling Federated Learning for Covertness
A learner aims to minimize a function f by repeatedly querying a distributed oracle that provides noisy gradient evaluations. At the same time, the learner seeks to hide arg min f from a malicious eavesdropper that observes the learner’s queries. This paper considers the problem of covert or learner-private optimization, where the learner has to dynamically choose between learning and obfuscation by exploiting the stochasticity. The problem of controlling the stochastic gradient algorithm for covert optimization is modeled as a Markov decision process, and we show that the dynamic programming operator has a supermodular structure implying that the optimal policy has a monotone threshold structure. A computationally efficient policy gradient algorithm is proposed to search for the optimal querying policy without knowledge of the transition probabilities. As a practical application, our methods are demonstrated on a hate speech classification task in a federated setting where an eavesdropper can use the optimal weights to generate toxic content, which is more easily misclassified. Numerical results show that when the learner uses the optimal policy, an eavesdropper can only achieve a validation accuracy of 52% with no information and 69% when it has a public dataset with 10% positive samples, compared to 83% when the learner employs a greedy policy.
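To make the learn-or-obfuscate control concrete, here is a minimal Python sketch of a threshold querying policy tuned by policy gradient. The horizon, oracle success probability, obfuscation reward, and sigmoid smoothing are illustrative assumptions, not the paper's model or constants.

```python
import numpy as np

# Toy rendering of the learn-vs-obfuscate control problem: the learner
# either queries a true gradient ("learn") or sends a dummy query
# ("obfuscate"). The sigmoid-smoothed threshold policy is tuned with a
# REINFORCE-style policy-gradient step. All constants are assumptions.

rng = np.random.default_rng(0)
HORIZON = 50             # total queries available
LEARN_GOAL = 20          # informative queries needed to reach arg min f
OBFUSCATION_BONUS = 0.1  # per-step reward for confusing the eavesdropper

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rollout(theta):
    """One episode under the threshold policy; returns reward and score-function sum."""
    progress, reward, score = 0, 0.0, 0.0
    for _ in range(HORIZON):
        state = progress / LEARN_GOAL              # normalized learning progress
        p_learn = sigmoid(10.0 * (theta - state))  # learn while progress is below threshold
        learn = rng.random() < p_learn
        # d/dtheta log pi(action | state) for the chosen action
        score += 10.0 * (1.0 - p_learn) if learn else -10.0 * p_learn
        if learn:
            progress += rng.random() < 0.8         # noisy oracle succeeds w.p. 0.8
        else:
            reward += OBFUSCATION_BONUS
    reward += 1.0 if progress >= LEARN_GOAL else -1.0
    return reward, score

theta = 0.5
for _ in range(2000):            # policy-gradient ascent on the scalar threshold
    r, g = rollout(theta)
    theta += 0.01 * r * g / HORIZON
print(f"tuned threshold: {theta:.2f}")
```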
Award ID(s):
2312198
PAR ID:
10518931
Author(s) / Creator(s):
Publisher / Repository:
https://www.jmlr.org/tmlr/papers/
Date Published:
Journal Name:
Transactions on Machine Learning Research
ISSN:
2835-8856
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This letter studies how a stochastic gradient algorithm (SG) can be controlled to hide the estimate of the local stationary point from an eavesdropper. Such problems are of significant interest in distributed optimization settings like federated learning and inventory management. A learner queries a stochastic oracle and incentivizes the oracle to obtain noisy gradient measurements and perform SG. The oracle probabilistically returns either a noisy gradient of the function or a non-informative measurement, depending on the oracle state and incentive. The learner’s query and incentive are visible to an eavesdropper who wishes to estimate the stationary point. This letter formulates the problem of the learner performing covert optimization by dynamically incentivizing the stochastic oracle and obfuscating the eavesdropper as a finite-horizon Markov decision process (MDP). Using conditions for interval dominance on the cost and transition probability structure, we show that the optimal policy for the MDP has a monotone threshold structure. We propose searching for the optimal stationary policy with the threshold structure using a stochastic approximation algorithm and a multi-armed bandit approach. The effectiveness of our methods is numerically demonstrated on a covert federated learning hate-speech classification task.
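The stochastic-approximation search mentioned in this abstract can be sketched as a simultaneous-perturbation (SPSA-style) tuning of a scalar threshold, which needs only two noisy cost evaluations per iteration and no transition probabilities. The episode cost below is a noisy quadratic stand-in for the true cost of one episode of the oracle/eavesdropper MDP; the gain sequences and constants are assumptions.

```python
import numpy as np

# SPSA-style tuning of a scalar threshold policy. `episode_cost` is a
# placeholder (noisy quadratic with minimizer 0.6) standing in for the
# cost of one MDP episode run under threshold theta.

rng = np.random.default_rng(1)

def episode_cost(theta):
    return (theta - 0.6) ** 2 + 0.05 * rng.standard_normal()

theta = 0.1
for k in range(1, 501):
    a_k = 0.5 / k             # step-size sequence
    c_k = 0.1 / k ** 0.25     # perturbation-size sequence
    delta = rng.choice([-1.0, 1.0])
    g_hat = (episode_cost(theta + c_k * delta)
             - episode_cost(theta - c_k * delta)) / (2.0 * c_k * delta)
    theta -= a_k * g_hat      # stochastic approximation (descent) step
print(f"estimated optimal threshold: {theta:.2f}")
```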
  2.
    We consider the framework of non-stationary stochastic optimization (Besbes et al., 2015) with squared error losses and noisy gradient feedback, where the dynamic regret of an online learner against a time-varying comparator sequence is studied. Motivated by the theory of non-parametric regression, we introduce a new variational constraint that requires the comparator sequence to belong to a discrete k-th order Total Variation ball of radius C_n. This variational constraint models comparators with piecewise-polynomial structure, which has many relevant practical applications (Tibshirani, 2014). By establishing connections to the theory of wavelet-based non-parametric regression, we design a polynomial-time algorithm that achieves the nearly optimal dynamic regret of Õ(n^{1/(2k+3)} C_n^{2/(2k+3)}). The proposed policy is adaptive to the unknown radius C_n. Further, we show that the same policy is minimax optimal for several other non-parametric families of interest.
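For concreteness, the comparator class and the quoted regret rate can be written out as below; the normalization of the k-th order difference operator is one common convention and may differ from the paper's exact definition, so treat this as a plausible reconstruction.

```latex
% k-th order discrete Total Variation ball (one common convention):
\[
  \mathcal{C}(C_n) = \Bigl\{\, \theta_{1:n} \;:\; \sum_{t=k+1}^{n} \bigl|\Delta^{k}\theta_t\bigr| \le C_n \,\Bigr\},
  \qquad
  \Delta^{k}\theta_t = \sum_{j=0}^{k} (-1)^{j} \binom{k}{j}\, \theta_{t-j}.
\]
% Nearly optimal dynamic regret achieved by the proposed policy:
\[
  R_n = \tilde{O}\!\bigl( n^{1/(2k+3)}\, C_n^{2/(2k+3)} \bigr).
\]
```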
  3. Poupart, Pascal (Ed.)
    Incentive design, also known as model design or environment design for Markov decision processes (MDPs), refers to a class of problems in which a leader can incentivize a follower by modifying the follower's reward function, in anticipation that the follower's optimal policy in the resulting MDP will be desirable for the leader's objective. In this work, we propose gradient-ascent algorithms to compute the leader's optimal incentive design despite the lack of knowledge about the follower's reward function. First, we formulate incentive design as a bi-level optimization problem and demonstrate that, by the softmax temporal consistency between the follower's policy and value function, the bi-level problem can be reduced to a single-level one, for which a gradient-based algorithm can be developed to optimize the leader's objective. We establish several key properties of incentive design in MDPs and prove the convergence of the proposed gradient-based method. Next, we show that the gradient terms can be estimated from observations of the follower's best-response policy, enabling the use of a stochastic gradient-ascent algorithm to compute a locally optimal incentive design without knowing or learning the follower's reward function. Finally, we analyze the conditions under which an incentive design remains optimal for two different rewards that are policy-invariant. The effectiveness of the proposed algorithm is demonstrated using a small probabilistic transition system and a stochastic gridworld.
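A minimal single-state sketch of the gradient-ascent idea: the follower best-responds with a softmax (entropy-regularized) policy to its reward plus the leader's incentive, and the leader ascends its own objective using a gradient computed from the observed best-response policy alone. The rewards, temperature, and quadratic incentive cost below are invented for illustration; this is not the paper's bi-level formulation in full.

```python
import numpy as np

# Stateless "MDP" with 3 follower actions. The follower privately prefers
# action 0; the leader wants action 2 played and pays a quadratic cost
# for incentives. Softmax temporal consistency makes the follower's
# best response differentiable in the incentive.

follower_reward = np.array([1.0, 0.2, 0.0])
leader_value = np.array([0.0, 0.0, 1.0])
TAU = 0.5    # softmax temperature (entropy regularization)
COST = 0.1   # leader's cost per unit squared incentive

def best_response(incentive):
    """Follower's softmax policy to reward + incentive."""
    z = (follower_reward + incentive) / TAU
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()

incentive = np.zeros(3)
for _ in range(3000):
    pi = best_response(incentive)  # observable best response
    # Softmax Jacobian: d pi_a / d incentive_b = pi_a (1[a=b] - pi_b) / TAU.
    jac = (np.diag(pi) - np.outer(pi, pi)) / TAU
    # Gradient ascent on E_pi[leader_value] - COST * ||incentive||^2.
    grad = jac @ leader_value - 2.0 * COST * incentive
    incentive += 0.1 * grad
print("incentive:", incentive.round(2),
      "induced policy:", best_response(incentive).round(2))
```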
  4. Over-parametrization is an important technique in training neural networks. In both theory and practice, training a larger network allows the optimization algorithm to avoid bad local optima. In this paper we study a closely related tensor decomposition problem: given an l-th order tensor in (R^d)^{⊗l} of rank r (where r ≪ d), can variants of gradient descent find a rank-m decomposition where m > r? We show that in a lazy training regime (similar to the NTK regime for neural networks) one needs at least m = Ω(d^{l−1}), while a variant of gradient descent can find an approximate tensor when m = O*(r^{2.5l} log d). Our results show that gradient descent on the over-parametrized objective can go beyond the lazy training regime and utilize certain low-rank structure in the data.
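A runnable toy version of the over-parametrized setting: fit m > r symmetric rank-one components to a rank-r third-order tensor by plain gradient descent on the squared error. The dimensions, step size, and symmetric parametrization are assumptions, and this toy run does not reproduce the lazy-versus-rich training distinction the paper analyzes.

```python
import numpy as np

# Toy over-parametrized tensor decomposition (l = 3): the ground truth has
# rank r; the model fits m > r components by gradient descent on
# 0.5 * ||sum_i a_i^{⊗3} - T||_F^2.

rng = np.random.default_rng(2)
d, r, m = 10, 2, 8

def sym3(A):
    """Sum of rank-one terms a_i ⊗ a_i ⊗ a_i over the rows a_i of A."""
    return np.einsum('ia,ib,ic->abc', A, A, A)

U = rng.standard_normal((r, d))
U /= np.linalg.norm(U, axis=1, keepdims=True)  # unit-norm true components
T = sym3(U)                                    # rank-r target tensor

A = 0.1 * rng.standard_normal((m, d))          # small random init, m > r
lr = 0.02
for _ in range(20000):
    R = sym3(A) - T                                       # residual tensor
    grad = 3.0 * np.einsum('abc,ib,ic->ia', R, A, A)      # d loss / d A
    A -= lr * grad
print("relative error:", np.linalg.norm(sym3(A) - T) / np.linalg.norm(T))
```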
  5. We consider the problem of personalizing audio to maximize user experience. Briefly, we aim to find a filter h*, which, applied to any music or speech, will maximize the user’s satisfaction. This is a black-box optimization problem since the user’s satisfaction function is unknown. Substantive work has been done on this topic, where the key idea is to play audio samples to the user, each shaped by a different filter h_i, and query the user for their satisfaction scores f(h_i). A family of “surrogate” functions is then designed to fit these scores, and the optimization method gradually refines these functions to arrive at the filter ĥ* that maximizes satisfaction. In certain applications, we observe that a second type of querying is possible where users can tell us the individual elements h*[j] of the optimal filter h*. Consider an analogy from cooking where the goal is to cook a recipe that maximizes user satisfaction. A user can be asked to score various cooked recipes (e.g., tofu fried rice) or to score individual ingredients (say, salt, sugar, rice, chicken, etc.). Given a budget of B queries, where a query can be of either type, our goal is to find the recipe that will maximize this user’s satisfaction. Our proposal builds on Sparse Gaussian Process Regression (GPR) and shows how a hybrid approach can outperform either type of querying alone. Our results are validated through simulations and real-world experiments, where volunteers gave feedback on music/speech audio and were able to achieve high satisfaction levels. We believe this idea of hybrid querying opens new problems in black-box optimization, and solutions can benefit other applications beyond audio personalization.
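A small sketch of hybrid querying under a budget B, using an off-the-shelf GP surrogate (sklearn's GaussianProcessRegressor rather than the paper's sparse GPR): type-1 queries score a whole filter, type-2 queries reveal one element h*[j], and revealed elements are clamped so the surrogate searches a smaller space. The satisfaction function, query schedule, and UCB acquisition are illustrative assumptions, not the paper's design.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(3)
D, BUDGET = 6, 30
h_star = rng.uniform(-1, 1, D)       # unknown optimal filter

def satisfaction(h):
    """User's (noisy) satisfaction score: peaks at h_star."""
    return -np.sum((h - h_star) ** 2) + 0.01 * rng.standard_normal()

known = {}                           # j -> revealed element h_star[j]
X, y = [], []
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)

for q in range(BUDGET):
    if q % 3 == 2 and len(known) < D:            # every 3rd query: type-2
        j = rng.choice([j for j in range(D) if j not in known])
        known[j] = h_star[j]
        continue
    # type-1 query: propose a filter, clamping already-revealed elements
    cands = rng.uniform(-1, 1, (256, D))
    for j, v in known.items():
        cands[:, j] = v
    if X:
        mu, sd = gp.predict(cands, return_std=True)
        h = cands[np.argmax(mu + 1.0 * sd)]      # UCB acquisition
    else:
        h = cands[0]
    X.append(h); y.append(satisfaction(h))
    gp.fit(np.array(X), np.array(y))

best = X[int(np.argmax(y))]
print("best scored filter error:", np.linalg.norm(best - h_star).round(3))
```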