Title: ADMM-Based Hessian-Free Federated Meta-Learning for Regularized Edge Learning
To meet the safety and latency requirements of many IoT applications, intelligent decisions must be made locally and immediately at the network edge, which calls for edge intelligence. To facilitate fast edge learning, this work advocates a platform-aided federated meta-learning architecture, in which a set of edge nodes join forces to learn a meta-model (i.e., a model initialization for adaptation to a new learning task) by exploiting the similarity among edge nodes as well as knowledge transfer from the cloud. The federated meta-learning problem is cast as a regularized stochastic optimization problem, using the Bregman divergence between the edge model and the cloud pre-trained model as the regularizer. We then devise an alternating direction method of multipliers (ADMM) based Hessian-free federated meta-learning algorithm, called ADMM-FedMeta, with inexact Hessian estimation. Further, we analyze the convergence properties and the rapid adaptation performance of ADMM-FedMeta for the general non-convex case. The theoretical results show that, under mild conditions, ADMM-FedMeta converges to an $$\epsilon$$-approximate first-order stationary point after at most $$\mathcal{O}(1/\epsilon^2)$$ communication rounds. Extensive experimental studies on benchmark datasets demonstrate the effectiveness and efficiency of ADMM-FedMeta and show that it outperforms existing baselines.
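The two building blocks named in the abstract, a Bregman-divergence regularizer and a Hessian-free approximation, can be illustrated with a minimal sketch. This is not the ADMM-FedMeta algorithm itself: the function names, the choice of the squared Euclidean norm as the generating function, and the central-difference Hessian-vector product are all assumptions made for illustration, since the record does not specify them.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad_phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

def hvp_finite_difference(grad_f, theta, v, delta=1e-3):
    """Approximate the Hessian-vector product H(theta) @ v from gradients only,
    using a central difference (a generic Hessian-free technique, assumed here)."""
    return (grad_f(theta + delta * v) - grad_f(theta - delta * v)) / (2.0 * delta)

# Assumed generating function: phi(w) = 0.5 * ||w||^2, for which the Bregman
# divergence reduces to half the squared Euclidean distance.
sq_norm = lambda w: 0.5 * np.dot(w, w)
sq_norm_grad = lambda w: w

# Hypothetical edge and cloud model parameters.
edge_model = np.array([1.0, -2.0, 0.5])
cloud_model = np.array([0.5, -1.0, 0.0])
reg = bregman_divergence(sq_norm, sq_norm_grad, edge_model, cloud_model)

# Toy quadratic loss f(w) = 0.5 * w^T A w, whose gradient is A @ w and whose
# Hessian is A, so the approximation can be checked against A @ v.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
grad_quadratic = lambda w: A @ w
theta = np.array([0.3, -0.7])
v = np.array([1.0, 2.0])
hvp = hvp_finite_difference(grad_quadratic, theta, v)
```

Other generating functions (for example, negative entropy on the probability simplex, which yields the KL divergence) plug into the same `bregman_divergence` helper without changes.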
Award ID(s):
2203238
PAR ID:
10352180
Journal Name:
MobiHoc '21: Proceedings of the Twenty-second International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing
Page Range / eLocation ID:
91-100
Sponsoring Org:
National Science Foundation
More Like this
  1. Oh, A; Naumann, T; Globerson, A; Saenko, K; Hardt, M; Levine, S (Ed.)
    Federated reinforcement learning (FedRL) enables agents to collaboratively train a global policy without sharing their individual data. However, high communication overhead remains a critical bottleneck, particularly for natural policy gradient (NPG) methods, which are second-order. To address this issue, we propose the FedNPG-ADMM framework, which leverages the alternating direction method of multipliers (ADMM) to approximate global NPG directions efficiently. We theoretically demonstrate that using ADMM-based gradient updates reduces communication complexity from $$O(d^2)$$ to $$O(d)$$ at each iteration, where $$d$$ is the number of model parameters. Furthermore, we show that achieving an $$\epsilon$$-error stationary convergence requires $$O(\frac{1}{(1-\gamma)^2-\epsilon})$$ iterations for discount factor $$\gamma$$, demonstrating that FedNPG-ADMM maintains the same convergence rate as standard FedNPG. Through evaluation of the proposed algorithms in MuJoCo environments, we demonstrate that FedNPG-ADMM maintains the reward performance of standard FedNPG, and that its convergence rate improves when the number of federated agents increases.
  2. As edge computing complements the cloud to enable computational services right at the network edge, federated learning (FL) can also benefit from nearby edge computing infrastructure. However, most prior work on federated edge learning (FEL) focuses on a single shared global model during federated training in edge systems. In a real edge computing scenario, multiple FL models that are owned by different entities and used by different applications may coexist. Training these models simultaneously creates competition for both computing and networking resources in the shared edge system. Therefore, in this work, we consider a multi-model federated edge learning setting in which multiple FEL models are trained in the edge network and edge servers can act as either parameter servers or workers for these FEL models. We formulate a joint participant selection and learning scheduling problem, which is a non-linear mixed-integer program, aiming to minimize the total cost of all FEL models while satisfying the desired convergence rate of the trained FEL models and the edge resource constraints. We then design several algorithms by decoupling the original problem into two or three sub-problems that can be solved separately and iteratively. Extensive simulations with real-world training datasets and FEL models show that our proposed algorithms can efficiently reduce the average total cost of all FEL models in a multi-model FEL setting compared with existing algorithms.
  3. Federated learning (FL) has recently emerged as a new distributed machine learning paradigm. Although FL can protect the data privacy of participants by keeping their training data on local devices, recent works raise new privacy concerns, especially when the workers or the parameter server of FL are untrustworthy or malicious. One effective way to address the problem is hierarchical federated learning (HFL), in which a few middle-layer aggregators (also called group leaders) aggregate local model updates from workers and send group model updates to the parameter server. In this paper, we consider the participant selection problem of HFL in an edge cloud with multiple FL models, where each model needs to select one parameter server, a few group leaders, and a certain number of workers from the edge servers to jointly perform HFL. We first formulate this problem as a non-linear integer program, aiming to minimize the total learning cost of all models while satisfying the edge resource constraints. We then design a three-stage algorithm by decoupling the original problem into three sub-problems and solving them iteratively. Simulations with real-world datasets and FL models confirm that our proposed algorithm can efficiently reduce the average total learning cost in an edge cloud compared with existing methods.
  4. There has recently been an increasing interest in computationally-efficient learning methods for resource-constrained applications, e.g., pruning, quantization and channel gating. In this work, we advocate a holistic approach to jointly train the backbone network and the channel gating which can speed up subnet selection for a new task at the resource-limited node. In particular, we develop a federated meta-learning algorithm to jointly train good meta-initializations for both the backbone networks and gating modules, by leveraging the model similarity across learning tasks on different nodes. In this way, the learnt meta-gating module effectively captures the important filters of a good meta-backbone network, and a task-specific conditional channel gated network can be quickly adapted from the meta-initializations using data samples of the new task. The convergence of the proposed federated meta-learning algorithm is established under mild conditions. Experimental results corroborate the effectiveness of our method in comparison to related work. 
  5. The growing demand for industrial, automotive and service robots presents a challenge to the centralized Cloud Robotics model in terms of privacy, security, latency, bandwidth, and reliability. In this paper, we present a ‘Fog Robotics’ approach to deep robot learning that distributes compute, storage and networking resources between the Cloud and the Edge in a federated manner. Deep models are trained on non-private (public) synthetic images in the Cloud; the models are then adapted to the private real images of the environment at the Edge within a trusted network and subsequently deployed as a service for low-latency and secure inference/prediction for other robots in the network. We apply this approach to surface decluttering, where a mobile robot picks and sorts objects from a cluttered floor by learning a deep object recognition model and a grasp planning model. Experiments suggest that Fog Robotics can improve performance by sim-to-real domain adaptation in comparison to exclusively using Cloud or Edge resources, while reducing the inference cycle time by 4× to successfully declutter 86% of objects over 213 attempts.