MetaGater: Fast Learning of Conditional Channel Gated Networks via Federated Meta-Learning
There has recently been an increasing interest in computationally-efficient learning methods for resource-constrained applications, e.g., pruning, quantization and channel gating. In this work, we advocate a holistic approach to jointly train the backbone network and the channel gating which can speed up subnet selection for a new task at the resource-limited node. In particular, we develop a federated meta-learning algorithm to jointly train good meta-initializations for both the backbone networks and gating modules, by leveraging the model similarity across learning tasks on different nodes. In this way, the learnt meta-gating module effectively captures the important filters of a good meta-backbone network, and a task-specific conditional channel gated network can be quickly adapted from the meta-initializations using data samples of the new task. The convergence of the proposed federated meta-learning algorithm is established under mild conditions. Experimental results corroborate the effectiveness of our method in comparison to related work.
Authors:
; ; ; ;
Award ID(s):
Publication Date:
NSF-PAR ID:
10352357
Journal Name:
2021 IEEE 18th International Conference on Mobile Ad Hoc and Smart Systems (MASS)
Page Range or eLocation-ID:
164 to 172
2. In order to meet the requirements for safety and latency in many IoT applications, intelligent decisions must be made right here right now at the network edge, calling for edge intelligence. To facilitate fast edge learning, this work advocates a platform-aided federated meta-learning architecture, where a set of edge nodes joint force to learn a meta-model (i.e., model initialization for adaptation in a new learning task) by exploiting the similarity among edge nodes as well as the cloud knowledge transfer. The federated meta-learning problem is cast as a regularized stochastic optimization problem, using Bregman Divergence between the edge model and the cloud pre-trained model as the regularization. We then devise an alternating direction method of multiplier (ADMM) based Hessian-free federated meta-learning algorithm, called ADMM-FedMeta, with inexact Hessian estimation. Further, we analyze the convergence properties and the rapid adaptation performance of ADMM-FedMeta for the general non-convex case. The theoretical results show that under mild conditions, ADMM-FedMeta converges to an $\epsilon$-approximate first-order stationary point after at most $\mathcal{O}(1/\epsilon^2)$ communication rounds. Extensive experimental studies on benchmark datasets demonstrate the effectiveness and efficiency of ADMM-FedMeta, and showcase that ADMM-FedMeta outperforms the existing baselines.