Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Many applications, e.g., federated learning, require the aggregation of information across a large number of distributed nodes. In this paper, we explore efficient schemes to do this at scale leveraging aggregation at intermediate nodes across overlay trees. Files/updates are split into chunks which are in turn simultaneously aggregated over different trees. For a synchronous setting with homogeneous communications capabilities and deterministic link delays, we develop a delay optimal aggregation schedule. In the asynchronous setting, where delays are stochastic but i.i.d., across links, we show that for an asynchronous implementation of the above schedule, the expected aggregation delay is near-optimal. We then consider the impact that failures in the network have on the resulting Mean Square Error (MSE) for the estimated aggregates and how it can be controlled through the addition of redundancy, reattempts, and optimal failure-aware estimates for the desired aggregate. Based on the analysis of a natural model of failures, we show how to choose parameters to optimize the trade-off between aggregation delay and MSE. We present simulation results exhibiting the above mentioned tradeoffs. We also consider a more general class of correlated failures and demonstrate via simulation the applicability of our techniques in those settings as well.more » « less
-
The recent developments in Federated Learning (FL) focus on optimizing the learning process for data, hardware, and model heterogeneity. However, most approaches assume all devices are stationary, charging, and always connected to the Wi-Fi when training on local data. We argue that when real devices move around, the FL process is negatively impacted and the device energy spent for communication is increased. To mitigate such effects, we propose a dynamic community selection algorithm which improves the communication energy efficiency and two new aggregation strategies that boost the learning performance in Hierarchical FL (HFL). For real mobility traces, we show that compared to state-of-the-art HFL solutions, our approach is scalable, achieves better accuracy on multiple datasets, converges up to 3.88× faster, and is significantly more energy efficient for both IID and non-IID scenarios.more » « less
An official website of the United States government

Full Text Available