NSF PAR Search | NSF Public Access Repository

Dynamic Client Clustering, Bandwidth Allocation, and Workload Optimization for Semi-Synchronous Federated Learning

https://doi.org/10.3390/electronics13234585

Yu, Liangkun; Sun, Xiang; Albelaihi, Rana; Park, Chaeeun; Shao, Sihua (December 2024, Electronics)

Federated Learning (FL) revolutionizes collaborative machine learning among Internet of Things (IoT) devices by enabling them to train models collectively while preserving data privacy. FL algorithms fall into two primary categories: synchronous and asynchronous. While synchronous FL efficiently handles straggler devices, its convergence speed and model accuracy can be compromised. In contrast, asynchronous FL allows all devices to participate but incurs high communication overhead and potential model staleness. To overcome these limitations, the paper introduces a semi-synchronous FL framework that uses client tiering based on computing and communication latencies. Clients in different tiers upload their local models at distinct frequencies, striking a balance between straggler mitigation and communication costs. Building on this, the paper proposes the Dynamic client clustering, bandwidth allocation, and local training for semi-synchronous Federated learning (DecantFed) algorithm, which dynamically optimizes client clustering, bandwidth allocation, and local training workloads to maximize the data sample processing rate in FL. DecantFed also adapts client learning rates according to their tiers, thus addressing the model staleness issue. Extensive simulations using benchmark datasets such as MNIST and CIFAR-10, under both IID and non-IID scenarios, show that DecantFed outperforms FedAvg and FedProx in convergence speed and delivers at least a 28% improvement in model accuracy compared to FedProx.
Free, publicly-accessible full text available December 1, 2025
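
The client tiering described in the abstract above can be illustrated with a minimal sketch. It is not the DecantFed algorithm: the abstract does not give the tiering rule, so the sketch assumes a client's tier is its total latency divided by a per-round deadline, rounded up, and that a tier-k client uploads once every k global rounds. The names `assign_tiers`, `uploaders`, and `deadline` are hypothetical.

```python
import math

def assign_tiers(latencies, deadline):
    """Assumed rule: tier k = ceil(latency / deadline), with a minimum of 1.

    latencies maps a client id to its compute-plus-upload latency,
    expressed in the same time unit as deadline.
    """
    return {cid: max(1, math.ceil(lat / deadline))
            for cid, lat in latencies.items()}

def uploaders(tiers, round_idx):
    """Return clients that upload in this round (rounds start at 1).

    Assumed rule: a tier-k client uploads once every k global rounds.
    """
    return sorted(cid for cid, k in tiers.items() if round_idx % k == 0)

# Illustrative latencies only; these are not values from the paper.
latencies = {"a": 0.8, "b": 1.7, "c": 2.9}
tiers = assign_tiers(latencies, deadline=1.0)

for r in range(1, 7):
    print(r, uploaders(tiers, r))
```

Under these assumptions the fastest client uploads every round while slower clients upload less often, which is the balance between straggler mitigation and communication cost that the abstract describes.
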
Latency-Aware Semi-Synchronous Client Selection and Model Aggregation for Wireless Federated Learning

https://doi.org/10.3390/fi15110352

Yu, Liangkun; Sun, Xiang; Albelaihi, Rana; Yi, Chen (November 2023, Future Internet)

Federated learning (FL) is a collaborative machine-learning (ML) framework particularly suited for ML models requiring numerous training samples, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Random Forest, in the context of various applications, e.g., next-word prediction and eHealth. In FL, clients participate in the training process by uploading their local models to an FL server in each global iteration, and the server aggregates these models to update a global model. The traditional FL process may suffer from the straggler problem, in which slower clients delay the overall training. This paper introduces the Latency-awarE Semi-synchronous client Selection and mOdel aggregation for federated learNing (LESSON) method. LESSON allows clients to participate at different frequencies: faster clients contribute more frequently, thereby mitigating the straggler problem and expediting convergence. Moreover, LESSON provides a tunable trade-off between model accuracy and convergence rate by setting varying deadlines. Simulation results show that LESSON outperforms two baseline methods, namely FedAvg and FedCS, in terms of convergence speed and maintains higher model accuracy compared to FedCS.
Full Text Available
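
The server-side aggregation step mentioned in the abstract above can be sketched as follows. The abstract does not state LESSON's aggregation rule, so this sketch assumes a sample-count weighted average in the style of FedAvg, applied only to the local models that arrived in the current global iteration. The function name `aggregate` and the flat-list model representation are hypothetical.

```python
def aggregate(global_model, updates):
    """Weighted average of the local models received in this iteration.

    global_model: list of floats (the current global parameters).
    updates: list of (local_model, num_samples) pairs from the clients
             that uploaded in this iteration.
    Assumption: FedAvg-style weighting by sample count. If no client
    uploaded, the global model is returned unchanged.
    """
    if not updates:
        return list(global_model)
    total = sum(n for _, n in updates)
    return [
        sum(model[i] * n for model, n in updates) / total
        for i in range(len(global_model))
    ]

# Illustrative values only: two clients with 1 and 3 samples.
new_model = aggregate([0.0, 0.0], [([1.0, 2.0], 1), ([3.0, 4.0], 3)])
print(new_model)
```

In a semi-synchronous setting, the set passed as `updates` would change from one iteration to the next, because faster clients upload more often than slower ones.
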
Backhaul-Aware Drone Base Station Placement and Resource Management for FSO-Based Drone-Assisted Mobile Networks

https://doi.org/10.1109/tnse.2022.3233004

Yu, Liangkun; Sun, Xiang; Shao, Sihua; Chen, Yougan; Albelaihi, Rana (May 2023, IEEE Transactions on Network Science and Engineering)

Full Text Available
Deep Reinforcement Learning Assisted Client Selection in Non-orthogonal Multiple Access based Federated Learning

https://doi.org/10.1109/JIOT.2023.3264463

Albelaihi, Rana; Alasandagutti, Akhil; Yu, Liangkun; Yao, Jingjing; Sun, Xiang (April 2023, IEEE Internet of Things Journal)

Full Text Available
Backhaul-aware Drone Base Station Placement and Resource Management for FSO-based Drone-assisted Mobile Networks

Yu, Liangkun; Sun, Xiang; Shao, Sihua; Chen, Yougan; Albelaihi, Rana (January 2023, IEEE Transactions on Network Science and Engineering)

Full Text Available
Green Federated Learning via Energy-Aware Client Selection

https://doi.org/10.1109/GLOBECOM48099.2022.10001569

Albelaihi, Rana; Yu, Liangkun; Craft, Warren D.; Sun, Xiang; Wang, Chonggang; Gazda, Robert (December 2022, GLOBECOM 2022 - 2022 IEEE Global Communications Conference)

Full Text Available
Adaptive Participant Selection in Heterogeneous Federated Learning

https://doi.org/10.1109/GLOBECOM46510.2021.9685077

Albelaihi, Rana; Sun, Xiang; Craft, Warren D.; Yu, Liangkun; Wang, Chonggang (December 2021, 2021 IEEE Global Communications Conference (GLOBECOM))

Federated learning (FL) is a distributed machine learning technique that addresses the data privacy issue. Participant selection is critical in determining the latency of the training process in a heterogeneous FL architecture, where users with different hardware setups and wireless channel conditions communicate with their base station to participate in the FL training process. Many solutions consider the computational and uploading latency of different users when selecting suitable participants so that the straggler problem can be avoided. However, none of these solutions consider the waiting time of a participant, i.e., the time a participant spends waiting for the wireless channel to become available. The waiting time can significantly affect the latency of the training process, especially when a large number of participants are involved and share the wireless channel in a time-division duplexing manner to upload their local FL models. In this paper, we consider not only the computational and uploading latency but also the waiting time (estimated based on an M/G/1 queueing model) of a participant when selecting suitable participants. We formulate an optimization problem to maximize the number of selected participants who can upload their local models before the deadline in a global iteration. The Latency awarE pARticipant selectioN (LEARN) algorithm is proposed to solve the problem, and the performance of LEARN is validated via simulations.
Full Text Available
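
The role of the waiting time in the abstract above can be illustrated with a minimal sketch. It is not the LEARN algorithm: it only computes the standard mean queueing delay of an M/G/1 queue (the Pollaczek-Khinchine formula) and then keeps the participants whose computation, waiting, and uploading latencies fit within a deadline. The function names, the use of one shared waiting time for all participants, and the numbers are assumptions made for illustration.

```python
def mg1_mean_wait(arrival_rate, mean_service, second_moment_service):
    """Mean waiting time in queue for an M/G/1 system.

    Pollaczek-Khinchine: W_q = lambda * E[S^2] / (2 * (1 - rho)),
    where rho = lambda * E[S] must be below 1 for a stable queue.
    """
    rho = arrival_rate * mean_service
    if rho >= 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    return arrival_rate * second_moment_service / (2 * (1 - rho))

def feasible_participants(clients, wait, deadline):
    """Keep clients whose total latency fits within the deadline.

    clients maps a client id to (compute_latency, upload_latency).
    Assumption: every client experiences the same mean waiting time.
    """
    return sorted(
        cid for cid, (compute, upload) in clients.items()
        if compute + wait + upload <= deadline
    )

# Illustrative values only; these are not taken from the paper.
wait = mg1_mean_wait(arrival_rate=0.5, mean_service=1.0,
                     second_moment_service=2.0)
clients = {"a": (1.0, 0.5), "b": (2.0, 1.5)}
selected = feasible_participants(clients, wait, deadline=3.0)
print(wait, selected)
```

In the paper's setting the waiting time itself depends on how many participants share the channel, so a fixed `wait` as used here is a simplification.
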
