Title: CHESSFL: Clustering Hierarchical Embeddings for Semi-Supervised Federated Learning
Hierarchical Federated Learning (HFL) has shown great promise over the past few years, with significant improvements in communication efficiency and overall performance. However, current research on HFL predominantly centers on supervised learning. This focus becomes problematic when dealing with semi-supervised learning, particularly under non-IID scenarios. To address this gap, our paper critically assesses the performance of straightforward adaptations of current state-of-the-art semi-supervised FL (SSFL) techniques within the HFL framework. We also introduce a novel clustering mechanism for hierarchical embeddings to alleviate the challenges introduced by semi-supervised paradigms in a hierarchical setting. Our approach not only provides superior accuracy, but also converges up to 5.11× faster, while being robust to non-IID data distributions for multiple datasets with negligible communication overhead.
Award ID(s):
2107085
PAR ID:
10547271
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-7025-6
Page Range / eLocation ID:
122 to 133
Subject(s) / Keyword(s):
Hierarchical Federated Learning, Semi-Supervised Learning, Edge Devices, Data Heterogeneity, Data Privacy, Communication Efficiency, Internet-of-Things
Format(s):
Medium: X
Location:
Hong Kong, Hong Kong
Sponsoring Org:
National Science Foundation
More Like this
  1. The recent developments in Federated Learning (FL) focus on optimizing the learning process for data, hardware, and model heterogeneity. However, most approaches assume all devices are stationary, charging, and always connected to the Wi-Fi when training on local data. We argue that when real devices move around, the FL process is negatively impacted and the device energy spent for communication is increased. To mitigate such effects, we propose a dynamic community selection algorithm which improves the communication energy efficiency and two new aggregation strategies that boost the learning performance in Hierarchical FL (HFL). For real mobility traces, we show that compared to state-of-the-art HFL solutions, our approach is scalable, achieves better accuracy on multiple datasets, converges up to 3.88× faster, and is significantly more energy efficient for both IID and non-IID scenarios. 
  2. Federated Learning (FL) revolutionizes collaborative machine learning among Internet of Things (IoT) devices by enabling them to train models collectively while preserving data privacy. FL algorithms fall into two primary categories: synchronous and asynchronous. While synchronous FL efficiently handles straggler devices, its convergence speed and model accuracy can be compromised. In contrast, asynchronous FL allows all devices to participate but incurs high communication overhead and potential model staleness. To overcome these limitations, the paper introduces a semi-synchronous FL framework that uses client tiering based on computing and communication latencies. Clients in different tiers upload their local models at distinct frequencies, striking a balance between straggler mitigation and communication costs. Building on this, the paper proposes the Dynamic client clustering, bandwidth allocation, and local training for semi-synchronous Federated learning (DecantFed) algorithm, which dynamically optimizes client clustering, bandwidth allocation, and local training workloads to maximize data sample processing rates in FL. DecantFed also adapts client learning rates according to their tiers, thus addressing the model staleness issue. Extensive simulations using benchmark datasets like MNIST and CIFAR-10, under both IID and non-IID scenarios, demonstrate DecantFed’s superior performance. It outperforms FedAvg and FedProx in convergence speed and delivers at least a 28% improvement in model accuracy compared to FedProx.
  3. Smart grids integrate advanced information and communication technologies (ICTs) into traditional power grids for more efficient and resilient power delivery and management, but also introduce new security vulnerabilities that adversaries can exploit to launch cyber attacks, causing severe consequences such as massive blackouts and infrastructure damage. Existing machine learning-based methods for detecting cyber attacks in smart grids are mostly based on supervised learning, which needs instances of both normal and attack events for training. In addition, supervised learning requires the training dataset to include representative instances of the various types of attack events in order to train a good model, which is sometimes hard, if not impossible. This paper presents a new method for detecting cyber attacks in smart grids using PMU data, which is based on semi-supervised anomaly detection and deep representation learning. Semi-supervised anomaly detection employs only instances of normal events to train detection models, making it suitable for finding unknown attack events. A number of popular semi-supervised anomaly detection algorithms were investigated in our study using publicly available power system cyber attack datasets to identify the best-performing ones. The performance comparison with popular supervised algorithms demonstrates that semi-supervised algorithms are more capable of finding attack events than supervised algorithms. Our results also show that the performance of semi-supervised anomaly detection algorithms can be further improved by augmenting them with deep representation learning.
  4. Federated learning (FL) has recently emerged as a new distributed machine learning paradigm. Although FL can protect the data privacy of participants by keeping their training data on local devices, recent works raise new privacy concerns, especially when the workers or the parameter server of FL are untrustworthy or malicious. One effective way to solve the problem is to use hierarchical federated learning (HFL), where a few middle-layer aggregators (also called group leaders) aggregate local model updates from workers and send group model updates to the parameter server. In this paper, we consider the participant selection problem of HFL in an edge cloud with multiple FL models, where each model needs to select one parameter server, a few group leaders, and a certain number of workers from edge servers to jointly perform HFL. We first formulate this problem as a non-linear integer program, aiming to minimize the total learning cost of all models while satisfying the edge resource constraints. We then design a three-stage algorithm by decoupling the original problem into three sub-problems and solving them iteratively. Simulations with real-world datasets and FL models confirm that our proposed algorithm can efficiently reduce the average total learning cost in the edge cloud compared with existing methods.
  5. Avidan, S. (Ed.)
    Despite the success of fully-supervised human skeleton sequence modeling, utilizing self-supervised pre-training for skeleton sequence representation learning has been an active field because acquiring task-specific skeleton annotations at large scales is difficult. Recent studies focus on learning video-level temporal and discriminative information using contrastive learning, but overlook the hierarchical spatial-temporal nature of human skeletons. In contrast to such superficial supervision at the video level, we propose a self-supervised hierarchical pre-training scheme incorporated into a hierarchical Transformer-based skeleton sequence encoder (Hi-TRS), to explicitly capture spatial, short-term, and long-term temporal dependencies at frame, clip, and video levels, respectively. To evaluate the proposed self-supervised pre-training scheme with Hi-TRS, we conduct extensive experiments covering three skeleton-based downstream tasks: action recognition, action detection, and motion prediction. Under both supervised and semi-supervised evaluation protocols, our method achieves state-of-the-art performance. Additionally, we demonstrate that the prior knowledge learned by our model in the pre-training stage has strong transfer capability for different downstream tasks.
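
Several of the abstracts on this page rely on hierarchical aggregation, in which middle-layer aggregators (group leaders) combine worker updates before a parameter server combines the group results. The following is only a minimal sketch of that general idea, assuming sample-count-weighted averaging in the style of FedAvg at both levels; it is not the method of any paper listed here, and the names weighted_average and hierarchical_aggregate are hypothetical.

```python
import numpy as np


def weighted_average(models, weights):
    """Average parameter vectors, weighting each one by its sample count."""
    weights = np.asarray(weights, dtype=float)
    stacked = np.stack(models)  # shape: (num_models, num_params)
    return (weights[:, None] * stacked).sum(axis=0) / weights.sum()


def hierarchical_aggregate(groups):
    """Two-level aggregation (illustrative assumption, not a published algorithm).

    groups: list of groups; each group is a list of (params, n_samples) pairs,
    one pair per worker.
    """
    group_models, group_sizes = [], []
    for workers in groups:
        params = [p for p, _ in workers]
        counts = [n for _, n in workers]
        # Level 1: the group leader averages its own workers' parameters.
        group_models.append(weighted_average(params, counts))
        group_sizes.append(sum(counts))
    # Level 2: the parameter server averages the group models.
    return weighted_average(group_models, group_sizes)


# Toy data: two groups, three workers in total.
groups = [
    [(np.array([1.0, 2.0]), 10), (np.array([3.0, 4.0]), 30)],
    [(np.array([5.0, 6.0]), 60)],
]
global_model = hierarchical_aggregate(groups)
```

Because both levels weight by sample count, the result matches a flat weighted average over all workers. In this sketch, the hierarchy changes where the averaging happens and how many messages reach the server, not the averaged model itself.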