Title: Federated Generative Model on Multi-Source Heterogeneous Data in IoT
The study of generative models is a promising branch of deep learning that has been successfully applied in different scenarios, such as artificial intelligence and the Internet of Things (IoT). In most existing works, however, generative models are realized as a centralized structure, which raises security and privacy threats and imposes heavy communication costs. Few efforts have been devoted to investigating distributed generative models, especially when the training data comes from multiple heterogeneous sources under realistic IoT settings. In this paper, to handle this challenging problem, we design a federated generative model framework that can learn a powerful generator for hierarchical IoT systems. In particular, our framework can solve the problem of distributed data generation on multi-source heterogeneous data in two scenarios: a feature-related scenario and a label-related scenario. In addition, we develop both synchronous and asynchronous updating methods for our federated generative models to satisfy different application requirements. Extensive experiments on a simulated dataset and multiple real datasets are conducted to evaluate the data generation performance of our proposed generative models through comparison with state-of-the-art methods.
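As a rough illustration of the difference between synchronous and asynchronous updating in a federated setting, the sketch below contrasts two generic server-side rules. It is not the method from the paper, which the abstract does not specify: the function names, the sample-count weighting, and the `mixing` factor are all assumptions chosen for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def synchronous_update(client_params: Sequence[Vector],
                       client_sizes: Sequence[int]) -> Vector:
    """Hypothetical synchronous rule: wait for every client, then take the
    average of their parameters weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def asynchronous_update(global_params: Vector,
                        client_params: Vector,
                        mixing: float = 0.5) -> Vector:
    """Hypothetical asynchronous rule: fold one client's parameters into the
    global model as soon as they arrive, without waiting for the others.
    `mixing` (an assumed hyperparameter) controls how far the global model
    moves toward the arriving client."""
    return [(1.0 - mixing) * g + mixing * c
            for g, c in zip(global_params, client_params)]


if __name__ == "__main__":
    # Two clients holding 1 and 3 samples respectively.
    print(synchronous_update([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
    # One client reports in; the server moves a quarter of the way toward it.
    print(asynchronous_update([0.0, 0.0], [2.0, 4.0], mixing=0.25))
```

The synchronous rule gives every round a consistent view of all clients at the cost of waiting for the slowest one, while the asynchronous rule trades that consistency for lower latency, which is one reason a framework might offer both.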
Award ID(s):
2011845
PAR ID:
10525205
Author(s) / Creator(s):
; ;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
Proceedings of the AAAI Conference on Artificial Intelligence
Volume:
37
Issue:
9
ISSN:
2159-5399
Page Range / eLocation ID:
10537 to 10545
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Federated learning is a novel paradigm that allows a global machine-learning model to be trained on distributed devices. It shares model parameters instead of private raw data throughout the training process. While federated learning enables machine learning to take place collaboratively on Internet of Things (IoT) devices, IoT devices with limited resource budgets typically have less security protection than data centers and are more vulnerable to potential thermal stress. Current research on the evaluation of federated learning is mainly based on simulating multiple clients or processes on a single machine or device. However, there is a gap in understanding the performance of federated learning under thermal stress on real-world distributed, low-power, heterogeneous IoT devices. Our previous work was among the first to evaluate the performance of federated learning under thermal stress on real-world IoT-based distributed systems. In this paper, we extend that work to a larger scale of heterogeneous real-world IoT-based distributed systems. To the best of our knowledge, the presented work is among the first to evaluate the performance of federated learning under thermal stress on real-world heterogeneous IoT-based systems. We conducted comprehensive experiments using the MNIST dataset and various performance metrics, including training time, CPU and GPU utilization rate, temperature, and power consumption. We varied the proportion of clients under thermal stress in each group of experiments and systematically quantified the effectiveness and real-world impact of thermal stress on the low-end heterogeneous IoT-based federated learning system. We added 67% more training epochs and 50% more clients compared with our previous work.
The experimental results demonstrate that thermal stress remains effective against IoT-based federated learning systems: both the global model and device performance degrade even when only a small proportion of IoT devices is affected. The results also show that the more strongly a client is affected by thermal stress, the greater its impact tends to be on the performance of the federated learning system (FLS).
  2. The use of IoT devices has increased significantly in recent years, but concerns have grown about the security and privacy issues associated with them. A recent trend is to use deep network models to classify attack and benign traffic. A traditional approach is to train the models using centrally stored data collected from all the devices in the network. However, this framework raises concerns around data privacy and security: attacks on the central server can compromise the data and expose sensitive information. To address these issues, federated learning is now a widely studied solution in the research community. In this paper, we explore and implement federated learning techniques to detect attack traffic in the IoT network. We use Deep Neural Networks on the labeled dataset and an Autoencoder on the unlabeled dataset in a federated framework. We implement different model aggregation algorithms for federated learning, such as FedSGD, FedAvg, and FedProx. We compare the performance of these federated learning models with models in a centralized framework and study which aggregation algorithm for the global model yields the best performance for detecting attack traffic in the IoT network.
  3. Federated Learning (FL) has emerged as an effective paradigm for distributed learning systems owing to its strong potential in exploiting underlying data characteristics while preserving data privacy. In many Internet-of-Things (IoT) applications over wireless networks, however, data is heterogeneous across FL clients, and existing FL frameworks still face challenges in capturing the overall feature properties of local client data that often exhibit disparate distributions. One approach is to integrate generative adversarial networks (GANs) into FL to address data heterogeneity by regenerating anonymous training data without exposing original client data to possible eavesdropping. Despite some successes, existing GAN-based FL frameworks still incur high communication costs and raise other privacy concerns, limiting their practical applications. To this end, this work proposes a novel FL framework that applies only partial GAN model sharing. This new PS-FedGAN framework effectively addresses heterogeneous data distributions across clients and strengthens privacy preservation at reduced communication cost, especially over wireless networks. Our analysis demonstrates the convergence and privacy benefits of the proposed PS-FedGAN framework. Through experimental results based on several well-known benchmark datasets, our proposed PS-FedGAN demonstrates strong potential to tackle FL under heterogeneous (non-IID) client data distributions, while improving data privacy and lowering communication overhead.
  4. Camps-Valls, Gustau; Ruiz, Francisco J.; Valera, Isabel (Ed.)
    The linear contextual bandit is a popular online learning problem. It has mostly been studied in centralized learning settings. With the surging demand for large-scale decentralized model learning, e.g., federated learning, how to retain regret minimization while reducing communication cost has become an open challenge. In this paper, we study the linear contextual bandit in a federated learning setting. We propose a general framework with asynchronous model update and communication for a collection of homogeneous clients and heterogeneous clients, respectively. We provide a rigorous theoretical analysis of the regret and communication cost under this distributed learning framework, and extensive empirical evaluations demonstrate the effectiveness of our solution.
  5. Aggregating person-level data across multiple clinical study sites is often constrained by privacy regulations, necessitating the development of decentralized modeling approaches in biomedical research. To address this requirement, a federated nonlinear regression algorithm based on the Choquet integral has been introduced for outcome prediction. This approach avoids reliance on prior statistical assumptions about data distribution and captures feature interactions, reflecting the non-additive nature of biomedical data characteristics. This work represents the first theoretical application of Choquet integral regression to multisite longitudinal trial data within a federated learning framework. The Multiple Imputation Choquet Integral Regression with LASSO (MIChoquet-LASSO) algorithm is specifically designed to reduce overfitting and enable variable selection in federated learning settings. Its performance has been evaluated using synthetic datasets, publicly available biomedical datasets, and proprietary longitudinal randomized controlled trial data. Comparative evaluations were conducted against benchmark methods, including ordinary least squares (OLS) regression and Choquet-OLS regression, under various scenarios such as model misspecification and both linear and nonlinear data structures, in non-federated and federated contexts. Mean squared error was used as the primary performance metric. Results indicate that MIChoquet-LASSO outperforms the compared models in handling nonlinear longitudinal data with missing values, particularly in scenarios prone to overfitting. In federated settings, Choquet-OLS underperforms, whereas the federated variant of the model, FEDMIChoquet-LASSO, demonstrates consistently better performance.
These findings suggest that FEDMIChoquet-LASSO offers a reliable solution for outcome prediction in multisite longitudinal trials, addressing challenges such as missing values, nonlinear relationships, and privacy constraints while maintaining strong performance within the federated learning framework. 