This content will become publicly available on September 9, 2025

Title: Efficient federated learning for distributed neuroimaging data
Recent advancements in neuroimaging have led to greater data sharing among the scientific community. However, institutions frequently maintain control over their data, citing concerns related to research culture, privacy, and accountability. This creates a demand for innovative tools capable of analyzing amalgamated datasets without the need to transfer actual data between entities. To address this challenge, we propose a decentralized sparse federated learning (FL) strategy. This approach emphasizes local training of sparse models to facilitate efficient communication within such frameworks. By capitalizing on model sparsity and selectively sharing parameters between client sites during the training phase, our method significantly lowers communication overheads. This advantage becomes increasingly pronounced when dealing with larger models and accommodating the diverse resource capabilities of various sites. We demonstrate the effectiveness of our approach through application to the Adolescent Brain Cognitive Development (ABCD) dataset.
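To make the communication-saving idea concrete, here is a minimal sketch of one round of sparse federated averaging. It is not the paper's implementation: the `top_k_sparsify` helper, the top-k magnitude selection rule, the fixed learning rate, and the plain averaging step are all illustrative assumptions.

```python
import numpy as np

def top_k_sparsify(update, k_frac=0.1):
    """Zero out all but the largest-magnitude k_frac of entries.

    Hypothetical helper: each client transmits only a small fraction
    of its parameter update per round, cutting communication cost.
    """
    flat = update.ravel()
    k = max(1, int(k_frac * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    sparse = np.zeros_like(flat)
    sparse[idx] = flat[idx]
    return sparse.reshape(update.shape)

def federated_round(global_weights, client_updates, k_frac=0.1, lr=0.01):
    """One communication round: clients share only sparse updates,
    and the server applies their average (FedAvg-style aggregation)."""
    sparse = [top_k_sparsify(u, k_frac) for u in client_updates]
    return global_weights - lr * np.mean(sparse, axis=0)
```

With k_frac = 0.1, each client sends roughly a tenth of its parameters per round; the savings grow with model size, which matches the abstract's point about larger models and heterogeneous site resources.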
Award ID(s):
2112455
PAR ID:
10569591
Author(s) / Creator(s):
Publisher / Repository:
Frontiers
Date Published:
Journal Name:
Frontiers in Neuroinformatics
Volume:
18
ISSN:
1662-5196
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Fully decentralized model training for on-road vehicles can leverage crowdsourced data while not depending on central servers, infrastructure, or Internet coverage. However, under unreliable wireless communication and short contact durations, model sharing among peer vehicles may suffer severe losses and thus fail frequently. To address these challenges, we propose "RoADTrain", a route-assisted decentralized peer model training approach that carefully chooses vehicles with high chances of successful model sharing. It bounds the per-round communication time yet retains model performance under vehicle mobility and unreliable communication. Based on shared route information, a connected cluster of vehicles can estimate link reliability and contact duration and embed that information into the communication topology. We decompose the topology into subgraphs supporting parallel communication, and identify a subset of them with the highest algebraic connectivity, which maximizes the speed of information flow in the cluster under successful model sharing, thus accelerating model training. We conduct extensive evaluation on driving decision-making models using the popular CARLA simulator. RoADTrain achieves comparable driving success rates and 1.2–4.5× faster convergence than representative decentralized learning methods that assume model sharing always succeeds (e.g., SGP), and significantly outperforms other loss-aware benchmarks by 17–27% in the hardest driving conditions. These results demonstrate that route sharing enables shrewd selection of vehicles for model sharing, yielding better model performance and faster convergence despite wireless losses and mobility.
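As a rough illustration of the algebraic-connectivity criterion this abstract mentions, the sketch below scores candidate subgraphs by their Fiedler value (the second-smallest eigenvalue of the graph Laplacian) and picks the best one. The link-reliability weighting and the subgraph decomposition itself are omitted; adjacency matrices are assumed to be given.

```python
import numpy as np

def algebraic_connectivity(adj):
    """Fiedler value: second-smallest eigenvalue of the graph Laplacian.
    Larger values mean faster information flow through the cluster."""
    laplacian = np.diag(adj.sum(axis=1)) - adj
    return np.linalg.eigvalsh(laplacian)[1]  # eigenvalues sorted ascending

def pick_subgraph(candidate_adjs):
    """Among candidate subgraphs (adjacency matrices), choose the one
    with the highest algebraic connectivity."""
    return max(candidate_adjs, key=algebraic_connectivity)
```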
  2. Modern machine learning frameworks can train neural networks using multiple nodes in parallel, each computing parameter updates with stochastic gradient descent (SGD) and sharing them asynchronously through a central parameter server. Due to communication overhead and bottlenecks, the total throughput of SGD updates in a cluster scales sublinearly, saturating as the number of nodes increases. In this paper, we present a method for predicting training throughput from profiling traces collected on a single-node configuration. Our approach models the interaction of multiple nodes and the scheduling of concurrent transmissions between the parameter server and each node. By accounting for the dependencies between received parts and pending computations, we predict overlaps between computation and communication and generate synthetic execution traces for configurations with multiple nodes. We validate our approach on TensorFlow training jobs for popular image classification neural networks, on AWS and on our in-house cluster, using nodes equipped with GPUs or only with CPUs. We also investigate the effects of data transmission policies used in TensorFlow and the accuracy of our approach when combined with optimizations of the transmission schedule.
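The saturation behavior can be made concrete with a deliberately simplified cost model. Nothing here comes from the paper: the equal-bandwidth-sharing assumption and the max(compute, communication) overlap rule are illustrative placeholders for the trace-driven simulation the authors actually describe.

```python
def predict_step_time(compute_times, transfer_times, n_nodes):
    """Toy per-step time estimate for parameter-server SGD.

    compute_times, transfer_times: per-tensor seconds from a single-node
    profile. Assumes the server's bandwidth is shared equally, so each
    transfer slows down by a factor of n_nodes.
    """
    comp = sum(compute_times)
    comm = n_nodes * sum(transfer_times)
    # With perfect overlap, the step takes as long as the slower pipeline;
    # this is why aggregate throughput saturates as n_nodes grows.
    return max(comp, comm)

# Example: aggregate updates/second scale sublinearly once communication
# dominates computation.
for n in (1, 2, 4, 8):
    step = predict_step_time([0.05, 0.03], [0.01, 0.02], n)
    print(n, n / step)
```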
  3. When making predictions over targets in the pharmacological setting, a data-focused approach learns models from a collection of labeled examples. Unfortunately, data sharing is not always possible, and this can result in many different models trained on disparate populations, leading to the natural question of how best to use and combine them when making a new prediction. Previous work has focused on global model selection or ensembling, with the result of a single final model across the feature space. However, machine-learning models perform notoriously poorly on data outside their training domain due to a problem known as covariate shift, so we argue that when ensembling models the weightings for individual instances must reflect their respective domains; in other words, models that are more likely to have seen information on a given instance should receive more attention. We introduce a method for such instance-wise ensembling of models called Synthetic Model Combination (SMC), including a novel representation-learning step for handling sparse high-dimensional domains. We demonstrate the use of SMC on an example with dosing predictions for vancomycin, although we emphasize the applicability of the method to any scenario involving the use of multiple models.
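A condensed reading of the instance-wise ensembling idea: weight each model's prediction by how plausibly the query instance belongs to that model's training domain. The density estimators below are stand-ins (e.g., kernel density estimates); SMC's representation-learning step for sparse high-dimensional domains is not reproduced here.

```python
import numpy as np

def instancewise_ensemble(x, models, density_estimators):
    """Weight each model by the estimated density of x under its
    training distribution, so out-of-domain models get little say.

    models[i](x) -> prediction; density_estimators[i](x) -> density
    proxy for model i's training domain (an assumed interface).
    """
    w = np.array([d(x) for d in density_estimators], dtype=float)
    # Fall back to a uniform mixture if x looks out-of-domain everywhere.
    w = w / w.sum() if w.sum() > 0 else np.full(len(models), 1 / len(models))
    preds = np.array([m(x) for m in models], dtype=float)
    return float(w @ preds)
```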
  4. As a promising distributed machine learning paradigm, Federated Learning (FL) has attracted increasing attention as a way to address data silo problems without compromising user privacy. By adopting the classic one-to-multi training scheme (i.e., FedAvg), where the cloud server dispatches a single global model to multiple involved clients, conventional FL methods achieve collaborative model training without data sharing. However, since one global model cannot always accommodate all the incompatible convergence directions of local models, existing FL approaches suffer from inferior classification accuracy. To address this issue, we present an efficient FL framework named FedCross, which uses a novel multi-to-multi FL training scheme based on our proposed multi-model cross-aggregation approach. Unlike traditional FL methods, in each round of FL training, FedCross uses multiple middleware models to conduct weighted fusion individually. Since the middleware models used by FedCross quickly converge into the same flat valley of the loss landscape, the generated global model generalizes well. Experimental results on various well-known datasets show that, compared with state-of-the-art FL methods, FedCross significantly improves FL accuracy in both IID and non-IID scenarios without causing additional communication overhead.
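The multi-to-multi scheme can be sketched as a cross-fusion step over K middleware models. The convex mixing weight `alpha` and the everyone-with-everyone peer averaging are simplifying assumptions; FedCross's actual collaborator selection and weighting scheme differ.

```python
import numpy as np

def cross_aggregate(middleware, alpha=0.9):
    """One cross-aggregation step over K middleware models (weight vectors).

    Each model keeps most of its own weights and mixes in the average of
    its peers, nudging all K models toward the same flat region of the
    loss landscape before a global model is derived from them.
    """
    k = len(middleware)
    fused = []
    for i, w in enumerate(middleware):
        peer_mean = np.mean([middleware[j] for j in range(k) if j != i], axis=0)
        fused.append(alpha * w + (1 - alpha) * peer_mean)
    return fused
```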
  5. This paper demonstrates the advantages of sharing information about unknown features of covariates across multiple model components in various nonparametric regression problems, including multivariate, heteroskedastic, and semicontinuous responses. We present a methodology that allows information to be shared nonparametrically across model components using Bayesian sum-of-trees models. Our simulation results demonstrate that sharing information across related model components is often very beneficial, particularly in sparse high-dimensional problems in which variable selection must be conducted. We illustrate our methodology by analyzing medical expenditure data from the Medical Expenditure Panel Survey (MEPS). To facilitate the Bayesian nonparametric regression analysis, we develop two novel models for analyzing the MEPS data using Bayesian additive regression trees: a heteroskedastic log-normal hurdle model with a "shrink-toward-homoskedasticity" prior and a gamma hurdle model.
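To make the hurdle construction concrete, here is the log-likelihood of a log-normal hurdle model for a semicontinuous outcome. In the paper, the hurdle probability and the location and scale are modeled nonparametrically with sums of trees; treating them as plain scalars here is purely for illustration.

```python
import numpy as np
from scipy import stats

def lognormal_hurdle_loglik(y, p, mu, sigma):
    """Log-likelihood of a log-normal hurdle model.

    y: semicontinuous outcomes (zeros and positive values).
    p: probability of a nonzero outcome; mu, sigma: log-scale parameters.
    (In the paper these quantities are tree ensembles, not scalars.)
    """
    y = np.asarray(y, dtype=float)
    zero = y == 0
    ll = np.empty_like(y)
    ll[zero] = np.log1p(-p)                     # P(y = 0) = 1 - p
    ll[~zero] = np.log(p) + stats.lognorm.logpdf(
        y[~zero], s=sigma, scale=np.exp(mu)     # y > 0 ~ LogNormal(mu, sigma)
    )
    return float(ll.sum())
```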