Standard federated optimization methods successfully apply to stochastic problems with single-level structure. However, many contemporary ML problems, including adversarial robustness, hyperparameter tuning, and actor-critic, fall under nested bilevel programming, which subsumes minimax and compositional optimization. In this work, we propose FEDNEST, a federated alternating stochastic gradient method to address general nested problems. We establish provable convergence rates for FEDNEST in the presence of heterogeneous data and introduce variations for bilevel, minimax, and compositional optimization. FEDNEST introduces multiple innovations, including federated hypergradient computation and variance reduction, to address inner-level heterogeneity. We complement our theory with experiments on hyperparameter and hyper-representation learning and on minimax optimization that demonstrate the benefits of our method in practice.
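For orientation, the hypergradient mentioned above can be written out for a generic bilevel problem. The notation below (outer objective f, inner objective g, client objectives f_i and g_i over m clients) is an assumption made for illustration and is not taken from the paper; the formula is the standard implicit-differentiation identity, valid when g is strongly convex in y.

```latex
\min_{x} \; F(x) = f\bigl(x, y^*(x)\bigr)
\quad \text{s.t.} \quad
y^*(x) = \arg\min_{y} \; g(x, y),
\qquad
f = \frac{1}{m}\sum_{i=1}^{m} f_i, \quad
g = \frac{1}{m}\sum_{i=1}^{m} g_i .

\nabla F(x)
  = \nabla_x f\bigl(x, y^*(x)\bigr)
  - \nabla^2_{xy} g\bigl(x, y^*(x)\bigr)
    \bigl[\nabla^2_{yy} g\bigl(x, y^*(x)\bigr)\bigr]^{-1}
    \nabla_y f\bigl(x, y^*(x)\bigr).
```

Because the inverse of the averaged Hessian is not the average of the per-client inverses, this quantity cannot be obtained by simply averaging locally computed hypergradients, which is one way to see why a dedicated federated hypergradient computation is needed.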
                    
                            
                            Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing
                        
                    
    
            Tuning hyperparameters is a crucial but arduous part of the machine learning pipeline. Hyperparameter optimization is even more challenging in federated learning, where models are learned over a distributed network of heterogeneous devices; here, the need to keep data on device and perform local training makes it difficult to efficiently train and evaluate configurations. In this work, we investigate the problem of federated hyperparameter tuning. We first identify key challenges and show how standard approaches may be adapted to form baselines for the federated setting. Then, by making a novel connection to the neural architecture search technique of weight-sharing, we introduce a new method, FedEx, to accelerate federated hyperparameter tuning that is applicable to widely used federated optimization methods such as FedAvg and recent variants. Theoretically, we show that a FedEx variant correctly tunes the on-device learning rate in the setting of online convex optimization across devices. Empirically, we show that FedEx can outperform natural baselines for federated hyperparameter tuning by several percentage points on the Shakespeare, FEMNIST, and CIFAR-10 benchmarks, obtaining higher accuracy using the same training budget.
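As a rough illustration of the kind of baseline described above, the sketch below runs random search over the on-device learning rate, with one full FedAvg run per configuration. It is not FedEx and is not taken from the paper: the one-dimensional quadratic client losses, the function names, the search range, and all constants are hypothetical choices made only to keep the example self-contained.

```python
import random


def local_sgd(w, target, lr, steps):
    """Run `steps` gradient steps on the toy client loss 0.5 * (w - target)**2."""
    for _ in range(steps):
        w -= lr * (w - target)
    return w


def fedavg(targets, lr, rounds=20, local_steps=5, w0=0.0):
    """Plain FedAvg: each client trains locally, then the server averages."""
    w = w0
    for _ in range(rounds):
        w = sum(local_sgd(w, t, lr, local_steps) for t in targets) / len(targets)
    return w


def global_loss(w, targets):
    """Average of the client losses at the shared model w."""
    return sum(0.5 * (w - t) ** 2 for t in targets) / len(targets)


def random_search(targets, num_configs=8, seed=0):
    """Baseline tuner: sample learning rates, run FedAvg for each, keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(num_configs):
        lr = 10 ** rng.uniform(-3, 0)  # log-uniform sample from [1e-3, 1)
        loss = global_loss(fedavg(targets, lr), targets)
        if best is None or loss < best[1]:
            best = (lr, loss)
    return best


# Hypothetical heterogeneous clients: each one prefers a different optimum.
targets = [-1.0, 0.5, 2.0, 3.5]
best_lr, best_loss = random_search(targets)
print(f"best lr = {best_lr:.4f}, final loss = {best_loss:.4f}")
```

Every sampled configuration costs a complete federated training run here, which is the expense that makes tuning difficult in the federated setting.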
        
    
    
- PAR ID: 10311644
- Date Published:
- Journal Name: Advances in neural information processing systems
- ISSN: 1049-5258
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Adversarial attacks pose significant challenges in many machine learning applications, particularly in the setting of distributed training and federated learning, where malicious agents seek to corrupt the training process with the goal of jeopardizing and compromising the performance and reliability of the final models. In this paper, we address the problem of robust federated learning in the presence of such attacks by formulating the training task as a bi-level optimization problem. We conduct a theoretical analysis of the resilience of consensus-based bi-level optimization (CB2O), an interacting multi-particle metaheuristic optimization method, in adversarial settings. Specifically, we provide a global convergence analysis of CB2O in mean-field law in the presence of malicious agents, demonstrating the robustness of CB2O against a diverse range of attacks. Thereby, we offer insights into how specific hyperparameter choices help mitigate adversarial effects. On the practical side, we extend CB2O to the clustered federated learning setting by proposing FedCB2O, a novel interacting multi-particle system, and design a practical algorithm that addresses the demands of real-world applications. Extensive experiments demonstrate the robustness of the FedCB2O algorithm against label-flipping attacks in decentralized clustered federated learning scenarios, showcasing its effectiveness in practical contexts. This article is part of the theme issue ‘Partial differential equations in data science’.
- We consider the problem of predicting cellular network performance (signal maps) from measurements collected by several mobile devices. We formulate the problem within the online federated learning framework: (i) federated learning (FL) enables users to collaboratively train a model, while keeping their training data on their devices; (ii) measurements are collected as users move around over time and are used for local training in an online fashion. We consider an honest-but-curious server, who observes the updates from target users participating in FL and infers their location using a deep leakage from gradients (DLG) type of attack, originally developed to reconstruct training data of DNN image classifiers. We make the key observation that a DLG attack, applied to our setting, infers the average location of a batch of local data, and can thus be used to reconstruct the target users' trajectory at a coarse granularity. We build on this observation to protect location privacy, in our setting, by revisiting and designing mechanisms within the federated learning framework including: tuning the FL parameters for averaging, curating local batches so as to mislead the DLG attacker, and aggregating across multiple users with different trajectories. We evaluate the performance of our algorithms through both analysis and simulation based on real-world mobile datasets, and we show that they achieve a good privacy-utility tradeoff.
- Federated bilevel optimization (FBO) has garnered significant attention lately, driven by its promising applications in meta-learning and hyperparameter optimization. Existing algorithms generally aim to approximate the gradient of the upper-level objective function (hypergradient) in the federated setting. However, because of the nonlinearity of the hypergradient and client drift, they often involve complicated computations. These computations, like multiple optimization sub-loops and second-order derivative evaluations, end up with significant memory consumption and high computational costs. In this paper, we propose a computationally and memory-efficient FBO algorithm named MemFBO. MemFBO features a fully single-loop structure with all involved variables updated simultaneously, and uses only first-order gradient information for all local updates. We show that MemFBO exhibits a linear convergence speedup with milder assumptions in both partial and full client participation scenarios. We further implement MemFBO in a novel FBO application for federated data cleaning. Our experiments, conducted on this application and federated hyper-representation, demonstrate the effectiveness of the proposed algorithm.
 An official website of the United States government