Solving a bilevel optimization problem is at the core of several machine learning problems such as hyperparameter tuning, data denoising, meta- and few-shot learning, and training-data poisoning. Unlike simultaneous or multi-objective optimization, the steepest descent direction for minimizing the upper-level cost in a bilevel problem requires the inverse of the Hessian of the lower-level cost. In this work, we propose a novel algorithm for solving bilevel optimization problems based on the classical penalty function approach. Our method avoids computing the Hessian inverse and can handle constrained bilevel problems easily. We prove the convergence of the method under mild conditions and show that the exact hypergradient is obtained asymptotically. Our method's simplicity and low space and time complexity enable us to effectively solve large-scale bilevel problems involving deep neural networks. We present results on data denoising, few-shot learning, and training-data poisoning problems in a large-scale setting. Our results show that our approach outperforms or is comparable to previously proposed methods based on automatic differentiation and approximate inversion in terms of accuracy, run-time, and convergence speed.
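A minimal sketch of the penalty idea summarized above, assuming a PyTorch setting with hypothetical `upper_cost` and `lower_cost` callables; it replaces the lower-level optimality condition with a gradient-norm penalty instead of computing any Hessian inverse, and is only illustrative rather than the paper's exact algorithm.

```python
# Hedged sketch of a penalty-style bilevel step (illustrative, not the cited method).
# Idea: instead of inverting the lower-level Hessian, penalize violation of the
# lower-level stationarity condition grad_y G(x, y) = 0 and take a joint gradient step.
import torch

def penalty_bilevel_step(x, y, upper_cost, lower_cost, gamma=10.0, lr=1e-2):
    """One descent step on F(x, y) + (gamma / 2) * ||grad_y G(x, y)||^2."""
    x = x.detach().requires_grad_(True)
    y = y.detach().requires_grad_(True)
    g_y = torch.autograd.grad(lower_cost(x, y), y, create_graph=True)[0]
    penalized = upper_cost(x, y) + 0.5 * gamma * (g_y ** 2).sum()
    grad_x, grad_y = torch.autograd.grad(penalized, (x, y))
    return x.detach() - lr * grad_x, y.detach() - lr * grad_y
```

In penalty methods of this kind, the coefficient `gamma` is typically increased over iterations so that the penalized solution tracks the true lower-level minimizer.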
A bilevel optimization method for inverse mean-field games
Abstract In this paper, we introduce a bilevel optimization framework for addressing inverse mean-field games, alongside an exploration of numerical methods tailored for this bilevel problem. The primary benefit of our bilevel formulation lies in maintaining the convexity of the objective function and the linearity of constraints in the forward problem. Our paper focuses on inverse mean-field games characterized by unknown obstacles and metrics. We show numerical stability for these two types of inverse problems. More importantly, we establish, for the first time, the identifiability of the inverse mean-field game with unknown obstacles via the solution of the resultant bilevel problem. The bilevel approach enables us to employ an alternating gradient-based optimization algorithm with a provable convergence guarantee. To validate the effectiveness of our methods in solving the inverse problems, we have designed comprehensive numerical experiments, providing empirical evidence of their efficacy.
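The abstract mentions an alternating gradient-based algorithm for the bilevel formulation. The following is a generic, hedged sketch of such an alternating scheme for a bilevel problem min_theta J(theta, u) subject to u minimizing E(theta, u); the names `upper_obj` (data misfit) and `lower_obj` (forward problem energy) are assumptions for illustration, and this is not the paper's specific solver.

```python
# Hedged sketch of an alternating gradient scheme for a generic bilevel problem.
# Note: the upper-level step below ignores the implicit dependence of u on theta,
# which simple alternating schemes either accept or correct for separately.
import torch

def alternating_bilevel(theta, u, upper_obj, lower_obj,
                        outer_iters=100, inner_iters=20, lr_u=1e-2, lr_theta=1e-2):
    theta = theta.detach().clone().requires_grad_(True)
    u = u.detach().clone().requires_grad_(True)
    opt_u = torch.optim.SGD([u], lr=lr_u)
    opt_theta = torch.optim.SGD([theta], lr=lr_theta)
    for _ in range(outer_iters):
        # (i) approximately solve the forward (lower-level) problem for fixed theta
        for _ in range(inner_iters):
            opt_u.zero_grad()
            lower_obj(theta.detach(), u).backward()
            opt_u.step()
        # (ii) one gradient step on the unknown parameters against the upper-level objective
        opt_theta.zero_grad()
        upper_obj(theta, u.detach()).backward()
        opt_theta.step()
    return theta.detach(), u.detach()
```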
- Award ID(s): 2401297
- PAR ID: 10577722
- Publisher / Repository: IOP Publisher
- Date Published:
- Journal Name: Inverse Problems
- Volume: 40
- Issue: 10
- ISSN: 0266-5611
- Page Range / eLocation ID: 105016
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract We introduce the concept of decision‐focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real‐time settings. The proposed data‐driven framework seeks to learn a simpler, for example convex, surrogate optimization model that is trained to minimize the decision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data‐driven inverse optimization problem to which we apply a decomposition‐based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision‐focused surrogate modeling with standard data‐driven surrogate modeling methods and demonstrate that our approach is significantly more data‐efficient while producing simple surrogate models with high decision prediction accuracy. (A schematic sketch of the decision-prediction-error idea appears after this list.)
- Abstract This paper concerns the inverse shape problem of recovering an unknown clamped cavity embedded in a thin infinite plate. The model problem is assumed to be governed by the two-dimensional biharmonic wave equation in the frequency domain. Based on the far-field data, a resolution analysis is conducted for cavity recovery via the direct sampling method. The Funk–Hecke integral identity is employed to analyze the performance of two imaging functions. Our analysis demonstrates that the same imaging functions commonly used for acoustic inverse shape problems are applicable to the biharmonic wave context. This work presents the first extension of direct sampling methods to biharmonic waves using far-field data. Numerical examples are provided to illustrate the effectiveness of these imaging functions in recovering a clamped cavity. (A generic imaging-functional sketch appears after this list.)
- Abstract Many inverse problems are naturally formulated as PDE-constrained optimization problems. These non-linear, large-scale, constrained optimization problems pose many challenges, of which the inherent non-linearity of the problem is an important one. In this paper, we focus on a relaxed formulation of the PDE-constrained optimization problem and analyze its properties, including convexity under certain assumptions. Starting from an infinite-dimensional formulation of the inverse problem with discrete data, we propose a general framework for the analysis and discretisation of such problems. The relaxed formulation of the PDE-constrained optimization problem is shown to reduce to a weighted non-linear least-squares problem. The weight matrix turns out to be the Gram matrix of solutions of the PDE and can, in some cases, be estimated directly from the measurements. The latter observation points to a potential way to unify recently proposed data-driven reduced-order models for inverse problems with PDE-constrained optimization. We provide a number of representative case studies and numerical examples to illustrate our findings. (A generic weighted Gauss–Newton sketch appears after this list.)
- Bilevel optimization has recently attracted considerable attention due to its abundant applications in machine learning problems. However, existing methods rely on prior knowledge of problem parameters to determine stepsizes, resulting in significant effort in tuning stepsizes when these parameters are unknown. In this paper, we propose two novel tuning-free algorithms, D-TFBO and S-TFBO. D-TFBO employs a double-loop structure with stepsizes adaptively adjusted by the "inverse of cumulative gradient norms" strategy. S-TFBO features a simpler, fully single-loop structure that updates three variables simultaneously with a theory-motivated joint design of adaptive stepsizes for all variables. We provide a comprehensive convergence analysis for both algorithms and show that D-TFBO and S-TFBO respectively require $$\mathcal{O}(\frac{1}{\epsilon})$$ and $$\mathcal{O}(\frac{1}{\epsilon}\log^4(\frac{1}{\epsilon}))$$ iterations to find an $$\epsilon$$-accurate stationary point, (nearly) matching their well-tuned counterparts that use knowledge of the problem parameters. Experiments on various problems show that our methods achieve performance comparable to existing well-tuned approaches while being more robust to the selection of initial stepsizes. To the best of our knowledge, our methods are the first to completely eliminate the need for stepsize tuning while achieving theoretical guarantees. (A schematic of the cumulative-gradient-norm stepsize rule appears after this list.)
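As referenced in the decision-focused surrogate modeling item above, here is a minimal, hypothetical sketch of the decision-prediction-error idea: a convex quadratic surrogate with a closed-form minimizer is fitted so that its optimal solutions match precomputed optimal decisions of an original (expensive) model. The quadratic form and all names are assumptions for illustration, not the cited framework.

```python
# Hedged sketch: fit a convex quadratic surrogate  min_x 0.5 x'Qx - (A c)'x  so that its
# closed-form minimizer  x*(c) = Q^{-1} A c  matches precomputed optimal decisions x_true(c).
# Q stays positive definite via a Cholesky-style factor L (Q = L L' + eps I).
import torch

def fit_surrogate(contexts, x_true, dim, iters=500, lr=1e-2):
    # contexts: (N, p) problem data; x_true: (N, dim) optimal decisions of the original model
    L = torch.eye(dim, requires_grad=True)
    A = torch.zeros(dim, contexts.shape[1], requires_grad=True)
    opt = torch.optim.Adam([L, A], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        Q = L @ L.T + 1e-3 * torch.eye(dim)
        x_pred = torch.linalg.solve(Q, (contexts @ A.T).T).T  # surrogate minimizers, (N, dim)
        loss = ((x_pred - x_true) ** 2).mean()                # decision prediction error
        loss.backward()
        opt.step()
    return L.detach(), A.detach()
```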
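For the direct sampling item above, the following is a generic acoustic-style imaging functional of the kind the abstract alludes to, evaluated on a sampling grid from far-field data. The discretization, names, and the specific functional are assumptions for illustration and are not the paper's exact imaging functions.

```python
# Hedged sketch of a generic direct-sampling imaging functional
#   I(z) = | sum_j u_inf(xhat_j) * exp(-i k xhat_j . z) |,
# which tends to peak near the scatterer; shown in the common acoustic form.
import numpy as np

def imaging_function(u_inf, directions, grid_points, k):
    # u_inf: (M,) complex far-field data at the observation directions
    # directions: (M, 2) unit observation directions; grid_points: (N, 2) sampling points z
    phase = np.exp(-1j * k * grid_points @ directions.T)   # (N, M) test functions
    return np.abs(phase @ u_inf) / len(u_inf)              # (N,) indicator values
```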
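For the relaxed PDE-constrained formulation above, a weighted non-linear least-squares problem of the resulting type can be attacked with a Gauss–Newton iteration. The sketch below is generic: it assumes a hypothetical forward map `F` with Jacobian `jac`, and a weight matrix `W` playing the role of the Gram matrix mentioned in the abstract.

```python
# Hedged sketch: Gauss-Newton for a weighted non-linear least-squares problem
#   min_m  0.5 * (F(m) - d)' W (F(m) - d),
# with W symmetric positive (semi)definite (e.g. a Gram matrix of PDE solutions).
import numpy as np

def gauss_newton(F, jac, m0, d, W, iters=20, reg=1e-8):
    m = m0.copy()
    for _ in range(iters):
        r = F(m) - d                              # residual
        J = jac(m)                                # Jacobian of the forward map at m
        g = J.T @ (W @ r)                         # gradient of the weighted misfit
        H = J.T @ W @ J + reg * np.eye(len(m))    # Gauss-Newton Hessian approximation
        m -= np.linalg.solve(H, g)                # Gauss-Newton update
    return m
```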
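Finally, for the tuning-free bilevel item, the "inverse of cumulative gradient norms" stepsize rule can be illustrated on a single block of variables as an AdaGrad-norm-style update. This is only a schematic of the stepsize strategy, not the D-TFBO or S-TFBO algorithms themselves.

```python
# Hedged sketch of the "inverse of cumulative gradient norms" stepsize idea:
#   eta_t = eta_0 / sqrt(beta_0 + sum_{s<=t} ||g_s||^2),
# shown for plain gradient descent on one variable block.
import numpy as np

def adaptive_descent(grad, x0, steps=1000, eta0=1.0, beta0=1e-8):
    x, cum = x0.copy(), beta0
    for _ in range(steps):
        g = grad(x)
        cum += float(np.dot(g, g))        # accumulate squared gradient norms
        x -= (eta0 / np.sqrt(cum)) * g    # stepsize shrinks automatically, no tuning needed
    return x
```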