Solving a bilevel optimization problem is at the core of several machine learning problems such as hyperparameter tuning, data denoising, meta- and few-shot learning, and training-data poisoning. Unlike in simultaneous or multi-objective optimization, the steepest-descent direction for minimizing the upper-level cost in a bilevel problem requires the inverse of the Hessian of the lower-level cost. In this work, we propose a novel algorithm for solving bilevel optimization problems based on the classical penalty function approach. Our method avoids computing the Hessian inverse and can handle constrained bilevel problems easily. We prove the convergence of the method under mild conditions and show that the exact hypergradient is obtained asymptotically. Our method's simplicity and small space and time complexities enable us to effectively solve large-scale bilevel problems involving deep neural networks. We present results on data denoising, few-shot learning, and training-data poisoning problems in a large-scale setting. Our results show that our approach outperforms or is comparable to previously proposed methods based on automatic differentiation and approximate inversion in terms of accuracy, run-time, and convergence speed.
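As a rough illustration of the penalty idea described in the abstract above (not the authors' exact algorithm), the sketch below treats a toy quadratic bilevel problem by replacing the lower-level problem with a growing penalty on its stationarity condition and jointly descending both variables. The objectives, step sizes, and penalty schedule are all illustrative assumptions.

```python
# Toy illustration (not the paper's exact method): a penalty-style
# reformulation of the bilevel problem
#     min_x f(x, y*(x))   s.t.   y*(x) = argmin_y g(x, y).
# Rather than forming the Hessian inverse of g, lower-level optimality is
# enforced softly through a growing penalty on ||grad_y g(x, y)||^2.
import numpy as np

x = np.zeros(5)          # upper-level variable
y = np.zeros(5)          # lower-level variable
gamma = 1.0              # penalty weight on lower-level stationarity

for k in range(5000):
    lr = 0.5 / (1.0 + gamma)            # shrink the step as the penalty stiffens
    gy = y - x                          # grad_y g for g(x, y) = 0.5*||y - x||^2
    # Penalized objective: f(x, y) + 0.5 * gamma * ||grad_y g(x, y)||^2,
    # with upper-level cost f(x, y) = 0.5*||y - 1||^2 + 0.1*||x||^2.
    x -= lr * (0.2 * x - gamma * gy)    # d/dx of the penalized objective
    y -= lr * ((y - 1.0) + gamma * gy)  # d/dy of the penalized objective
    if (k + 1) % 500 == 0:
        gamma *= 2.0                    # tighten the lower-level optimality

# As gamma grows, y is pulled toward the lower-level minimizer y*(x) = x
# while x descends the upper-level cost evaluated near that minimizer.
print("x:", x[0], " y:", y[0])
```

Because only first-order information of the lower-level cost appears, no Hessian inverse (or its approximation) is ever formed, which is the point of penalty-style approaches.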
A bilevel optimization method for inverse mean-field games
Abstract: In this paper, we introduce a bilevel optimization framework for addressing inverse mean-field games, alongside an exploration of numerical methods tailored for this bilevel problem. The primary benefit of our bilevel formulation lies in maintaining the convexity of the objective function and the linearity of the constraints in the forward problem. Our paper focuses on inverse mean-field games characterized by unknown obstacles and metrics. We show numerical stability for these two types of inverse problems. More importantly, we establish, for the first time, the identifiability of the inverse mean-field game with unknown obstacles via the solution of the resultant bilevel problem. The bilevel approach enables us to employ an alternating gradient-based optimization algorithm with a provable convergence guarantee. To validate the effectiveness of our methods in solving these inverse problems, we have designed comprehensive numerical experiments, providing empirical evidence of their efficacy.
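To convey the flavor of the alternating gradient-based scheme mentioned above without reproducing the mean-field game structure, here is a minimal sketch for a generic bilevel inverse problem in which the convex forward problem is relaxed into a residual penalty and the state and the unknown parameter are updated in alternation. The linear forward operator, noise level, weights, and step sizes are assumptions made purely for illustration; this is not the paper's formulation.

```python
# Schematic sketch (illustrative only) of an alternating gradient scheme for
# a bilevel inverse problem:
#     upper level : recover an unknown parameter b from observations d_obs
#     lower level : convex forward problem  m(b) = argmin_m 0.5*||K m - b||^2
# The forward problem is relaxed into a residual penalty so the state m and
# the parameter b can be updated in alternation with plain gradient steps.
import numpy as np

rng = np.random.default_rng(0)
n = 20
K = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # toy forward operator
b_true = rng.standard_normal(n)                     # parameter to recover
m_true = np.linalg.solve(K, b_true)                 # exact forward solution
d_obs = m_true + 0.01 * rng.standard_normal(n)      # noisy observations

rho, alpha = 10.0, 1e-3     # forward-residual weight, Tikhonov weight on b
m, b = np.zeros(n), np.zeros(n)
lr_m, lr_b = 0.02, 0.05

for k in range(10000):
    # (i) state update: data misfit + relaxed forward-problem residual
    m -= lr_m * ((m - d_obs) + rho * K.T @ (K @ m - b))
    # (ii) parameter update: forward residual + small regularization on b
    b -= lr_b * (-rho * (K @ m - b) + alpha * b)

# Relative recovery error (small once the alternating scheme has converged).
print(np.linalg.norm(b - b_true) / np.linalg.norm(b_true))
```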
- Award ID(s): 2401297
- PAR ID: 10577722
- Publisher / Repository: IOP Publishing
- Date Published:
- Journal Name: Inverse Problems
- Volume: 40
- Issue: 10
- ISSN: 0266-5611
- Page Range / eLocation ID: 105016
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract: We introduce the concept of decision-focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real-time settings. The proposed data-driven framework seeks to learn a simpler (for example, convex) surrogate optimization model that is trained to minimize the decision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data-driven inverse optimization problem to which we apply a decomposition-based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision-focused surrogate modeling with standard data-driven surrogate modeling methods and demonstrate that our approach is significantly more data-efficient while producing simple surrogate models with high decision prediction accuracy.
- Abstract: This paper concerns the inverse shape problem of recovering an unknown clamped cavity embedded in a thin infinite plate. The model problem is assumed to be governed by the two-dimensional biharmonic wave equation in the frequency domain. Based on the far-field data, a resolution analysis is conducted for cavity recovery via the direct sampling method. The Funk–Hecke integral identity is employed to analyze the performance of two imaging functions. Our analysis demonstrates that the same imaging functions commonly used for acoustic inverse shape problems are applicable to the biharmonic wave context. This work presents the first extension of direct sampling methods to biharmonic waves using far-field data. Numerical examples are provided to illustrate the effectiveness of these imaging functions in recovering a clamped cavity.
- Abstract: Many inverse problems are naturally formulated as PDE-constrained optimization problems. These non-linear, large-scale, constrained optimization problems pose many challenges, of which the inherent non-linearity of the problem is an important one. In this paper, we focus on a relaxed formulation of the PDE-constrained optimization problem and provide an analysis of its properties, including convexity under certain assumptions. Starting from an infinite-dimensional formulation of the inverse problem with discrete data, we propose a general framework for the analysis and discretisation of such problems. The relaxed formulation of the PDE-constrained optimization problem is shown to reduce to a weighted non-linear least-squares problem. The weight matrix turns out to be the Gram matrix of solutions of the PDE, and in some cases it can be estimated directly from the measurements. The latter observation points to a potential way to unify recently proposed data-driven reduced-order models for inverse problems with PDE-constrained optimization. We provide a number of representative case studies and numerical examples to illustrate our findings.
- Ruiz, Francisco; Dy, Jennifer; van de Meent, Jan-Willem (Eds.): In this paper, we study a class of bilevel optimization problems, also known as simple bilevel optimization, where we minimize a smooth objective function over the optimal solution set of another convex constrained optimization problem. Several iterative methods have been developed for tackling this class of problems. Alas, their convergence guarantees are either asymptotic for the upper-level objective, or the convergence rates are slow and sub-optimal. To address this issue, in this paper we introduce a novel bilevel optimization method that locally approximates the solution set of the lower-level problem via a cutting plane and then runs a conditional gradient update to decrease the upper-level objective. When the upper-level objective is convex, we show that our method requires $O(\max\{1/\epsilon_f, 1/\epsilon_g\})$ iterations to find a solution that is $\epsilon_f$-optimal for the upper-level objective and $\epsilon_g$-optimal for the lower-level objective. Moreover, when the upper-level objective is non-convex, our method requires $O(\max\{1/\epsilon_f^2, 1/(\epsilon_f\epsilon_g)\})$ iterations to find an $(\epsilon_f,\epsilon_g)$-optimal solution. We also prove stronger convergence guarantees under the Hölderian error bound assumption on the lower-level problem. To the best of our knowledge, our method achieves the best-known iteration complexity for the considered class of bilevel problems.
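To make the cutting-plane plus conditional-gradient idea in the last item above concrete, here is a hedged sketch for a toy simple-bilevel instance. By convexity of the lower-level cost $g$, the halfspace $\{x : \nabla g(x_k)^\top (x - x_k) \le 0\}$ contains every minimizer of $g$ over the feasible set, so each iteration takes a Frank–Wolfe step for the upper-level cost over a box intersected with that cut. The problem data, the box constraint, and the step-size rule are assumptions for illustration; this is not the paper's exact algorithm or its complexity-optimal variant.

```python
# Hedged sketch of a cutting-plane + conditional-gradient (Frank-Wolfe) loop
# for a simple bilevel problem:  min_x f(x)  s.t.  x in argmin_{z in Z} g(z).
# Because g is convex, every minimizer of g over Z satisfies
#     grad_g(x_k)^T (x - x_k) <= 0,
# so the linear minimization oracle is run over Z intersected with that cut.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
m_rows, n = 5, 10
A = rng.standard_normal((m_rows, n))
b = A @ rng.standard_normal(n)
x_ref = rng.standard_normal(n)

def grad_f(x):                 # upper level: f(x) = 0.5 * ||x - x_ref||^2
    return x - x_ref

def grad_g(x):                 # lower level: g(x) = 0.5 * ||A x - b||^2
    return A.T @ (A @ x - b)

bounds = [(-2.0, 2.0)] * n     # feasible set Z is the box [-2, 2]^n
x = np.zeros(n)

for k in range(200):
    gf, gg = grad_f(x), grad_g(x)
    # Linear minimization oracle over Z intersected with the cutting plane
    #     gg^T (s - x) <= 0   <=>   gg^T s <= gg^T x
    res = linprog(c=gf, A_ub=gg[None, :], b_ub=[float(gg @ x)],
                  bounds=bounds, method="highs")
    if res.status != 0:        # the current iterate is always feasible,
        break                  # so this is only a numerical safeguard
    s = res.x
    eta = 2.0 / (k + 2.0)      # standard Frank-Wolfe step size
    x = (1.0 - eta) * x + eta * s

print("lower-level residual ||A x - b||:", np.linalg.norm(A @ x - b))
print("upper-level value f(x):", 0.5 * np.linalg.norm(x - x_ref) ** 2)
```

The appeal of this template is that the cut keeps the per-iteration subproblem a simple linear program over a polytope, so no projections onto the (implicitly defined) lower-level solution set are needed.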

