

This content will become publicly available on February 1, 2026

Title: Mean Squared Error May Lead You Astray When Optimizing Your Inverse Design Methods
Abstract: When performing time-intensive optimization tasks, such as those in topology or shape optimization, researchers have turned to machine-learned inverse design (ID) methods—i.e., predicting the optimized geometry from input conditions—to replace or warm start traditional optimizers. Such methods are often optimized to reduce the mean squared error (MSE) or binary cross entropy between their output and a training dataset of optimized designs. While convenient, we show that this choice may be myopic. Specifically, we compare two methods of optimizing the hyperparameters of easily reproducible machine learning models (random forests, k-nearest neighbors, and a deconvolutional neural network) on three optimal topology problems. We show that, both under direct inverse design and when warm starting further topology optimization, using MSE metrics to tune hyperparameters produces less performant models than directly evaluating the objective function, though both produce designs almost one order of magnitude better than the common uniform initialization. We also illustrate how warm starting impacts the convergence time, the type of solutions obtained during optimization, and the final designs. Overall, our initial results suggest that researchers may need to revisit common choices for evaluating ID methods that subtly trade off factors in how an ID method will actually be used. We hope our open-source dataset and evaluation environment will spur additional research in these directions.
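The abstract's core comparison—tuning an ID model's hyperparameters by reconstruction MSE versus by directly evaluating the design objective—can be sketched in a few lines. The following is a minimal illustration, not the paper's actual benchmark: the k-nearest-neighbors model, the synthetic "optimized designs," and the toy objective are all stand-ins, chosen only to show that the two selection criteria need not pick the same hyperparameter.

```python
# Sketch: pick a hyperparameter (k) by reconstruction MSE vs. by directly
# evaluating a toy design objective. Everything here is illustrative.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 2))                        # input conditions
Y = np.sin(3 * X[:, :1]) + 0.1 * rng.normal(size=(400, 1))   # "optimized designs"

def objective(y_pred, x):
    # Toy performance objective (lower is better); a stand-in for, e.g.,
    # compliance evaluated on the predicted topology.
    return float(np.mean((y_pred - np.sin(3 * x[:, :1])) ** 2))

X_tr, X_val, Y_tr, Y_val = train_test_split(X, Y, random_state=0)

def tune(metric):
    # Return the k that minimizes the given validation metric.
    scores = {}
    for k in (1, 5, 15, 50):
        pred = KNeighborsRegressor(n_neighbors=k).fit(X_tr, Y_tr).predict(X_val)
        scores[k] = metric(pred)
    return min(scores, key=scores.get)

k_mse = tune(lambda p: float(np.mean((p - Y_val) ** 2)))  # MSE to the dataset
k_obj = tune(lambda p: objective(p, X_val))               # the objective itself
print(k_mse, k_obj)  # the two criteria can disagree
```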
Award ID(s):
1943699
PAR ID:
10651204
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
American Society of Mechanical Engineers
Date Published:
Journal Name:
Journal of Mechanical Design
Volume:
147
Issue:
2
ISSN:
1050-0472
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Design optimization, and particularly adjoint-based multi-physics shape and topology optimization, is time-consuming and often requires expensive iterations to converge to desired designs. In response, researchers have developed Machine Learning (ML) approaches — often referred to as Inverse Design (ID) methods — to either replace or accelerate tools like Topology Optimization (TO). However, these methods have their own hidden, non-trivial costs, including those of data generation, training, and refinement of ML-produced designs. This begs the question: when is it actually worth learning Inverse Design, compared to just optimizing designs without ML assistance? This paper quantitatively addresses this question by comparing the costs and benefits of three different Inverse Design ML model families on a TO task against simply running the optimizer by itself. We explore the relationship between the size of the training data and the predictive power of each ML model, as well as the computational and training costs of the models and the extent to which they accelerate or hinder TO convergence. The results demonstrate that simpler models, such as k-Nearest Neighbors and Random Forests, are more effective for TO warmstarting with limited training data, while more complex models, such as Deconvolutional Neural Networks, are preferable with more data. We also emphasize the need to balance the benefits of using larger training sets against the costs of data generation when selecting the appropriate ID model. Finally, the paper addresses some challenges that arise when using ML predictions to warmstart optimization and provides some suggestions for budget and resource management. 
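The warmstarting idea above can be illustrated with a toy experiment: an optimizer started from an ML prediction close to the optimum should converge in fewer iterations than one started from the common uniform initialization. The quadratic objective, the target "design," and the noisy "ML guess" below are all illustrative assumptions, not the paper's TO setup.

```python
# Sketch of warmstarting: count gradient-descent iterations to convergence
# from an ML-predicted start vs. a uniform initial field. The objective is a
# toy quadratic, not a real topology-optimization problem.
import numpy as np

def optimize(x0, target, lr=0.2, tol=1e-6, max_iter=10_000):
    # Minimize mean((x - target)^2) by gradient descent; return iteration count.
    x, it = x0.copy(), 0
    while np.mean((x - target) ** 2) > tol and it < max_iter:
        x -= lr * 2 * (x - target)      # gradient of the toy objective
        it += 1
    return it

target = np.linspace(0, 1, 64)          # the "optimized design"
uniform = np.full(64, 0.5)              # common uniform initialization
warm = target + 0.05 * np.random.default_rng(1).normal(size=64)  # "ML guess"

assert optimize(warm, target) < optimize(uniform, target)
```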
  2. Many data analysis and design problems involve reasoning about points in high-dimensional space. A common strategy is to embed points from this high-dimensional space into a low-dimensional one. As we will show in this paper, a critical property of good embeddings is that they preserve isometry — i.e., they preserve the geodesic distance between points on the original data manifold within their embedded locations in the latent space. However, enforcing isometry is non-trivial for common neural embedding models, such as autoencoders and generative models. Moreover, while theoretically appealing, it is not clear to what extent enforcing isometry is really necessary for a given design or analysis task. This paper answers these questions by constructing an isometric embedding via an isometric autoencoder, which we employ to analyze an inverse airfoil design problem. Specifically, the paper describes how to train an isometric autoencoder and demonstrates its usefulness compared to non-isometric autoencoders on both simple pedagogical examples and for airfoil embeddings using the UIUC airfoil dataset. Our ablation study illustrates that enforcing isometry is necessary to accurately discover latent space clusters — a common analysis method researchers typically perform on low-dimensional embeddings. We also show how isometric autoencoders can uncover pathologies in typical gradient-based shape optimization solvers through an analysis on the SU2-optimized airfoil dataset, wherein we find an over-reliance of the gradient solver on angle of attack. Overall, this paper motivates the use of isometry constraints in neural embedding models, particularly in cases where researchers or designers intend to use distance-based analysis measures (such as clustering, k-Nearest Neighbors methods, etc.) to analyze designs within the latent space. 
While this work focuses on airfoil design as an illustrative example, it applies to any domain where analyzing isometric design or data embeddings would be useful. 
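The isometry property at the center of this abstract is easy to check numerically: an embedding is isometric when pairwise distances between points are preserved. The following sketch (illustrative maps, not the paper's autoencoder) contrasts an exactly isometric linear map—an orthogonal transform—with a nonlinear map that distorts distances, using the spread of the distance ratios as a simple distortion score.

```python
# Sketch of an isometry check: compare pairwise distances before and after
# an embedding. An orthogonal map preserves them exactly; tanh does not.
import numpy as np

def pairwise(A):
    # Upper-triangular pairwise Euclidean distances.
    d = A[:, None, :] - A[None, :, :]
    return np.sqrt((d ** 2).sum(-1))[np.triu_indices(len(A), k=1)]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))                  # toy high-dimensional data
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # orthogonal matrix
Z_iso = X @ Q                                 # isometric embedding
Z_warp = np.tanh(X)                           # nonlinear, distance-distorting map

def distortion(Z):
    # Std. dev. of distance ratios: 0 for a perfect isometry.
    return float(np.std(pairwise(Z) / pairwise(X)))

print(distortion(Z_iso), distortion(Z_warp))
```

In a trained isometric autoencoder, a differentiable version of this quantity would serve as a regularization term rather than a post-hoc check.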
  3. Abstract In Topology Optimization (TO) and related engineering applications, physics-constrained simulations are often used to optimize candidate designs given some set of boundary conditions. However, such models are computationally expensive and do not guarantee convergence to a desired result, given the frequent non-convexity of the performance objective. Creating data-based approaches to warm-start these models — or even replace them entirely — has thus been a top priority for researchers in this area of engineering design. In this paper, we present a new dataset of two-dimensional heat sink designs optimized via Multiphysics Topology Optimization (MTO). Further, we propose an augmented Vector-Quantized GAN (VQGAN) that allows for effective MTO data compression within a discrete latent space, known as a codebook, while preserving high reconstruction quality. To concretely assess the benefits of the VQGAN quantization process, we conduct a latent analysis of its codebook as compared to the continuous latent space of a deep AutoEncoder (AE). We find that VQGAN can more effectively learn topological connections despite a high rate of data compression. Finally, we leverage the VQGAN codebook to train a small GPT-2 model, generating thermally performant heat sink designs within a fraction of the time taken by conventional optimization approaches. We show the transformer-based approach is more effective than using a Deep Convolutional GAN (DCGAN) due to its elimination of mode collapse issues, as well as better preservation of topological connections in MTO and similar applications. 
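The vector-quantization step that connects the VQGAN to the GPT-2 model above can be sketched compactly: each continuous latent vector from the encoder is snapped to its nearest codebook entry, producing discrete token indices a transformer can model autoregressively. The codebook size, latent dimension, and data below are illustrative assumptions; the straight-through gradient estimator used in training is omitted.

```python
# Sketch of VQ codebook lookup: nearest-neighbour assignment of continuous
# latents to discrete codes. Sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))   # 16 codebook entries, 4-dim latents
latents = rng.normal(size=(10, 4))    # encoder output for one design

# Squared distances from every latent to every code, then argmin per latent.
d = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
indices = d.argmin(axis=1)            # discrete tokens for the transformer
quantized = codebook[indices]         # what the decoder receives

assert indices.shape == (10,)
```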
  4. Abstract: In the rapidly developing field of nanophotonics, machine learning (ML) methods facilitate multi‐parameter optimization processes and serve as a valuable technique in tackling inverse design challenges by predicting nanostructure designs that satisfy specific optical property criteria. However, while considerable efforts have been devoted to applying ML to design the overall spectral response of photonic nanostructures, often without elucidating the underlying physical mechanisms, physics‐based models remain largely unexplored. Here, physics‐empowered forward and inverse ML models to design dielectric meta‐atoms with controlled multipolar responses are introduced. By utilizing multipole expansion theory, the forward model efficiently predicts the scattering response of meta‐atoms with diverse shapes, and the inverse model designs meta‐atoms that possess the desired multipole resonances. Using the inverse design model, uniquely shaped meta‐atoms are designed with enhanced higher‐order magnetic resonances, as well as meta‐atoms supporting a super‐scattering regime of light‐matter interaction that yields nearly five‐fold enhancement of scattering beyond the single‐channel limit. Finally, an ML model to predict the wavelength‐dependent electric field distribution inside and near the meta‐atom is developed. The proposed ML‐based models will likely facilitate uncovering new regimes of linear and nonlinear light‐matter interaction at the nanoscale, as well as serve as a versatile toolkit for nanophotonic design. 
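The forward/inverse pairing this abstract describes can be sketched generically: a forward surrogate maps geometry parameters to a spectral response, and an inverse model maps a desired response back to geometry, with the forward model used to verify the inverse prediction. Everything below is a toy stand-in: the Gaussian "spectrum" replaces the multipole-expansion calculation, and the small MLPs replace the paper's physics-empowered models.

```python
# Highly simplified forward/inverse sketch. The toy spectrum generator
# stands in for multipole-expansion physics; all parameters are illustrative.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
geom = rng.uniform(0.5, 1.5, size=(500, 2))   # e.g., radius and height
wl = np.linspace(0, 1, 20)                    # normalized wavelength grid
# Toy "scattering spectra": a Gaussian whose center/width depend on geometry.
spectra = np.exp(-((wl[None, :] - geom[:, :1]) ** 2) / geom[:, 1:] ** 2)

forward = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000,
                       random_state=0).fit(geom, spectra)   # geometry -> spectrum
inverse = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000,
                       random_state=0).fit(spectra, geom)   # spectrum -> geometry

target = np.exp(-((wl - 1.0) ** 2) / 0.8 ** 2)  # desired optical response
pred_geom = inverse.predict(target[None, :])    # "designed" meta-atom
recon = forward.predict(pred_geom)              # verify via the forward model
```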
  5. We introduce a novel method to enable Gaussian process (GP) modeling of massive datasets, called globally approximate Gaussian process (GAGP). Unlike most large-scale supervised learners such as neural networks and trees, GAGP is easy to fit and its behavior is interpretable, making it particularly useful in engineering design with big data. The key idea of GAGP is to build an ensemble of independent GPs that distribute the entire training dataset among themselves and use the same hyperparameters. This is based on the observation that the GP hyperparameter estimates change negligibly once the size of the training data exceeds a certain level, which can be estimated in a systematic way. For inference, the predictions from all GPs in the ensemble are pooled, which allows the entire training dataset to be exploited efficiently for prediction. Through analytical examples, we demonstrate that GAGP achieves very high predictive power that matches (and in some cases exceeds) that of state-of-the-art machine learning methods. We illustrate the application of GAGP in engineering design with a problem on data-driven metamaterials design, where it is used to link reduced-dimension geometrical descriptors of unit cells to their properties. Searching for new unit cell designs with desired properties is then achieved by employing GAGP in inverse optimization. 
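The GAGP construction—independent GPs that split the training data, share one set of hyperparameters, and pool their predictions—can be sketched as follows. The kernel settings, the fixed-hyperparameter choice (`optimizer=None`), and the toy sine-wave data are illustrative assumptions; the paper estimates the shared hyperparameters systematically rather than fixing them by hand.

```python
# Sketch of the GAGP idea: split a large training set across independent GPs
# that share one (here, hand-fixed) set of hyperparameters, then average
# their predictions. Data and kernel settings are illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(600, 1))
y = np.sin(X).ravel() + 0.05 * rng.normal(size=600)

kernel = RBF(length_scale=1.0)   # shared hyperparameters for every member
gps = []
for chunk in np.array_split(np.arange(600), 3):  # distribute the dataset
    gp = GaussianProcessRegressor(kernel=kernel, optimizer=None, alpha=1e-2)
    gps.append(gp.fit(X[chunk], y[chunk]))       # each GP sees 200 points

X_test = np.linspace(0, 10, 50)[:, None]
pooled = np.mean([gp.predict(X_test) for gp in gps], axis=0)  # pooled prediction
```

Because each member only inverts a 200-point kernel matrix instead of a 600-point one, fitting cost drops roughly with the cube of the chunk size while the pooled prediction still reflects all of the data.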