Title: METASET: An Automated Data Selection Method for Scalable Data-Driven Design of Metamaterials
Abstract

Data-driven design of mechanical metamaterials is an increasingly popular method to combat costly physical simulations and immense, often intractable, geometrical design spaces. Using a precomputed dataset of unit cells, a multiscale structure can be quickly filled via combinatorial search algorithms, and machine learning models can be trained to accelerate the process. However, the dependence on data induces a unique challenge: an imbalanced dataset containing more of certain shapes or physical properties than others can be detrimental to the efficacy of these approaches and of any models built on such sets. In response, we posit that a smaller yet diverse set of unit cells leads to scalable search and unbiased learning. To select such subsets, we propose METASET, a methodology that 1) uses similarity metrics and positive semi-definite kernels to jointly measure the closeness of unit cells in both shape and property space, and 2) incorporates Determinantal Point Processes for efficient subset selection. Moreover, METASET allows a trade-off between shape and property diversity so that subsets can be tuned for various applications. Through the design of 2D metamaterials with target displacement profiles, we demonstrate that smaller, diverse subsets can indeed improve the search process as well as structural performance. We also apply METASET to eliminate inherent overlaps in a dataset of 3D unit cells created with symmetry rules, distilling it down to the most distinct families. Our diverse subsets are provided publicly for use by any designer.
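
As an illustration of the two ingredients named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that shape and property similarity are each given by a Gaussian (RBF) kernel on descriptor vectors, that the joint kernel is a convex combination with a hypothetical weight `w`, and that a diverse subset is picked by naive greedy maximization of the Determinantal Point Process determinant. The function names `rbf_kernel`, `joint_kernel`, and `greedy_dpp` are illustrative only.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Positive semi-definite similarity kernel between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)


def joint_kernel(L_shape, L_prop, w=0.5):
    """Assumed trade-off: w=0 favors shape diversity, w=1 property diversity.

    A convex combination of PSD kernels is itself PSD.
    """
    return (1.0 - w) * L_shape + w * L_prop


def greedy_dpp(L, k):
    """Greedily pick up to k indices that keep det(L[S, S]) as large as possible."""
    selected = []
    remaining = list(range(L.shape[0]))
    for _ in range(k):
        best, best_logdet = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = i, logdet
        if best is None:  # every remaining candidate is numerically redundant
            break
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shapes = rng.normal(size=(50, 8))  # hypothetical shape descriptors
    props = rng.normal(size=(50, 3))   # hypothetical property vectors
    L = joint_kernel(rbf_kernel(shapes, 0.1), rbf_kernel(props, 0.1), w=0.3)
    print(greedy_dpp(L, 5))
```

Because near-duplicate unit cells produce nearly identical kernel rows, adding a second copy of an already selected cell drives the determinant toward zero, so the greedy step prefers cells that differ from the current subset. The naive loop recomputes a determinant per candidate and is only suitable for small datasets.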

 
Award ID(s):
1835677
NSF-PAR ID:
10300878
Journal Name:
International Design Engineering Technical Conferences and Computers and Information in Engineering Conferences
Volume:
11A
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Data-driven design of mechanical metamaterials is an increasingly popular method to combat costly physical simulations and immense, often intractable, geometrical design spaces. Using a precomputed dataset of unit cells, a multiscale structure can be quickly filled via combinatorial search algorithms, and machine learning models can be trained to accelerate the process. However, the dependence on data induces a unique challenge: an imbalanced dataset containing more of certain shapes or physical properties than others can be detrimental to the efficacy of data-driven approaches. In response, we posit that a smaller yet diverse set of unit cells leads to scalable search and unbiased learning. To select such subsets, we propose METASET, a methodology that (1) uses similarity metrics and positive semi-definite kernels to jointly measure the closeness of unit cells in both shape and property spaces and (2) incorporates Determinantal Point Processes for efficient subset selection. Moreover, METASET allows a trade-off between shape and property diversity so that subsets can be tuned for various applications. Through the design of 2D metamaterials with target displacement profiles, we demonstrate that smaller, diverse subsets can indeed improve the search process as well as structural performance. By eliminating inherent overlaps in a dataset of 3D unit cells created with symmetry rules, we also illustrate that our flexible method can distill unique subsets regardless of the metric employed. Our diverse subsets are provided publicly for use by any designer.
  2. Abstract

    Inspired by the recent achievements of machine learning in diverse domains, data-driven metamaterials design has emerged as a compelling paradigm that can unlock the potential of multiscale architectures. The model-centric research trend, however, lacks principled frameworks dedicated to data acquisition, whose quality propagates into the downstream tasks. Often built by naive space-filling design in shape descriptor space, metamaterial datasets suffer from property distributions that are either highly imbalanced or at odds with design tasks of interest. To this end, we present t-METASET: an active learning-based data acquisition framework aiming to guide both diverse and task-aware data generation. Distinctly, we seek a solution to a commonplace yet frequently overlooked scenario at early stages of data-driven design of metamaterials: when a massive (∼O(10^4)) shape-only library has been prepared with no properties evaluated. The key idea is to harness a data-driven shape descriptor learned from generative models, fit a sparse regressor as a start-up agent, and leverage metrics related to diversity to drive data acquisition to areas that help designers fulfill design goals. We validate the proposed framework in three deployment cases, which encompass general use, task-specific use, and tailorable use. Two large-scale mechanical metamaterial datasets are used to demonstrate the efficacy. Applicable to general image-based design representations, t-METASET could boost future advancements in data-driven design.

     
  3. We introduce a novel method to enable Gaussian process (GP) modeling of massive datasets, called globally approximate Gaussian process (GAGP). Unlike most large-scale supervised learners such as neural networks and trees, GAGP is easy to fit and its model behavior can be interpreted, making it particularly useful in engineering design with big data. The key idea of GAGP is to build an ensemble of independent GPs that distribute the entire training dataset among themselves and use the same hyperparameters. This is based on the observation that the GP hyperparameter estimates change negligibly once the size of the training data exceeds a certain level, which can be estimated in a systematic way. For inference, the predictions from all GPs in the ensemble are pooled, which allows the entire training dataset to be exploited efficiently for prediction. Through analytical examples, we demonstrate that GAGP achieves very high predictive power that matches (and in some cases exceeds) that of state-of-the-art machine learning methods. We illustrate the application of GAGP in engineering design with a problem on data-driven metamaterials design where it is used to link reduced-dimension geometrical descriptors of unit cells and their properties. Searching for new unit cell designs with desired properties is then achieved by employing GAGP in inverse optimization.
  4. Abstract We introduce a novel method for Gaussian process (GP) modeling of massive datasets called globally approximate Gaussian process (GAGP). Unlike most large-scale supervised learners such as neural networks and trees, GAGP is easy to fit and its model behavior can be interpreted, making it particularly useful in engineering design with big data. The key idea of GAGP is to build a collection of independent GPs that use the same hyperparameters but randomly distribute the entire training dataset among themselves. This is based on our observation that the GP hyperparameter approximations change negligibly once the size of the training data exceeds a certain level, which can be estimated systematically. For inference, the predictions from all GPs in the collection are pooled, allowing the entire training dataset to be efficiently exploited for prediction. Through analytical examples, we demonstrate that GAGP achieves very high predictive power matching (and in some cases exceeding) that of state-of-the-art supervised learning methods. We illustrate the application of GAGP in engineering design with a problem on data-driven metamaterials, using it to link reduced-dimension geometrical descriptors of unit cells and their properties. Searching for new unit cell designs with desired properties is then achieved by employing GAGP in inverse optimization.
  5. Abstract

    Inspired by the recent achievements of machine learning in diverse domains, data-driven metamaterials design has emerged as a compelling paradigm that can unlock the potential of multiscale architectures. The model-centric research trend, however, lacks principled frameworks dedicated to data acquisition, whose quality propagates into the downstream tasks. Built by naive space-filling design in shape descriptor space, metamaterial datasets suffer from property distributions that are either highly imbalanced or at odds with design tasks of interest. To this end, we present t-METASET: an active-learning-based data acquisition framework aiming to guide both balanced and task-aware data generation. Uniquely, we seek a solution to a commonplace yet frequently overlooked scenario at early stages of data-driven design: when a massive shape-only library has been prepared with no properties evaluated. The key idea is to harness a data-driven shape descriptor learned from generative models, fit a sparse regressor as a start-up agent, and leverage metrics related to diversity to drive data acquisition to areas that help designers fulfill design goals. We validate the proposed framework in three deployment cases, which encompass general use, task-specific use, and tailorable use. Two large-scale mechanical metamaterial datasets (∼O(10^4)) are used to demonstrate the efficacy. Applicable to general design representations, t-METASET can boost future advancements in data-driven design.

     
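
The ensemble idea described in the GAGP abstracts (items 3 and 4 above) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the published method: the hyperparameters `length_scale`, `variance`, and `noise` are taken as given rather than estimated, the kernel is a plain RBF, and pooling is assumed here to be a simple average of the individual GP posterior means. All function names are hypothetical.

```python
import numpy as np


def rbf(A, B, length_scale, variance):
    """RBF covariance between the rows of A and the rows of B."""
    d2 = np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale ** 2)


def gp_predict(X, y, X_new, length_scale, variance, noise):
    """Posterior mean of a zero-mean GP with fixed hyperparameters."""
    K = rbf(X, X, length_scale, variance) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    return rbf(X_new, X, length_scale, variance) @ alpha


def ensemble_predict(X, y, X_new, n_models,
                     length_scale=1.0, variance=1.0, noise=1e-4, seed=0):
    """Split the data randomly among n_models GPs that share hyperparameters,
    then pool their predictions (assumed here: an unweighted average)."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(X)), n_models)
    preds = [gp_predict(X[p], y[p], X_new, length_scale, variance, noise)
             for p in parts]
    return np.mean(preds, axis=0)
```

Each GP only inverts a covariance matrix of size roughly `len(X) / n_models`, which is where the scalability comes from, while every training point still contributes to the pooled prediction.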