Title: Globally Approximate Gaussian Processes for Big Data with an Application to Data-Driven Metamaterials Design
We introduce a novel method for Gaussian process (GP) modeling of massive datasets, called globally approximate Gaussian process (GAGP). Unlike most large-scale supervised learners such as neural networks and trees, GAGP is easy to fit and its model behavior is interpretable, making it particularly useful in engineering design with big data. The key idea of GAGP is to build an ensemble of independent GPs that use the same hyperparameters but distribute the entire training dataset among themselves. This is based on our observation that the GP hyperparameter estimates change negligibly once the size of the training data exceeds a certain level, which can be determined systematically. For inference, the predictions from all GPs in the ensemble are pooled, allowing the entire training dataset to be exploited efficiently for prediction. Through analytical examples, we demonstrate that GAGP achieves very high predictive power, matching (and in some cases exceeding) that of state-of-the-art machine learning methods. We illustrate the application of GAGP in engineering design with a data-driven metamaterials design problem, where it is used to link reduced-dimension geometrical descriptors of unit cells to their properties. Searching for new unit cell designs with desired properties is then achieved by employing GAGP in inverse optimization.
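The core recipe is simple enough to sketch. Below is a minimal Python illustration of the GAGP idea, assuming scikit-learn's GaussianProcessRegressor: hyperparameters are estimated once on a subset, frozen, and reused by every GP in the ensemble, and predictions are pooled by inverse-variance weighting. The subset size, chunk count, kernel, and pooling rule are illustrative assumptions, not necessarily the paper's exact choices.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(5000, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + 0.05 * rng.normal(size=5000)

# 1) Estimate shared hyperparameters once, on a manageable subset.
sub = rng.choice(len(X), size=500, replace=False)
base = GaussianProcessRegressor(kernel=ConstantKernel() * RBF([1.0, 1.0]),
                                alpha=1e-3, normalize_y=True)
base.fit(X[sub], y[sub])
shared_kernel = base.kernel_  # hyperparameters are frozen below

# 2) Distribute the full dataset among independent GPs that reuse them.
ensemble = []
for idx in np.array_split(rng.permutation(len(X)), 10):
    gp = GaussianProcessRegressor(kernel=shared_kernel, optimizer=None,
                                  alpha=1e-3, normalize_y=True)
    ensemble.append(gp.fit(X[idx], y[idx]))

# 3) Pool predictions from all GPs, here by inverse-variance weighting.
def predict(X_new):
    mus, sigmas = zip(*(gp.predict(X_new, return_std=True) for gp in ensemble))
    mus, sigmas = np.array(mus), np.array(sigmas)
    w = 1.0 / np.maximum(sigmas, 1e-9) ** 2
    return (w * mus).sum(axis=0) / w.sum(axis=0)

print(predict(np.array([[0.5, -1.0]])))  # pooled mean near sin(0.5)*cos(-1.0)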
Award ID(s):
1835677
PAR ID:
10121101
Author(s) / Creator(s):
Date Published:
Journal Name:
International Design Engineering Technical Conference
Volume:
IDETC2019-98027
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Obtaining high certainty in predictive models is crucial for making informed and trustworthy decisions in many scientific and engineering domains. However, the extensive experimentation required for model accuracy can be both costly and time-consuming. This paper presents an adaptive sampling approach designed to reduce epistemic uncertainty in predictive models. Our primary contribution is the development of a metric that estimates potential epistemic uncertainty by leveraging prediction-interval-generation neural networks. This estimation relies on the distance between the predicted upper and lower bounds and the observed data at the tested positions and their neighboring points. Our second contribution is the proposal of a batch sampling strategy based on Gaussian processes (GPs). A GP is used as a surrogate model of the networks trained at each iteration of the adaptive sampling process. Using this GP, we design an acquisition function that selects a combination of sampling locations to maximize the reduction of epistemic uncertainty across the domain. We test our approach on three unidimensional synthetic problems and a multi-dimensional dataset based on an agricultural field for selecting experimental fertilizer rates. The results demonstrate that our method consistently converges faster to minimum epistemic uncertainty levels than Normalizing Flows Ensembles, MC-Dropout, and simple GPs.
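A toy rendering of the two ingredients may help. In the sketch below, the scoring rule (interval width plus a penalty for observations falling outside the bounds) and the greedy, repulsion-based batch rule are illustrative stand-ins for the paper's metric and acquisition function; scikit-learn's GaussianProcessRegressor plays the surrogate, and fixed-width bounds stand in for the networks' prediction intervals.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def interval_score(y, lo, hi):
    # Large where the interval is wide or the observation escapes it.
    return (hi - lo) + 2.0 * (np.maximum(lo - y, 0.0) + np.maximum(y - hi, 0.0))

def select_batch(X_obs, scores, X_cand, k=5, repulsion=1.0):
    # GP surrogate of the uncertainty scores over the input domain.
    mu = GaussianProcessRegressor().fit(X_obs, scores).predict(X_cand)
    batch = []
    for _ in range(k):
        i = int(np.argmax(mu))
        batch.append(X_cand[i])
        # Discourage clustering: penalize candidates near the chosen point.
        d = np.linalg.norm(X_cand - X_cand[i], axis=1)
        mu = mu - repulsion * np.exp(-d ** 2)
    return np.array(batch)

rng = np.random.default_rng(1)
X_obs = rng.uniform(0, 1, size=(30, 1))
y = np.sin(6 * X_obs[:, 0]) + 0.1 * rng.normal(size=30)
lo, hi = y - 0.3, y + 0.3  # stand-in for the networks' predicted bounds
print(select_batch(X_obs, interval_score(y, lo, hi), rng.uniform(0, 1, size=(200, 1))))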
  2. Scientific and engineering problems often require the use of artificial intelligence to aid understanding and the search for promising designs. While Gaussian processes (GPs) stand out as easy-to-use and interpretable learners, they have difficulty accommodating big datasets, categorical inputs, and multiple responses, which has become a common challenge for a growing number of data-driven design applications. In this paper, we propose a GP model that utilizes latent variables and functions obtained through variational inference to address these challenges simultaneously. The method is built upon the latent-variable Gaussian process (LVGP) model, where categorical factors are mapped into a continuous latent space to enable GP modeling of mixed-variable datasets. By extending variational inference to LVGP models, the large training dataset is replaced by a small set of inducing points to address the scalability issue. Output response vectors are represented by a linear combination of independent latent functions, forming a flexible kernel structure to handle multiple responses that might have distinct behaviors. Comparative studies demonstrate that the proposed method scales well for large datasets with over 10^4 data points, while outperforming state-of-the-art machine learning methods without requiring much hyperparameter tuning. In addition, an interpretable latent space is obtained to draw insights into the effect of categorical factors, such as those associated with "building blocks" of architectures and element choices in metamaterial and materials design. Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism with aperiodic microstructures and multiple materials.
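The categorical-to-latent mapping at the heart of LVGP is easy to sketch. In the toy below, each categorical level owns a 2-D latent vector, and an ordinary RBF kernel acts on the concatenated [continuous, latent] inputs; the latent coordinates are random stand-ins for quantities the model would estimate by likelihood maximization, and the variational inducing-point machinery and multi-response kernel are omitted.

import numpy as np

n_levels, latent_dim = 4, 2
Z = np.random.default_rng(2).normal(size=(n_levels, latent_dim))  # per-level latents

def embed(X_cont, c):
    # Map mixed inputs (continuous block, categorical level index) to vectors.
    return np.hstack([X_cont, Z[c]])

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

X = np.random.default_rng(3).uniform(size=(6, 3))  # continuous inputs
c = np.array([0, 1, 2, 3, 0, 1])                   # categorical levels
K = rbf(embed(X, c), embed(X, c))                  # mixed-variable GP covariance
print(K.shape)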
  3. It is desirable to combine the expressive power of deep learning with Gaussian processes (GPs) in one expressive Bayesian learning model. Deep kernel learning showed success by using a deep network for feature extraction and a GP as the function model. Recently, it was suggested that, despite training with the marginal likelihood, the deterministic nature of the feature extractor might lead to overfitting, and that replacing it with a Bayesian network seemed to cure the problem. Here, we propose the conditional deep Gaussian process (DGP), in which the intermediate GPs in the hierarchical composition are supported by hyperdata and the exposed GP remains zero-mean. Motivated by the inducing points in sparse GPs, the hyperdata also play the role of function supports, but are hyperparameters rather than random variables. Following our previous moment-matching approach, we approximate the marginal prior of the conditional DGP with a GP carrying an effective kernel. Thus, as in empirical Bayes, the hyperdata are learned by optimizing the approximate marginal likelihood, which depends on the hyperdata implicitly via the kernel. We show equivalence with deep kernel learning in the limit of dense hyperdata in the latent space. However, the conditional DGP and the corresponding approximate inference enjoy the benefit of being more Bayesian than deep kernel learning. Preliminary extrapolation results demonstrate the expressive power gained from the depth of the hierarchy by exploiting the exact covariance and hyperdata learning, in comparison with GP kernel composition, DGP variational inference, and deep kernel learning. We also address the non-Gaussian aspects of our model as well as a way of upgrading to full Bayesian inference.
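The moment-matching step can be illustrated on a toy two-layer composition f(g(x)): the marginal prior is approximated by a single GP whose effective kernel averages the outer kernel over draws of the inner layer. The sketch below estimates that expectation by Monte Carlo; it omits the hyperdata conditioning that defines the conditional DGP, and the kernels, jitter, and sample counts are arbitrary choices.

import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(-2, 2, 50)[:, None]

def rbf(a, b, ls=1.0):
    return np.exp(-0.5 * (a - b.T) ** 2 / ls ** 2)

K_g = rbf(x, x) + 1e-6 * np.eye(len(x))  # prior covariance of the inner GP g
L = np.linalg.cholesky(K_g)

# Effective outer kernel: k_eff(x, x') = E_g[ k_f(g(x), g(x')) ], by Monte Carlo.
n_mc = 200
K_eff = np.zeros((len(x), len(x)))
for _ in range(n_mc):
    g = L @ rng.normal(size=(len(x), 1))  # draw g ~ GP(0, K_g)
    K_eff += rbf(g, g)
K_eff /= n_mc
print(K_eff[0, :3])  # single-GP surrogate kernel for the two-layer prior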
  4. Bayesian optimization (BO) has well-documented merits for optimizing black-box functions with an expensive evaluation cost. Such functions emerge in applications as diverse as hyperparameter tuning, drug discovery, and robotics. BO hinges on a Bayesian surrogate model to sequentially select query points so as to balance exploration and exploitation of the search space. Most existing works rely on a single Gaussian process (GP) surrogate model, whose kernel function form is typically preselected using domain knowledge. To bypass such a design process, this paper leverages an ensemble (E) of GPs to adaptively select the surrogate model on the fly, yielding a GP mixture posterior with enhanced expressiveness for the sought function. Acquisition of the next evaluation input using this EGP-based function posterior is then enabled by Thompson sampling (TS), which requires no additional design parameters. To endow function sampling with scalability, a random feature-based kernel approximation is leveraged per GP model. The novel EGP-TS readily accommodates parallel operation. To establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret for both sequential and parallel settings. Tests on synthetic functions and real-world applications showcase the merits of the proposed method.
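A compact sketch of the ensemble Thompson-sampling loop follows. Posterior samples on a discrete candidate grid stand in for the paper's random-feature function samples, scikit-learn GPs stand in for the authors' implementation, and the kernel dictionary and marginal-likelihood weighting rule are illustrative assumptions.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern

rng = np.random.default_rng(5)
f = lambda x: -np.sin(3 * x) - x ** 2 + 0.7 * x  # toy black-box objective
X = rng.uniform(-2, 2, size=(5, 1))
y = f(X[:, 0])
cand = np.linspace(-2, 2, 400)[:, None]

for _ in range(20):
    # Ensemble of GPs with different kernels, weighted by marginal likelihood.
    gps = [GaussianProcessRegressor(kernel=k, normalize_y=True).fit(X, y)
           for k in (RBF(), Matern(nu=1.5), Matern(nu=2.5))]
    logw = np.array([gp.log_marginal_likelihood_value_ for gp in gps])
    w = np.exp(logw - logw.max())
    w /= w.sum()
    gp = gps[rng.choice(len(gps), p=w)]  # Thompson-sample a model...
    path = gp.sample_y(cand, random_state=int(rng.integers(1 << 31))).ravel()
    x_next = cand[np.argmax(path)]       # ...then query its maximizer
    X = np.vstack([X, [x_next]])
    y = np.append(y, f(x_next[0]))

print("best input and value:", X[np.argmax(y)], y.max())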