skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, September 13 until 2:00 AM ET on Saturday, September 14 due to maintenance. We apologize for the inconvenience.


Title: Inference-Optimized AI and High Performance Computing for Gravitational Wave Detection at Scale
We introduce an ensemble of artificial intelligence models for gravitational wave detection that we trained in the Summit supercomputer using 32 nodes, equivalent to 192 NVIDIA V100 GPUs, within 2 h. Once fully trained, we optimized these models for accelerated inference using NVIDIA TensorRT . We deployed our inference-optimized AI ensemble in the ThetaGPU supercomputer at Argonne Leadership Computer Facility to conduct distributed inference. Using the entire ThetaGPU supercomputer, consisting of 20 nodes each of which has 8 NVIDIA A100 Tensor Core GPUs and 2 AMD Rome CPUs, our NVIDIA TensorRT -optimized AI ensemble processed an entire month of advanced LIGO data (including Hanford and Livingston data streams) within 50 s. Our inference-optimized AI ensemble retains the same sensitivity of traditional AI models, namely, it identifies all known binary black hole mergers previously identified in this advanced LIGO dataset and reports no misclassifications, while also providing a 3 X inference speedup compared to traditional artificial intelligence models. We used time slides to quantify the performance of our AI ensemble to process up to 5 years worth of advanced LIGO data. In this synthetically enhanced dataset, our AI ensemble reports an average of one misclassification for every month of searched advanced LIGO data. We also present the receiver operating characteristic curve of our AI ensemble using this 5 year long advanced LIGO dataset. This approach provides the required tools to conduct accelerated, AI-driven gravitational wave detection at scale.  more » « less
Award ID(s):
1934757 1931561
NSF-PAR ID:
10351783
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Frontiers in Artificial Intelligence
Volume:
5
ISSN:
2624-8212
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    We present a new class of AI models for the detection of quasi-circular, spinning, non-precessing binary black hole mergers whose waveforms include the higher order gravitational wave modes(,|m|)={(2,2),(2,1),(3,3),(3,2),(4,4)}, and mode mixing effects in the=3,|m|=2harmonics. These AI models combine hybrid dilated convolution neural networks to accurately model both short- and long-range temporal sequential information of gravitational waves; and graph neural networks to capture spatial correlations among gravitational wave observatories to consistently describe and identify the presence of a signal in a three detector network encompassing the Advanced LIGO and Virgo detectors. We first trained these spatiotemporal-graph AI models using synthetic noise, using 1.2 million modeled waveforms to densely sample this signal manifold, within 1.7 h using 256 NVIDIA A100 GPUs in the Polaris supercomputer at the Argonne Leadership Computing Facility. This distributed training approach exhibited optimal classification performance, and strong scaling up to 512 NVIDIA A100 GPUs. With these AI ensembles we processed data from a three detector network, and found that an ensemble of 4 AI models achieves state-of-the-art performance for signal detection, and reports two misclassifications for every decade of searched data. We distributed AI inference over 128 GPUs in the Polaris supercomputer and 128 nodes in the Theta supercomputer, and completed the processing of a decade of gravitational wave data from a three detector network within 3.5 h. Finally, we fine-tuned these AI ensembles to process the entire month of February 2020, which is part of the O3b LIGO/Virgo observation run, and found 6 gravitational waves, concurrently identified in Advanced LIGO and Advanced Virgo data, and zero false positives. This analysis was completed in one hour using one NVIDIA A100 GPU.

     
    more » « less
  2. Deep learning models are increasingly used for end-user applications, supporting both novel features such as facial recognition, and traditional features, e.g. web search. To accommodate high inference throughput, it is common to host a single pre-trained Convolutional Neural Network (CNN) in dedicated cloud-based servers with hardware accelerators such as Graphics Processing Units (GPUs). However, GPUs can be orders of magnitude more expensive than traditional Central Processing Unit (CPU) servers. These resources could also be under-utilized facing dynamic workloads, which may result in inflated serving costs. One potential way to alleviate this problem is by allowing hosted models to share the underlying resources, which we refer to as multi-tenant inference serving. One of the key challenges is maximizing the resource efficiency for multi-tenant serving given hardware with diverse characteristics, models with unique response time Service Level Agreement (SLA), and dynamic inference workloads. In this paper, we present PERSEUS, a measurement framework that provides the basis for understanding the performance and cost trade-offs of multi-tenant model serving. We implemented PERSEUS in Python atop a popular cloud inference server called Nvidia TensorRT Inference Server. Leveraging PERSEUS, we evaluated the inference throughput and cost for serving various models and demonstrated that multi-tenant model serving led to up to 12% cost reduction. 
    more » « less
  3. Abstract

    We introduce an end-to-end computational framework that allows for hyperparameter optimization using theDeepHyperlibrary, accelerated model training, and interpretable AI inference. The framework is based on state-of-the-art AI models includingCGCNN,PhysNet,SchNet,MPNN,MPNN-transformer, andTorchMD-NET. We employ these AI models along with the benchmarkQM9,hMOF, andMD17datasets to showcase how the models can predict user-specified material properties within modern computing environments. We demonstrate transferable applications in the modeling of small molecules, inorganic crystals and nanoporous metal organic frameworks with a unified, standalone framework. We have deployed and tested this framework in the ThetaGPU supercomputer at the Argonne Leadership Computing Facility, and in the Delta supercomputer at the National Center for Supercomputing Applications to provide researchers with modern tools to conduct accelerated AI-driven discovery in leadership-class computing environments. We release these digital assets as open source scientific software in GitLab, and ready-to-use Jupyter notebooks in Google Colab.

     
    more » « less
  4. Abstract

    A concise and measurable set of FAIR (Findable, Accessible, Interoperable and Reusable) principles for scientific data is transforming the state-of-practice for data management and stewardship, supporting and enabling discovery and innovation. Learning from this initiative, and acknowledging the impact of artificial intelligence (AI) in the practice of science and engineering, we introduce a set of practical, concise, and measurable FAIR principles for AI models. We showcase how to create and share FAIR data and AI models within a unified computational framework combining the following elements: the Advanced Photon Source at Argonne National Laboratory, the Materials Data Facility, the Data and Learning Hub for Science, and funcX, and the Argonne Leadership Computing Facility (ALCF), in particular the ThetaGPU supercomputer and the SambaNova DataScale®system at the ALCF AI Testbed. We describe how this domain-agnostic computational framework may be harnessed to enable autonomous AI-driven discovery.

     
    more » « less
  5. Artificial intelligence (AI) has immense potential spanning research and industry. AI applications abound and are expanding rapidly, yet the methods, performance, and understanding of AI are in their infancy. Researchers face vexing issues such as how to improve performance, transferability, reliability, comprehensibility, and how better to train AI models with only limited data. Future progress depends on advances in hardware accelerators, software frameworks, system and architectures, and creating cross-cutting expertise between scientific and AI domains. Open Compass is an exploratory research project to conduct academic pilot studies on an advanced engineering testbed for artificial intelligence, the Compass Lab, culminating in the development and publication of best practices for the benefit of the broad scientific community. Open Compass includes the development of an ontology to describe the complex range of existing and emerging AI hardware technologies and the identification of benchmark problems that represent different challenges in training deep learning models. These benchmarks are then used to execute experiments in alternative advanced hardware solution architectures. Here we present the methodology of Open Compass and some preliminary results on analyzing the effects of different GPU types, memory, and topologies for popular deep learning models applicable to image processing. 
    more » « less