An Extensible Benchmark Suite for Learning to Simulate Physical Systems
Simulating physical systems is a core component of scientific computing, spanning a wide range of physical domains and applications. Recently, there has been a surge in data-driven methods that complement traditional numerical simulation methods, motivated by the opportunity to reduce computational costs and/or to learn new physical models from large collections of data. However, the diversity of problem settings and applications has led to a plethora of approaches, each evaluated on a different setup and with different metrics. We introduce a set of benchmark problems as a step towards unified benchmarks and evaluation protocols. We propose four representative physical systems, together with a collection of widely used classical time integrators and representative data-driven methods (kernel-based, MLP, CNN, nearest neighbors). Our framework allows objective and systematic evaluation of the stability, accuracy, and computational efficiency of data-driven methods. It is also configurable, so that it can be adjusted to accommodate other learning tasks and can serve as a foundation for future developments in machine learning for scientific computing.
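To make the evaluation idea concrete, the following is a minimal sketch, not the benchmark suite's actual code. It assumes a unit-mass linear spring as the test system (the abstract does not say which four systems are used), and every name in it (`exact_step`, `euler_step`, `leapfrog_step`, `make_knn_step`, `evaluate`) and every parameter value is hypothetical. It compares two classical time integrators and a 1-nearest-neighbor step predictor on accuracy (mean squared error against the exact flow), stability (largest energy deviation), and cost (wall-clock time).

```python
import math
import time

DT = 0.1  # time step; a hypothetical choice for illustration


def exact_step(q, p, dt=DT):
    """Exact flow of the assumed test system q'' = -q (unit mass, unit stiffness)."""
    c, s = math.cos(dt), math.sin(dt)
    return q * c + p * s, -q * s + p * c


def euler_step(q, p, dt=DT):
    """Forward Euler: first order, energy grows for this system."""
    return q + dt * p, p - dt * q


def leapfrog_step(q, p, dt=DT):
    """Leapfrog (velocity Verlet): second order, symplectic."""
    p_half = p - 0.5 * dt * q
    q_new = q + dt * p_half
    return q_new, p_half - 0.5 * dt * q_new


def make_knn_step(n_grid=41, lim=2.0):
    """1-nearest-neighbor predictor trained on exact one-step pairs from a grid."""
    pts = [-lim + 2 * lim * i / (n_grid - 1) for i in range(n_grid)]
    data = [((q, p), exact_step(q, p)) for q in pts for p in pts]

    def step(q, p):
        (q0, p0), (q1, p1) = min(
            data, key=lambda d: (d[0][0] - q) ** 2 + (d[0][1] - p) ** 2
        )
        # Reuse the displacement observed at the nearest training state.
        return q + (q1 - q0), p + (p1 - p0)

    return step


def rollout(step, q0, p0, n_steps):
    """Apply `step` repeatedly and return the list of visited states."""
    traj = [(q0, p0)]
    for _ in range(n_steps):
        traj.append(step(*traj[-1]))
    return traj


def energy(state):
    q, p = state
    return 0.5 * (q * q + p * p)


def evaluate(step, q0=1.0, p0=0.0, n_steps=200):
    """Return accuracy, stability, and timing figures for one method."""
    ref = rollout(exact_step, q0, p0, n_steps)
    start = time.perf_counter()
    traj = rollout(step, q0, p0, n_steps)
    seconds = time.perf_counter() - start
    mse = sum(
        (q - qr) ** 2 + (p - pr) ** 2 for (q, p), (qr, pr) in zip(traj, ref)
    ) / len(ref)
    drift = max(abs(energy(s) - energy(traj[0])) for s in traj)
    return {"mse": mse, "energy_drift": drift, "seconds": seconds}


if __name__ == "__main__":
    methods = {
        "euler": euler_step,
        "leapfrog": leapfrog_step,
        "knn": make_knn_step(),
    }
    for name, step in methods.items():
        print(name, evaluate(step))
```

Reporting the three figures side by side for every method on the same initial condition and horizon is the point of the sketch: it mirrors the abstract's call for a shared setup and shared metrics, while the specific system, step size, and training grid here are placeholders.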
- Award ID(s): 1901091
- PAR ID: 10276750
- Date Published:
- Journal Name: International Conference on Learning Representations, physical simulation workshop
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation

