Search for: All records
Total Resources: 2
- Author / Contributor
- Tsaris, Aristeidis (2)
- Balewski, Jan (1)
- Balma, Jacob (1)
- Cumming, Ben (1)
- Danjo, Takumi (1)
- Domke, Jens (1)
- Drescher, Lukas (1)
- Drozd, Aleksandr (1)
- Duarte, Javier (1)
- Emani, Murali (1)
- Farrell, Steven (1)
- Fink, Andreas (1)
- Fox, Geoffrey (1)
- Fukai, Takaaki (1)
- Fukumoto, Naoto (1)
- Fukushi, Tatsuya (1)
- Gerofi, Balazs (1)
- Harris, Philip (1)
- Hauck, Scott (1)
- Holzman, Burt (1)
Scientific communities are increasingly adopting machine learning and deep learning models in their applications to accelerate scientific insights. High performance computing systems are pushing the frontiers of performance with a rich diversity of hardware resources and massive scale-out capabilities. There is a critical need to understand fair and effective benchmarking of machine learning applications that are representative of real-world scientific use cases. MLPerf™ is a community-driven standard to benchmark machine learning workloads, focusing on end-to-end performance metrics. In this paper, we introduce MLPerf HPC, a benchmark suite of large-scale scientific machine learning training applications driven by the MLCommons™ Association. We present the results from the first submission round, including a diverse set of some of the world's largest HPC systems. We develop a systematic framework for their joint analysis and compare them in terms of data staging, algorithmic convergence, and compute performance. As a result, we gain a quantitative understanding of optimizations on different subsystems, such as staging and on-node loading of data, compute-unit utilization, and communication scheduling, enabling overall >10× end-to-end performance improvements through system scaling. Notably, our analysis shows a scale-dependent interplay between the dataset size, a system's memory hierarchy, and training convergence that underlines the importance of near-compute storage. To overcome the data-parallel scalability challenge at large batch sizes, we discuss specific learning techniques and hybrid data-and-model parallelism that are effective on large systems. We conclude by characterizing each benchmark with respect to low-level memory, I/O, and network behaviour to parameterize extended roofline performance models in future rounds.
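To make the abstract's reference to large-batch learning techniques concrete: one widely used technique in this setting is layer-wise adaptive rate scaling (LARS). The sketch below is illustrative rather than taken from the paper; the function name and default hyperparameters are assumptions.

```python
import torch

def lars_trust_ratio(param: torch.Tensor, grad: torch.Tensor,
                     trust_coeff: float = 0.001,
                     weight_decay: float = 1e-4) -> float:
    """Per-layer learning-rate multiplier from LARS (You et al., 2017).

    The multiplier trust_coeff * ||w|| / (||g|| + weight_decay * ||w||)
    keeps each layer's update magnitude proportional to its weight
    magnitude, which stabilizes training at very large global batch sizes.
    Defaults here are illustrative assumptions, not values from the paper.
    """
    w_norm = param.detach().norm()
    g_norm = grad.detach().norm()
    if w_norm > 0 and g_norm > 0:
        return float(trust_coeff * w_norm / (g_norm + weight_decay * w_norm))
    return 1.0  # fall back to the unscaled learning rate
```

In an optimizer step this ratio scales the base learning rate separately for each layer, so layers whose gradients are small relative to their weights are not under-updated as the global batch grows.

The roofline model mentioned in the closing sentence also has a simple closed form: attainable throughput is the lower of the machine's compute peak and the product of memory bandwidth and arithmetic intensity. A minimal sketch follows; the hardware numbers in the example are illustrative assumptions, not measurements from the paper.

```python
def roofline_bound_gflops(peak_gflops: float, mem_bw_gb_s: float,
                          arithmetic_intensity: float) -> float:
    """Attainable performance under the classic roofline model.

    arithmetic_intensity is measured in FLOPs per byte moved; a kernel is
    bandwidth-bound whenever mem_bw_gb_s * arithmetic_intensity falls
    below peak_gflops.
    """
    return min(peak_gflops, mem_bw_gb_s * arithmetic_intensity)

# Illustrative numbers: at 4 FLOPs/byte with 1500 GB/s of memory bandwidth,
# the bound is 6000 GFLOP/s, well under a 19500 GFLOP/s compute peak.
print(roofline_bound_gflops(peak_gflops=19500.0, mem_bw_gb_s=1500.0,
                            arithmetic_intensity=4.0))
```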
Duarte, Javier; Harris, Philip; Hauck, Scott; Holzman, Burt; Hsu, Shih-Chieh; Jindariani, Sergo; Khan, Suffian; Kreis, Benjamin; Lee, Brian; Liu, Mia; et al. Computing and Software for Big Science.