Search for: All records

Award ID contains: 2004541

Total Resources: 8

Resource Type
Conference Paper: 6
Conference Proceeding: 0
Dataset: 0
Journal Article: 2
Workshop Report: 0

Availability
Full Text / Resource Available: 8
Citation Only: 0

Accelerating restarted GMRES with mixed precision arithmetic

Lindquist, N.; Luszczek, P.; Dongarra, J. (June 2021, IEEE Transactions on Parallel and Distributed Systems)

The generalized minimum residual method (GMRES) is a commonly used iterative Krylov solver for sparse, non-symmetric systems of linear equations. Like other iterative solvers, data movement dominates its run time. To improve performance, we propose running GMRES in reduced precision with key operations remaining in full precision. Additionally, we provide theoretical results linking the convergence of finite-precision GMRES with classical Gram-Schmidt with reorthogonalization (CGSR) to that of its infinite-precision counterpart, which helps justify the convergence of this method to double-precision accuracy. We tested the mixed-precision approach with a variety of matrices and preconditioners on a GPU-accelerated node. Excluding the incomplete LU factorization without fill-in (ILU(0)) preconditioner, we achieved average speedups ranging from 8 to 61 percent relative to comparable double-precision implementations, with the simpler preconditioners achieving the higher speedups.
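As a rough illustration of the idea in this abstract, the following is a minimal dense NumPy sketch of restarted GMRES in which each inner cycle runs in single precision while the residual and the solution update stay in double precision. It is not the authors' implementation: the function names `gmres_cycle` and `mixed_precision_gmres`, the choice of which operations stay in double precision, the absence of a preconditioner, and the use of `numpy.linalg.lstsq` for the small least-squares problem are all assumptions made for the example.

```python
import numpy as np


def gmres_cycle(A, r, m):
    """One GMRES(m) cycle for A z = r with zero initial guess, in the dtype of A."""
    dtype = A.dtype
    n = r.shape[0]
    beta = np.linalg.norm(r)
    V = np.zeros((n, m + 1), dtype=dtype)  # Arnoldi basis vectors
    H = np.zeros((m + 1, m), dtype=dtype)  # upper Hessenberg matrix
    V[:, 0] = r / beta
    k = m
    for j in range(m):
        w = A @ V[:, j]
        # Classical Gram-Schmidt, applied twice (one reorthogonalization pass).
        for _ in range(2):
            h = V[:, : j + 1].T @ w
            w = w - V[:, : j + 1] @ h
            H[: j + 1, j] += h
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] <= np.finfo(dtype).eps * beta:
            k = j + 1  # Krylov space exhausted; stop early
            break
        V[:, j + 1] = w / H[j + 1, j]
    # Minimize ||beta * e1 - H y|| over the k basis vectors built so far.
    e1 = np.zeros(k + 1, dtype=dtype)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H[: k + 1, :k], e1, rcond=None)
    return V[:, :k] @ y


def mixed_precision_gmres(A, b, m=20, tol=1e-10, max_restarts=50):
    """Restarted GMRES: float32 inner cycles, float64 residual and update (assumed split)."""
    A64 = np.asarray(A, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)
    A32 = A64.astype(np.float32)  # reduced-precision copy used by the inner cycle
    x = np.zeros_like(b64)
    bnorm = np.linalg.norm(b64)
    for restart in range(max_restarts):
        r = b64 - A64 @ x  # residual computed in double precision
        rel = np.linalg.norm(r) / bnorm
        if rel <= tol:
            return x, rel, restart
        z = gmres_cycle(A32, r.astype(np.float32), m)
        x = x + z.astype(np.float64)  # correction accumulated in double precision
    r = b64 - A64 @ x
    return x, np.linalg.norm(r) / bnorm, max_restarts
```

Calling `mixed_precision_gmres(A, b)` on a well-conditioned system returns the solution, the final relative residual, and the number of restarts used. Because the residual is recomputed in double precision at every restart, each single-precision cycle only needs to reduce the current residual, which is how the sketch can reach a tolerance below single-precision accuracy.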
Full Text Available
Task-graph scheduling extensions for efficient synchronization and communication

Bak, S.; Hernandez, O.; Gates, M.; Luszczek, P.; Sarkar, V. (June 2021, Proceedings of the ACM International Conference on Supercomputing)

Task graphs have been studied for decades as a foundation for scheduling irregular parallel applications and have been incorporated in many programming models, including OpenMP. While many high-performance parallel libraries are based on task graphs, they also have additional scheduling requirements, such as synchronization within inner levels of data parallelism and internal blocking communications. In this paper, we extend task-graph scheduling to support efficient synchronization and communication within tasks. Compared to past work, our scheduler avoids deadlock and oversubscription of worker threads, and refines victim selection to increase the overlap of sibling tasks. To the best of our knowledge, our approach is the first to combine gang-scheduling and work-stealing in a single runtime. Our approach has been evaluated on the SLATE high-performance linear algebra library. Relative to the LLVM OMP runtime, our runtime demonstrates performance improvements of up to 13.82%, 15.2%, and 36.94% for LU, QR, and Cholesky, respectively, evaluated across different configurations related to matrix size, number of nodes, and use of CPUs vs. GPUs.
Full Text Available
Replacing Pivoting in Distributed Gaussian Elimination with Randomized Techniques

Lindquist, N.; Luszczek, P.; Dongarra, J. (November 2020, 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA))
Full Text Available
Translational Process: Mathematical Software Perspective

Dongarra, J.; Gates, M.; Luszczek, P.; Tomov, S. (September 2020, Journal of Computational Science)
Full Text Available
Scalable Data Generation for Evaluating Mixed-Precision Solvers

Luszczek, P.; Tsai, Y.; Lindquist, N.; Anzt, H.; Dongarra, J. (September 2020, 2020 IEEE High Performance Extreme Computing Conference (HPEC))
Full Text Available
Improving the Performance of the GMRES Method using Mixed-Precision Techniques

Lindquist, N.; Luszczek, P.; Dongarra, J. (August 2020, Smoky Mountains Computational Sciences & Engineering Conference (SMC2020))
Full Text Available
Investigating the Benefit of FP16-Enabled Mixed-Precision Solvers for Symmetric Positive Definite Matrices using GPUs

Abdelfattah, A.; Tomov, S.; Dongarra, J. (June 2020, International Conference on Computational Science (ICCS 2020))
Full Text Available
Docker Container based PaaS Cloud Computing Comprehensive Benchmarks using LAPACK

Zaitsev, D.; Luszczek, P. (March 2020, Computer Modeling and Intelligent Systems CMIS-2020)
Full Text Available